Subject: Re: [PATCH v1 1/1] PCI/ERR: Handle fatal error recovery for non-hotplug capable devices
To: Yicong Yang, bhelgaas@google.com
Cc: jay.vosburgh@canonical.com, linux-pci@vger.kernel.org, linux-kernel@vger.kernel.org, ashok.raj@intel.com
References: <18609.1588812972@famine>
 <2569c75c-41a6-d0f3-ee34-0d288c4e0b61@linux.intel.com>
 <8dd2233c-a636-59fa-4c6e-5da08556d09e@hisilicon.com>
 <9b2aecd8-b474-31b7-7cd2-1a8633a2485d@linux.intel.com>
From: "Kuppuswamy, Sathyanarayanan"
Message-ID: <7b15ffd4-ac9b-6753-63a7-dc6e2bfa30c8@linux.intel.com>
Date: Wed, 27 May 2020 20:57:23 -0700

On 5/26/20 11:41 PM, Yicong Yang wrote:
>>> We should do slot reset if driver required, but it's different from the `slot reset` in pci_bus_error_reset().
>>> Previously we don't do a slot reset and call ->slot_reset() directly, I don't know the certain reason.
>> IIUC, your concern is whether it is correct to trigger reset for
>> pci_channel_io_normal case right ? Please correct me if my
>> assumption is incorrect.
> right.
>
>> If its true, then why would report_error_detected() will return
>> PCI_ERS_*_NEED_RESET for pci_channel_io_normal case ? If
>> report_error_detected() requests reset in pci_channel_io_normal
>> case then I think we should give preference to it.
> If we get PCI_ERS_*_NEED_RESET, we should do slot reset, no matter it's a
> hotpluggable slot or not.

The pci_slot_reset() function itself depends on hotplug ops. So what
kind of slot reset is needed for the non-hotplug case?

static int pci_slot_reset(struct pci_slot *slot, int probe)
{
	int rc;

	if (!slot || !pci_slot_resetable(slot))
		return -ENOTTY;

	if (!probe)
		pci_slot_lock(slot);

	might_sleep();

	rc = pci_reset_hotplug_slot(slot->hotplug, probe);

	if (!probe)
		pci_slot_unlock(slot);

	return rc;
}

static int pci_reset_hotplug_slot(struct hotplug_slot *hotplug, int probe)
{
	int rc = -ENOTTY;

	if (!hotplug || !try_module_get(hotplug->owner))
		return rc;

	if (hotplug->ops->reset_slot)
		rc = hotplug->ops->reset_slot(hotplug, probe);

	module_put(hotplug->owner);

	return rc;
}

> And we shouldn't do it here in reset_link(), that's
> two separate things. The `slot reset` done in aer_root_reset() is only for *link
> reset*, as there may have some side effects to perform secondary bus reset directly
> for hotpluggable slot, as mentioned in commit c4eed62a2143, so it use slot reset
> to do the reset link things.
>
> As for slot reset required by the driver, we should perform it later just before the
> ->slot_reset(). I noticed the TODO comments there and we should implement
> it if it's necessary.

I agree.

>
> It lies in line 183, drivers/pcie/err.c:
>
> 	if (status == PCI_ERS_RESULT_NEED_RESET) {
> 		/*
> 		 * TODO: Should call platform-specific
> 		 * functions to reset slot before calling
> 		 * drivers' slot_reset callbacks?
> 		 */
> 		status = PCI_ERS_RESULT_RECOVERED;
> 		pci_dbg(dev, "broadcast slot_reset message\n");
> 		pci_walk_bus(bus, report_slot_reset, &status);
> 	}
>
>
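
For illustration only, here is a small user-space model of the two functions quoted above. It is a sketch under stated assumptions, not kernel code: the model_* names and the struct layouts are hypothetical stand-ins, and locking, pci_slot_resetable() and module refcounting are left out. It is only meant to show that a slot without hotplug ops falls through to -ENOTTY, which is the gap being discussed for the non-hotplug case.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel structures; not the real definitions. */
struct hotplug_slot;

struct hotplug_slot_ops {
	int (*reset_slot)(struct hotplug_slot *hotplug, int probe);
};

struct hotplug_slot {
	const struct hotplug_slot_ops *ops;
};

struct pci_slot {
	struct hotplug_slot *hotplug;	/* NULL models a non-hotplug slot */
};

/*
 * Models the shape of pci_reset_hotplug_slot() without module refcounting.
 * The extra NULL check on ops is an addition for this sketch.
 */
static int model_reset_hotplug_slot(struct hotplug_slot *hotplug, int probe)
{
	if (!hotplug || !hotplug->ops || !hotplug->ops->reset_slot)
		return -ENOTTY;

	return hotplug->ops->reset_slot(hotplug, probe);
}

/* Models the shape of pci_slot_reset() without locking. */
static int model_slot_reset(struct pci_slot *slot, int probe)
{
	if (!slot)
		return -ENOTTY;

	return model_reset_hotplug_slot(slot->hotplug, probe);
}

/* Hypothetical reset_slot callback that always reports success. */
static int model_reset_ok(struct hotplug_slot *hotplug, int probe)
{
	(void)hotplug;
	(void)probe;
	return 0;
}
```

In this model, a pci_slot whose hotplug pointer is NULL gets -ENOTTY from model_slot_reset(), while a slot wired to model_reset_ok() gets 0. Any reset for the non-hotplug case would therefore have to come from a different path than this one.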