From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
 id S1758362Ab3ETWsa (ORCPT ); Mon, 20 May 2013 18:48:30 -0400
Received: from mail-ia0-f175.google.com ([209.85.210.175]:53837
 "EHLO mail-ia0-f175.google.com" rhost-flags-OK-OK-OK-OK)
 by vger.kernel.org with ESMTP id S1758024Ab3ETWs2 (ORCPT );
 Mon, 20 May 2013 18:48:28 -0400
Date: Mon, 20 May 2013 16:48:24 -0600
From: Bjorn Helgaas
To: "Zhang, LongX"
Cc: "linasvepstas@gmail.com" , "linux-pci@vger.kernel.org" ,
 "linux-kernel@vger.kernel.org" , "yanmin_zhang@linux.intel.com" ,
 "Joseph.Liu@Emulex.Com"
Subject: Re: Subject : [ PATCH ] pci-reset-error_state-to-pci_channel_io_normal-at-report_slot_reset
Message-ID: <20130520224824.GA31740@google.com>
References:
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To:
User-Agent: Mutt/1.5.21 (2010-09-15)
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On Fri, Apr 26, 2013 at 06:28:59AM +0000, Zhang, LongX wrote:
> From: Zhang Long
>
> Specific pci device drivers might have many functions to call
> pci_channel_offline to check device states. When slot_reset happens,
> drivers' slot_reset callback might call such functions and eventually
> abort the reset.
>
> The patch resets pdev->error_state to pci_channel_io_normal at
> the begining of report_slot_reset.
>
> Thank Liu Joseph for pointing it out.
>
> Signed-off-by: Zhang Yanmin
> Signed-off-by: Zhang Long
> ---
>  drivers/pci/pcie/aer/aerdrv_core.c |  1 +
>  drivers/pci/pcie/portdrv_pci.c     | 12 +++++-------
>  2 files changed, 6 insertions(+), 7 deletions(-)
>
> diff --git a/drivers/pci/pcie/aer/aerdrv_core.c b/drivers/pci/pcie/aer/aerdrv_core.c
> index 564d97f..c61fd44 100644
> --- a/drivers/pci/pcie/aer/aerdrv_core.c
> +++ b/drivers/pci/pcie/aer/aerdrv_core.c
> @@ -286,6 +286,7 @@ static int report_slot_reset(struct pci_dev *dev, void *data)
>  	result_data = (struct aer_broadcast_data *) data;
>  
>  	device_lock(&dev->dev);
> +	dev->error_state = pci_channel_io_normal;
>  	if (!dev->driver ||
>  		!dev->driver->err_handler ||
>  		!dev->driver->err_handler->slot_reset)
> diff --git a/drivers/pci/pcie/portdrv_pci.c b/drivers/pci/pcie/portdrv_pci.c
> index ed4d094..7abefd9 100644
> --- a/drivers/pci/pcie/portdrv_pci.c
> +++ b/drivers/pci/pcie/portdrv_pci.c
> @@ -332,13 +332,11 @@ static pci_ers_result_t pcie_portdrv_slot_reset(struct pci_dev *dev)
>  	pci_ers_result_t status = PCI_ERS_RESULT_RECOVERED;
>  	int retval;
>  
> -	/* If fatal, restore cfg space for possible link reset at upstream */
> -	if (dev->error_state == pci_channel_io_frozen) {
> -		dev->state_saved = true;
> -		pci_restore_state(dev);
> -		pcie_portdrv_restore_config(dev);
> -		pci_enable_pcie_error_reporting(dev);
> -	}
> +	/* restore cfg space for possible link reset at upstream */
> +	dev->state_saved = true;
> +	pci_restore_state(dev);
> +	pcie_portdrv_restore_config(dev);
> +	pci_enable_pcie_error_reporting(dev);
>  
>  	/* get true return value from &status */
>  	retval = device_for_each_child(&dev->dev, &status, slot_reset_iter);

I think this patch changes the behavior in the case of a non-fatal
error where one of the .error_detected() methods returned
PCI_ERS_RESULT_NEED_RESET. In that case, pcie_portdrv_slot_reset()
previously did not restore config space, but after your patch, it
*will* restore it. We need an explanation of why this is safe.
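
To make that case concrete, here is a minimal userspace sketch of the
two versions of the check. It is not kernel code: the model_* names,
the struct, and the config_restored flag are made up for illustration,
and it assumes error_state is still pci_channel_io_normal when a
non-fatal error reaches the slot_reset step.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-ins for pci_channel_io_normal / pci_channel_io_frozen. */
enum model_channel_state { MODEL_IO_NORMAL, MODEL_IO_FROZEN };

/* Hypothetical model of a port device; this is NOT struct pci_dev. */
struct model_port {
	enum model_channel_state error_state;
	bool config_restored;	/* set when the model "restores" config space */
};

/* Pre-patch logic: restore config space only if the channel is frozen. */
static void model_slot_reset_old(struct model_port *port)
{
	assert(port != NULL);
	if (port->error_state == MODEL_IO_FROZEN)
		port->config_restored = true;
}

/* Post-patch logic: restore config space unconditionally. */
static void model_slot_reset_new(struct model_port *port)
{
	assert(port != NULL);
	port->config_restored = true;
}

/* Run one version of the handler and report whether it restored config. */
static bool model_restores_config(enum model_channel_state state, bool patched)
{
	struct model_port port = {
		.error_state = state,
		.config_restored = false,
	};

	if (patched)
		model_slot_reset_new(&port);
	else
		model_slot_reset_old(&port);
	return port.config_restored;
}
```

In this model the two versions agree for MODEL_IO_FROZEN and differ
only for MODEL_IO_NORMAL, which is the non-fatal NEED_RESET case I am
asking about.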

I think you should split this into two patches: the first would remove
the "if (dev->error_state == pci_channel_io_frozen)" test from
portdrv_pci.c and explain the reason, and the second would make the
aerdrv_core.c change.

I'm also concerned that in that same case (a non-fatal error where one
of the .error_detected() methods returned PCI_ERS_RESULT_NEED_RESET),
I don't think we actually *do* any kind of device reset. This isn't
related to your patch, of course, so if you resolve the config space
restore question, we can deal with the reset question later.

Bjorn