From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1754808Ab2DWUHk (ORCPT ); Mon, 23 Apr 2012 16:07:40 -0400 Received: from mx1.redhat.com ([209.132.183.28]:1027 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1754535Ab2DWUHd (ORCPT ); Mon, 23 Apr 2012 16:07:33 -0400 Message-ID: <4F95B67D.7020706@redhat.com> Date: Mon, 23 Apr 2012 16:07:25 -0400 From: Don Dutile User-Agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.2.23) Gecko/20110927 Red Hat/3.1.15-1.el6_1 Thunderbird/3.1.15 MIME-Version: 1.0 To: "Rafael J. Wysocki" CC: Bjorn Helgaas , Linus Torvalds , Mikko Vinni , "linux-input@vger.kernel.org" , "linux-kernel@vger.kernel.org" , Dmitry Torokhov , Allen Kay , Jesse Barnes , "linux-pci@vger.kernel.org" Subject: Re: [PATCH] PCI: Fix regression in pci_restore_state() References: <1334310754.17013.YahooMailNeo@web161804.mail.bf1.yahoo.com> <201204152140.40747.rjw@sisk.pl> <201204232153.57351.rjw@sisk.pl> In-Reply-To: <201204232153.57351.rjw@sisk.pl> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On 04/23/2012 03:53 PM, Rafael J. Wysocki wrote: > On Monday, April 23, 2012, Bjorn Helgaas wrote: >> On Sun, Apr 15, 2012 at 1:40 PM, Rafael J. Wysocki wrote: >>> On Sunday, April 15, 2012, Linus Torvalds wrote: >>>> On Sun, Apr 15, 2012 at 11:47 AM, Rafael J. Wysocki wrote: >>>>> >>>>> mdelay(10) doesn't really look good either to me in this case, though. >>>> >>>> Oh, I agree. What kind of ass-backwards device actually needs that >>>> kind of crazy delays? It is almost certainly buggy. >>>> >>>> With retries, 10ms delays are totally unacceptable. There's something wrong. >>>> >>>> A single ms *may* be ok. >>>> >>>> Anyway, can you also split the actual "write _one_ register with >>>> retry" into a function of its own? The code looks like crap with those >>>> multiple levels of looping, with conditionals inside them etc. With a >>>> simple helper function, you could change the break into return, and it >>>> would look much better, I bet. >>> >>> Sure. It appears cleaner this way. >>> >>> --- >>> From: Rafael J. Wysocki >>> Subject: PCI: Fix regression in pci_restore_state(), v3 >>> >>> Commit 26f41062f28de65e11d3cf353e52d0be73442be1 >>> >>> PCI: check for pci bar restore completion and retry >>> >>> attempted to address problems with PCI BAR restoration on systems >>> where FLR had not been completed before pci_restore_state() was >>> called, but it did that in an utterly wrong way. >>> >>> First off, instead of retrying the writes for the BAR registers >>> only, it did that for all of the PCI config space of the device, >>> including the status register (whose value after the write quite >>> obviously need not be the same as the written one). Second, it >>> added arbitrary delay to pci_restore_state() even for systems >>> where the PCI config space restoration was successful at first >>> attempt. Finally, the mdelay(10) it added to every iteration of the >>> writing loop was way too much of a delay for any reasonable device. >>> >>> All of this actually caused resume failures for some devices on >>> the Mikko's system. >>> >>> To fix the regression, make pci_restore_state() only retry the >>> writes for BAR registers and only wait if the first read from >>> the register doesn't return the written value. Additionaly, make >>> it wait for 1 ms, instead of 10 ms, after every failing attempt >>> to write into config space. >>> >>> Reported-by: Mikko Vinni >>> Signed-off-by: Rafael J. Wysocki >>> --- >>> drivers/pci/pci.c | 57 ++++++++++++++++++++++++++++++++++++------------------ >>> 1 file changed, 39 insertions(+), 18 deletions(-) >>> >>> Index: linux/drivers/pci/pci.c >>> =================================================================== >>> --- linux.orig/drivers/pci/pci.c >>> +++ linux/drivers/pci/pci.c >>> @@ -967,16 +967,47 @@ pci_save_state(struct pci_dev *dev) >>> return 0; >>> } >>> >>> +static void pci_restore_config_dword(struct pci_dev *pdev, int offset, >>> + u32 saved_val, int retry) >>> +{ >>> + u32 val; >>> + >>> + pci_read_config_dword(pdev, offset,&val); >>> + if (val == saved_val) >>> + return; >>> + >>> + for (;;) { >>> + dev_dbg(&pdev->dev, "restoring config space at offset " >>> + "%#x (was %#x, writing %#x)\n", offset, val, saved_val); >>> + pci_write_config_dword(pdev, offset, saved_val); >>> + if (retry--<= 0) >>> + return; >>> + >>> + pci_read_config_dword(pdev, offset,&val); >>> + if (val == saved_val) >>> + return; >>> + >>> + mdelay(1); >>> + } >>> +} >>> + >>> +static void pci_restore_config_space(struct pci_dev *pdev, int start, int end, >>> + int retry) >>> +{ >>> + int index; >>> + >>> + for (index = end; index>= start; index--) >>> + pci_restore_config_dword(pdev, 4 * index, >>> + pdev->saved_config_space[index], >>> + retry); >>> +} >>> + >>> /** >>> * pci_restore_state - Restore the saved state of a PCI device >>> * @dev: - PCI device that we're dealing with >>> */ >>> void pci_restore_state(struct pci_dev *dev) >>> { >>> - int i; >>> - u32 val; >>> - int tries; >>> - >>> if (!dev->state_saved) >>> return; >>> >>> @@ -984,24 +1015,14 @@ void pci_restore_state(struct pci_dev *d >>> pci_restore_pcie_state(dev); >>> pci_restore_ats_state(dev); >>> >>> + pci_restore_config_space(dev, 10, 15, 0); >>> /* >>> * The Base Address register should be programmed before the command >>> * register(s) >>> */ >>> - for (i = 15; i>= 0; i--) { >>> - pci_read_config_dword(dev, i * 4,&val); >>> - tries = 10; >>> - while (tries&& val != dev->saved_config_space[i]) { >>> - dev_dbg(&dev->dev, "restoring config " >>> - "space at offset %#x (was %#x, writing %#x)\n", >>> - i, val, (int)dev->saved_config_space[i]); >>> - pci_write_config_dword(dev,i * 4, >>> - dev->saved_config_space[i]); >>> - pci_read_config_dword(dev, i * 4,&val); >>> - mdelay(10); >>> - tries--; >>> - } >>> - } >>> + pci_restore_config_space(dev, 4, 9, 10); >>> + pci_restore_config_space(dev, 0, 3, 0); >>> + >>> pci_restore_pcix_state(dev); >>> pci_restore_msi_state(dev); >>> pci_restore_iov_state(dev); >> >> I'd feel better about this if there were a way to delay in the FLR >> path instead. If we delay in the restore path, we're only fixing one >> of the many ways config space can be written. Other paths that write >> config space will just silently fail. >> >> The PCIe spec (r3.0, sec 6.6.2) mentions waiting for the "pre-FLR >> value for Completion Timeout," but I don't see anything that looks >> like that in pcie_flr() or pci_af_flr(). Are there any other direct >> ways we can detect when the FLR is complete? > > I'm not aware of any. > > Thanks, > Rafael I don't think so, either. I believe an ECN is being worked in the PCI-SIG to add such a notification, though. Even if adopted, need to wait for another crank of the hw before the notification can be used. > -- > To unsubscribe from this list: send the line "unsubscribe linux-pci" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html