From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1751866AbdHCS5R (ORCPT ); Thu, 3 Aug 2017 14:57:17 -0400 Received: from mail.kernel.org ([198.145.29.99]:35694 "EHLO mail.kernel.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751600AbdHCS5P (ORCPT ); Thu, 3 Aug 2017 14:57:15 -0400 DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org EDE5422BE2 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=kernel.org Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=helgaas@kernel.org Date: Thu, 3 Aug 2017 13:57:13 -0500 From: Bjorn Helgaas To: Oza Oza Cc: Bjorn Helgaas , Ray Jui , Scott Branden , Jon Mason , BCM Kernel Feedback , Andy Gospodarek , linux-pci , linux-kernel@vger.kernel.org, Oza Pawandeep Subject: Re: [PATCH v5 1/2] PCI: iproc: Retry request when CRS returned from EP Message-ID: <20170803185713.GV20308@bhelgaas-glaptop.roam.corp.google.com> References: <1499310582-21193-1-git-send-email-oza.oza@broadcom.com> <1499310582-21193-2-git-send-email-oza.oza@broadcom.com> <20170802210455.GF20308@bhelgaas-glaptop.roam.corp.google.com> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline Content-Transfer-Encoding: 8bit In-Reply-To: User-Agent: Mutt/1.5.21 (2010-09-15) Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Thu, Aug 03, 2017 at 01:50:29PM +0530, Oza Oza wrote: > On Thu, Aug 3, 2017 at 2:34 AM, Bjorn Helgaas wrote: > > On Thu, Jul 06, 2017 at 08:39:41AM +0530, Oza Pawandeep wrote: > >> For Configuration Requests only, following reset it is possible for a > >> device to terminate the request but indicate that it is temporarily unable > >> to process the Request, but will be able to process the Request in the > >> future – in this case, the Configuration Request Retry Status 100 (CRS) > >> Completion Status is used. > > > > Please include the spec reference for this. > > > > The "100" looks like it's supposed to be the Completion Status Field > > value for CRS from PCIe r3.1, sec 2.2.9, Table 2-34, but the value in > > the table is actually 0b010, not 0b100. I don't think this level of > > detail is necessary unless your hardware exposes those values to the > > driver. > > > > I will remove the above description from the comment. > > >> As per PCI spec, CRS Software Visibility only affects config read of the > >> Vendor ID, for config write or any other config read the Root must > >> automatically re-issue configuration request again as a new request. > >> Iproc based PCIe RC (hw) does not retry request on its own. > > > > I think sec 2.3.2 is the relevant part of the spec. It basically > > says that when an RC receives a CRS completion for a config request: > > > > - If CRS software visibility is not enabled, the RC must reissue the > > config request as a new request. > > > > - If CRS software visibility is enabled, > > - for a config read of Vendor ID, the RC must return 0x0001 data > > - for all other config reads/writes, the RC must reissue the > > request > > > > yes, above is more relevant, and I will include it in the description. > > > Apparently iProc *never* reissues config requests, regardless of > > whether CRS software visibility is enabled? > > > > that is true. > > > But your CFG_RETRY_STATUS definition suggests that iProc always > > fabricates config read data of 0xffff0001 when it sees CRS status, no > > matter whether software visibility is enabled and no matter what > > config address we read? > > yes, that is precisely the case. Not according to the documentation you quoted below. > > What about CRS status for a config *write*? There's nothing here to > > reissue those. > > No, we do not need there, because read will always be issued first > before any write. > so we do not need to implement write. How so? As far as I know, there's nothing in the spec that requires the first config access to a device to be a read, and there are reasons why we might want to do a write first: http://lkml.kernel.org/r/5952D144.8060609@oracle.com > > Is there a hardware erratum that describes the actual hardware > > behavior? > > this is documented in our PCIe core hw manual. > >> > 4.7.3.3. Retry Status On Configuration Cycle > Endpoints are allowed to generate retry status on configuration > cycles. In this case, the RC needs to re-issue the request. The IP > does not handle this because the number of configuration cycles needed > will probably be less > than the total number of non-posted operations needed. When a retry status > is received on the User RX interface for a configuration request that > was sent on the User TX interface, it will be indicated with a > completion with the CMPL_STATUS field set to 2=CRS, and the user will > have to find the address and data values and send a new transaction on > the User TX interface. > When the internal configuration space returns a retry status during a > configuration cycle (user_cscfg = 1) on the Command/Status interface, > the pcie_cscrs will assert with the pcie_csack signal to indicate the > CRS status. > When the CRS Software Visibility Enable register in the Root Control > register is enabled, the IP will return the data value to 0x0001 for > the Vendor ID value and 0xffff (all 1’s) for the rest of the data in > the request for reads of offset 0 that return with CRS status. This > is true for both the User RX Interface and for the Command/Status > interface. When CRS Software Visibility is enabled, the CMPL_STATUS > field of the completion on the User RX Interface will not be 2=CRS and > the pcie_cscrs signal will not assert on the Command/Status interface. > >> > Broadcom does not sell PCIe core IP, so above information is not > publicly available in terms of hardware erratum or any similar note. > > > > > >> As a result of the fact, PCIe RC driver (sw) should take care of CRS. > >> This patch fixes the problem, and attempts to read config space again in > >> case of PCIe code forwarding the CRS back to CPU. > >> It implements iproc_pcie_config_read which gets called for Stingray, > >> Otherwise it falls back to PCI generic APIs. > >> > >> Signed-off-by: Oza Pawandeep > >> Reviewed-by: Ray Jui > >> Reviewed-by: Scott Branden > >> > >> diff --git a/drivers/pci/host/pcie-iproc.c b/drivers/pci/host/pcie-iproc.c > >> index 0f39bd2..b0abcd7 100644 > >> --- a/drivers/pci/host/pcie-iproc.c > >> +++ b/drivers/pci/host/pcie-iproc.c > >> @@ -68,6 +68,9 @@ > >> #define APB_ERR_EN_SHIFT 0 > >> #define APB_ERR_EN BIT(APB_ERR_EN_SHIFT) > >> > >> +#define CFG_RETRY_STATUS 0xffff0001 > >> +#define CFG_RETRY_STATUS_TIMEOUT_US 500000 /* 500 milli-seconds. */ > >> + > >> /* derive the enum index of the outbound/inbound mapping registers */ > >> #define MAP_REG(base_reg, index) ((base_reg) + (index) * 2) > >> > >> @@ -448,6 +451,55 @@ static inline void iproc_pcie_apb_err_disable(struct pci_bus *bus, > >> } > >> } > >> > >> +static int iproc_pcie_cfg_retry(void __iomem *cfg_data_p) > >> +{ > >> + int timeout = CFG_RETRY_STATUS_TIMEOUT_US; > >> + unsigned int ret; > >> + > >> + /* > >> + * As per PCI spec, CRS Software Visibility only > >> + * affects config read of the Vendor ID. > >> + * For config write or any other config read the Root must > >> + * automatically re-issue configuration request again as a > >> + * new request. Iproc based PCIe RC (hw) does not retry > >> + * request on its own, so handle it here. > >> + */ > >> + do { > >> + ret = readl(cfg_data_p); > >> + if (ret == CFG_RETRY_STATUS) > >> + udelay(1); > >> + else > >> + return PCIBIOS_SUCCESSFUL; > >> + } while (timeout--); > > > > Shouldn't this check *where* in config space we're reading? > > > No, we do not require, because with SPDK it was reading BAR space > which points to MSI-x table. > what I mean is, it could be anywhere. The documentation above says the IP returns data of 0x0001 for *reads of offset 0*. Your current code reissues the read if *any* read anywhere returns data that happens to match CFG_RETRY_STATUS. That may be a perfectly valid register value for some device's config space. In that case, you do not want to reissue the read and you do not want to timeout. > > Shouldn't we check whether CRS software visiblity is enabled? Does > > iProc advertise CRS software visibility support in Root Capabilities? > > No, this is also not required because irrespective of that, IP will > re-issue the request, and it will forwards error code to host bridge. > and in-turn host-bridge is implemented to return 0xffff0001 as code. The documentation says the IP does not handle reissuing the request. > > If CRS software visibility is enabled, we should return the 0x0001 > > data if we're reading the Vendor ID and see CRS status. It looks like > > this loop always retries if we see 0xffff0001 data. > > > our internal host bridge is implemented this way, and it returns 0xffff0001. > so there, we have no choice. The PCIe spec says that if CRS software visibility is enabled and we get a CRS completion for a config read of the Vendor ID at offset 0, the read should not be retried, and the read should return 0x0001. For example, pci_bus_read_dev_vendor_id() should see that 0x0001 value. That's not what this code does. If the readl() returns CFG_RETRY_STATUS, you do the delay and reissue the read. You never return 0x0001 to the caller of pci_bus_read_config_word(). > As I understand from your comments that, I have to change comment > description and include right PCI spec reference with section. > apart form that I have tried to answer your IP specific details. Please include the documentation quote above in the changelog when you revise this patch. > >> + > >> + return PCIBIOS_DEVICE_NOT_FOUND; > >> +} > >> + > >> +static void __iomem *iproc_pcie_map_ep_cfg_reg(struct iproc_pcie *pcie, > >> + unsigned int busno, > >> + unsigned int slot, > >> + unsigned int fn, > >> + int where) > >> +{ > >> + u16 offset; > >> + u32 val; > >> + > >> + /* EP device access */ > >> + val = (busno << CFG_ADDR_BUS_NUM_SHIFT) | > >> + (slot << CFG_ADDR_DEV_NUM_SHIFT) | > >> + (fn << CFG_ADDR_FUNC_NUM_SHIFT) | > >> + (where & CFG_ADDR_REG_NUM_MASK) | > >> + (1 & CFG_ADDR_CFG_TYPE_MASK); > >> + > >> + iproc_pcie_write_reg(pcie, IPROC_PCIE_CFG_ADDR, val); > >> + offset = iproc_pcie_reg_offset(pcie, IPROC_PCIE_CFG_DATA); > >> + > >> + if (iproc_pcie_reg_is_invalid(offset)) > >> + return NULL; > >> + > >> + return (pcie->base + offset); > >> +} > >> + > >> /** > >> * Note access to the configuration registers are protected at the higher layer > >> * by 'pci_lock' in drivers/pci/access.c > >> @@ -499,13 +551,48 @@ static void __iomem *iproc_pcie_map_cfg_bus(struct pci_bus *bus, > >> return (pcie->base + offset); > >> } > >> > >> +static int iproc_pcie_config_read(struct pci_bus *bus, unsigned int devfn, > >> + int where, int size, u32 *val) > >> +{ > >> + struct iproc_pcie *pcie = iproc_data(bus); > >> + unsigned int slot = PCI_SLOT(devfn); > >> + unsigned int fn = PCI_FUNC(devfn); > >> + unsigned int busno = bus->number; > >> + void __iomem *cfg_data_p; > >> + int ret; > >> + > >> + /* root complex access. */ > >> + if (busno == 0) > >> + return pci_generic_config_read32(bus, devfn, where, size, val); > >> + > >> + cfg_data_p = iproc_pcie_map_ep_cfg_reg(pcie, busno, slot, fn, where); > >> + > >> + if (!cfg_data_p) > >> + return PCIBIOS_DEVICE_NOT_FOUND; > >> + > >> + ret = iproc_pcie_cfg_retry(cfg_data_p); > >> + if (ret) > >> + return ret; > >> + > >> + *val = readl(cfg_data_p); > >> + > >> + if (size <= 2) > >> + *val = (*val >> (8 * (where & 3))) & ((1 << (size * 8)) - 1); > >> + > >> + return PCIBIOS_SUCCESSFUL; > >> +} > >> + > >> static int iproc_pcie_config_read32(struct pci_bus *bus, unsigned int devfn, > >> int where, int size, u32 *val) > >> { > >> int ret; > >> + struct iproc_pcie *pcie = iproc_data(bus); > >> > >> iproc_pcie_apb_err_disable(bus, true); > >> - ret = pci_generic_config_read32(bus, devfn, where, size, val); > >> + if (pcie->type == IPROC_PCIE_PAXB_V2) > >> + ret = iproc_pcie_config_read(bus, devfn, where, size, val); > >> + else > >> + ret = pci_generic_config_read32(bus, devfn, where, size, val); > >> iproc_pcie_apb_err_disable(bus, false); > >> > >> return ret; > >> -- > >> 1.9.1 > >>