* [PATCH] PCI: aardvark: Don't touch PCIe registers if no card connected
@ 2020-05-28 14:31 Pali Rohár
  2020-05-28 16:26 ` Bjorn Helgaas
                   ` (2 more replies)
  0 siblings, 3 replies; 32+ messages in thread
From: Pali Rohár @ 2020-05-28 14:31 UTC
  To: Thomas Petazzoni, Lorenzo Pieralisi, Andrew Murray, Bjorn Helgaas,
	Marek Behún, Remi Pommarel, Tomasz Maciej Nowak, Xogium
  Cc: linux-pci, linux-arm-kernel, linux-kernel

When there is no PCIe card connected and advk_pcie_rd_conf() or
advk_pcie_wr_conf() is called for PCI bus which doesn't belong to emulated
root bridge, the aardvark driver throws the following error message:

  advk-pcie d0070000.pcie: config read/write timed out

Obviously accessing PCIe registers of disconnected card is not possible.

Extend check in advk_pcie_valid_device() function for validating
availability of PCIe bus. If PCIe link is down, then the device is marked
as Not Found and the driver does not try to access these registers.

Signed-off-by: Pali Rohár <pali@kernel.org>
---
 drivers/pci/controller/pci-aardvark.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/pci/controller/pci-aardvark.c b/drivers/pci/controller/pci-aardvark.c
index 90ff291c24f0..53a4cfd7d377 100644
--- a/drivers/pci/controller/pci-aardvark.c
+++ b/drivers/pci/controller/pci-aardvark.c
@@ -644,6 +644,9 @@ static bool advk_pcie_valid_device(struct advk_pcie *pcie, struct pci_bus *bus,
 	if ((bus->number == pcie->root_bus_nr) && PCI_SLOT(devfn) != 0)
 		return false;
 
+	if (bus->number != pcie->root_bus_nr && !advk_pcie_link_up(pcie))
+		return false;
+
 	return true;
 }
 
-- 
2.20.1
* Re: [PATCH] PCI: aardvark: Don't touch PCIe registers if no card connected 2020-05-28 14:31 [PATCH] PCI: aardvark: Don't touch PCIe registers if no card connected Pali Rohár @ 2020-05-28 16:26 ` Bjorn Helgaas 2020-05-28 16:38 ` Pali Rohár 2020-07-01 8:20 ` [PATCH v2] " Pali Rohár 2020-07-02 8:30 ` [PATCH v3] " Pali Rohár 2 siblings, 1 reply; 32+ messages in thread From: Bjorn Helgaas @ 2020-05-28 16:26 UTC (permalink / raw) To: Pali Rohár Cc: Thomas Petazzoni, Lorenzo Pieralisi, Andrew Murray, Bjorn Helgaas, Marek Behún, Remi Pommarel, Tomasz Maciej Nowak, Xogium, linux-pci, linux-arm-kernel, linux-kernel On Thu, May 28, 2020 at 04:31:41PM +0200, Pali Rohár wrote: > When there is no PCIe card connected and advk_pcie_rd_conf() or > advk_pcie_wr_conf() is called for PCI bus which doesn't belong to emulated > root bridge, the aardvark driver throws the following error message: > > advk-pcie d0070000.pcie: config read/write timed out > > Obviously accessing PCIe registers of disconnected card is not possible. > > Extend check in advk_pcie_valid_device() function for validating > availability of PCIe bus. If PCIe link is down, then the device is marked > as Not Found and the driver does not try to access these registers. > > Signed-off-by: Pali Rohár <pali@kernel.org> > --- > drivers/pci/controller/pci-aardvark.c | 3 +++ > 1 file changed, 3 insertions(+) > > diff --git a/drivers/pci/controller/pci-aardvark.c b/drivers/pci/controller/pci-aardvark.c > index 90ff291c24f0..53a4cfd7d377 100644 > --- a/drivers/pci/controller/pci-aardvark.c > +++ b/drivers/pci/controller/pci-aardvark.c > @@ -644,6 +644,9 @@ static bool advk_pcie_valid_device(struct advk_pcie *pcie, struct pci_bus *bus, > if ((bus->number == pcie->root_bus_nr) && PCI_SLOT(devfn) != 0) > return false; > > + if (bus->number != pcie->root_bus_nr && !advk_pcie_link_up(pcie)) > + return false; I don't think this is the right fix. 
This makes it racy because the
link may go down after we call advk_pcie_valid_device() but before we
perform the config read.

I have no objection to removing the "config read/write timed out"
message.  The "return PCIBIOS_SET_FAILED" in the read case probably
should be augmented by setting "*val = 0xffffffff".

> 	return true;
>  }
>
> --
> 2.20.1
>
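The suggestion above (fill the value with all-ones whenever a config read
fails) can be illustrated with a small user-space sketch. This is not the
driver code: struct fake_pcie, fake_rd_conf() and demo_rd_conf() are
hypothetical names invented for the example, and the two PCIBIOS_* values
are redefined locally to mirror the kernel's constants.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Redefined for this sketch; values mirror the kernel's PCIBIOS_* codes. */
#define PCIBIOS_SUCCESSFUL 0x00
#define PCIBIOS_SET_FAILED 0x88

/* Hypothetical model of the port: 'responds' is false when the read times out. */
struct fake_pcie {
	bool responds;
	uint32_t reg;
};

/* Config read that fills *val with all-ones on failure. */
static int fake_rd_conf(const struct fake_pcie *pcie, uint32_t *val)
{
	if (!pcie->responds) {
		*val = 0xffffffff;
		return PCIBIOS_SET_FAILED;
	}
	*val = pcie->reg;
	return PCIBIOS_SUCCESSFUL;
}

/* A caller that ignores the status still sees the failure in the value. */
void demo_rd_conf(void)
{
	struct fake_pcie dead = { .responds = false, .reg = 0x12345678 };
	uint32_t val = 0;

	(void)fake_rd_conf(&dead, &val);
	assert(val == (uint32_t)~0);
}
```

The point of the design is that a caller which only compares the value
against ~0, and never looks at the return code, still detects the failed
read.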
* Re: [PATCH] PCI: aardvark: Don't touch PCIe registers if no card connected 2020-05-28 16:26 ` Bjorn Helgaas @ 2020-05-28 16:38 ` Pali Rohár 2020-05-28 16:49 ` Bjorn Helgaas 2020-06-30 13:51 ` Bjorn Helgaas 0 siblings, 2 replies; 32+ messages in thread From: Pali Rohár @ 2020-05-28 16:38 UTC (permalink / raw) To: Bjorn Helgaas Cc: Thomas Petazzoni, Lorenzo Pieralisi, Andrew Murray, Bjorn Helgaas, Marek Behún, Remi Pommarel, Tomasz Maciej Nowak, Xogium, linux-pci, linux-arm-kernel, linux-kernel On Thursday 28 May 2020 11:26:04 Bjorn Helgaas wrote: > On Thu, May 28, 2020 at 04:31:41PM +0200, Pali Rohár wrote: > > When there is no PCIe card connected and advk_pcie_rd_conf() or > > advk_pcie_wr_conf() is called for PCI bus which doesn't belong to emulated > > root bridge, the aardvark driver throws the following error message: > > > > advk-pcie d0070000.pcie: config read/write timed out > > > > Obviously accessing PCIe registers of disconnected card is not possible. > > > > Extend check in advk_pcie_valid_device() function for validating > > availability of PCIe bus. If PCIe link is down, then the device is marked > > as Not Found and the driver does not try to access these registers. > > > > Signed-off-by: Pali Rohár <pali@kernel.org> > > --- > > drivers/pci/controller/pci-aardvark.c | 3 +++ > > 1 file changed, 3 insertions(+) > > > > diff --git a/drivers/pci/controller/pci-aardvark.c b/drivers/pci/controller/pci-aardvark.c > > index 90ff291c24f0..53a4cfd7d377 100644 > > --- a/drivers/pci/controller/pci-aardvark.c > > +++ b/drivers/pci/controller/pci-aardvark.c > > @@ -644,6 +644,9 @@ static bool advk_pcie_valid_device(struct advk_pcie *pcie, struct pci_bus *bus, > > if ((bus->number == pcie->root_bus_nr) && PCI_SLOT(devfn) != 0) > > return false; > > > > + if (bus->number != pcie->root_bus_nr && !advk_pcie_link_up(pcie)) > > + return false; > > I don't think this is the right fix. 
This makes it racy because the
> link may go down after we call advk_pcie_valid_device() but before we
> perform the config read.

Yes, it is racy, but I do not think it causes problems. Trying to read
PCIe registers when no device is connected only causes those timeouts:
an error message is printed and advk_pcie_wait_pio() adds delay because
of its polling loop. This patch avoids unnecessary accesses to PCIe
registers in cases where the advk_pcie_wait_pio() polling would just fail.

I think it is a good idea not to call the blocking advk_pcie_wait_pio()
when it is not needed. Enumeration of PCIe buses could then be faster
when no card is connected.

> I have no objection to removing the "config read/write timed out"
> message.  The "return PCIBIOS_SET_FAILED" in the read case probably
> should be augmented by setting "*val = 0xffffffff".
>
> > 	return true;
> >  }
> >
> > --
> > 2.20.1
> >
* Re: [PATCH] PCI: aardvark: Don't touch PCIe registers if no card connected 2020-05-28 16:38 ` Pali Rohár @ 2020-05-28 16:49 ` Bjorn Helgaas 2020-05-29 8:30 ` Pali Rohár 2020-06-30 13:51 ` Bjorn Helgaas 1 sibling, 1 reply; 32+ messages in thread From: Bjorn Helgaas @ 2020-05-28 16:49 UTC (permalink / raw) To: Pali Rohár Cc: Thomas Petazzoni, Lorenzo Pieralisi, Andrew Murray, Bjorn Helgaas, Marek Behún, Remi Pommarel, Tomasz Maciej Nowak, Xogium, linux-pci, linux-arm-kernel, linux-kernel On Thu, May 28, 2020 at 06:38:09PM +0200, Pali Rohár wrote: > On Thursday 28 May 2020 11:26:04 Bjorn Helgaas wrote: > > On Thu, May 28, 2020 at 04:31:41PM +0200, Pali Rohár wrote: > > > When there is no PCIe card connected and advk_pcie_rd_conf() or > > > advk_pcie_wr_conf() is called for PCI bus which doesn't belong to emulated > > > root bridge, the aardvark driver throws the following error message: > > > > > > advk-pcie d0070000.pcie: config read/write timed out > > > > > > Obviously accessing PCIe registers of disconnected card is not possible. > > > > > > Extend check in advk_pcie_valid_device() function for validating > > > availability of PCIe bus. If PCIe link is down, then the device is marked > > > as Not Found and the driver does not try to access these registers. 
> > > > > > Signed-off-by: Pali Rohár <pali@kernel.org> > > > --- > > > drivers/pci/controller/pci-aardvark.c | 3 +++ > > > 1 file changed, 3 insertions(+) > > > > > > diff --git a/drivers/pci/controller/pci-aardvark.c b/drivers/pci/controller/pci-aardvark.c > > > index 90ff291c24f0..53a4cfd7d377 100644 > > > --- a/drivers/pci/controller/pci-aardvark.c > > > +++ b/drivers/pci/controller/pci-aardvark.c > > > @@ -644,6 +644,9 @@ static bool advk_pcie_valid_device(struct advk_pcie *pcie, struct pci_bus *bus, > > > if ((bus->number == pcie->root_bus_nr) && PCI_SLOT(devfn) != 0) > > > return false; > > > > > > + if (bus->number != pcie->root_bus_nr && !advk_pcie_link_up(pcie)) > > > + return false; > > > > I don't think this is the right fix. This makes it racy because the > > link may go down after we call advk_pcie_valid_device() but before we > > perform the config read. > > Yes, it is racy, but I do not think it cause problems. Trying to read > PCIe registers when device is not connected cause just those timeouts, > printing error message and increased delay in advk_pcie_wait_pio() due > to polling loop. This patch reduce unnecessary access to PCIe registers > when advk_pcie_wait_pio() polling just fail. > > I think it is a good idea to not call blocking advk_pcie_wait_pio() when > it is not needed. We could have faster enumeration of PCIe buses when > card is not connected. Maybe advk_pcie_check_pio_status() and advk_pcie_wait_pio() could be combined so we could get the correct error status as soon as it's available, without waiting for a timeout? In any event, the "return PCIBIOS_SET_FAILED" needs to be fixed. Most callers of config read do not check for failure, but most of the ones that do, check for "val == ~0". Only a few check for a status of other than PCIBIOS_SUCCESSFUL. > > I have no objection to removing the "config read/write timed out" > > message. 
The "return PCIBIOS_SET_FAILED" in the read case probably
> > should be augmented by setting "*val = 0xffffffff".
> >
> > > 	return true;
> > >  }
> > >
> > > --
> > > 2.20.1
> > >
* Re: [PATCH] PCI: aardvark: Don't touch PCIe registers if no card connected 2020-05-28 16:49 ` Bjorn Helgaas @ 2020-05-29 8:30 ` Pali Rohár 2020-06-30 12:31 ` Pali Rohár 0 siblings, 1 reply; 32+ messages in thread From: Pali Rohár @ 2020-05-29 8:30 UTC (permalink / raw) To: Bjorn Helgaas Cc: Thomas Petazzoni, Lorenzo Pieralisi, Andrew Murray, Bjorn Helgaas, Marek Behún, Remi Pommarel, Tomasz Maciej Nowak, Xogium, linux-pci, linux-arm-kernel, linux-kernel On Thursday 28 May 2020 11:49:38 Bjorn Helgaas wrote: > On Thu, May 28, 2020 at 06:38:09PM +0200, Pali Rohár wrote: > > On Thursday 28 May 2020 11:26:04 Bjorn Helgaas wrote: > > > On Thu, May 28, 2020 at 04:31:41PM +0200, Pali Rohár wrote: > > > > When there is no PCIe card connected and advk_pcie_rd_conf() or > > > > advk_pcie_wr_conf() is called for PCI bus which doesn't belong to emulated > > > > root bridge, the aardvark driver throws the following error message: > > > > > > > > advk-pcie d0070000.pcie: config read/write timed out > > > > > > > > Obviously accessing PCIe registers of disconnected card is not possible. > > > > > > > > Extend check in advk_pcie_valid_device() function for validating > > > > availability of PCIe bus. If PCIe link is down, then the device is marked > > > > as Not Found and the driver does not try to access these registers. 
> > > > > > > > Signed-off-by: Pali Rohár <pali@kernel.org> > > > > --- > > > > drivers/pci/controller/pci-aardvark.c | 3 +++ > > > > 1 file changed, 3 insertions(+) > > > > > > > > diff --git a/drivers/pci/controller/pci-aardvark.c b/drivers/pci/controller/pci-aardvark.c > > > > index 90ff291c24f0..53a4cfd7d377 100644 > > > > --- a/drivers/pci/controller/pci-aardvark.c > > > > +++ b/drivers/pci/controller/pci-aardvark.c > > > > @@ -644,6 +644,9 @@ static bool advk_pcie_valid_device(struct advk_pcie *pcie, struct pci_bus *bus, > > > > if ((bus->number == pcie->root_bus_nr) && PCI_SLOT(devfn) != 0) > > > > return false; > > > > > > > > + if (bus->number != pcie->root_bus_nr && !advk_pcie_link_up(pcie)) > > > > + return false; > > > > > > I don't think this is the right fix. This makes it racy because the > > > link may go down after we call advk_pcie_valid_device() but before we > > > perform the config read. > > > > Yes, it is racy, but I do not think it cause problems. Trying to read > > PCIe registers when device is not connected cause just those timeouts, > > printing error message and increased delay in advk_pcie_wait_pio() due > > to polling loop. This patch reduce unnecessary access to PCIe registers > > when advk_pcie_wait_pio() polling just fail. > > > > I think it is a good idea to not call blocking advk_pcie_wait_pio() when > > it is not needed. We could have faster enumeration of PCIe buses when > > card is not connected. > > Maybe advk_pcie_check_pio_status() and advk_pcie_wait_pio() could be > combined so we could get the correct error status as soon as it's > available, without waiting for a timeout? Any idea how to achieve it? First call is polling function advk_pcie_wait_pio() and second call is advk_pcie_check_pio_status() which just reads status register and prints error message to dmesg. So for me it looks like that combining these two functions into one does not change anything. 
We always need to call the polling code before checking the status
register, and therefore need to wait for the timeout, unless something
like this proposed patch is used (to skip the whole register access
when it would fail).

> In any event, the "return PCIBIOS_SET_FAILED" needs to be fixed.  Most
> callers of config read do not check for failure, but most of the ones
> that do, check for "val == ~0".  Only a few check for a status of
> other than PCIBIOS_SUCCESSFUL.
>
> > > I have no objection to removing the "config read/write timed out"
> > > message.  The "return PCIBIOS_SET_FAILED" in the read case probably
> > > should be augmented by setting "*val = 0xffffffff".

Now I see, "*val = 0xffffffff" really should be set when
advk_pcie_rd_conf() fails.

> > > > 	return true;
> > > >  }
> > > >
> > > > --
> > > > 2.20.1
> > > >
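The poll-then-check ordering described above can be modelled in user
space. This is only a sketch under stated assumptions: struct fake_pio,
fake_pio_done(), fake_wait_pio() and FAKE_RETRY_COUNT are hypothetical
names and values, not the driver's, and the delay a real driver would
insert between polls is left out.

```c
#include <assert.h>
#include <stdbool.h>

#define FAKE_RETRY_COUNT 10 /* hypothetical; not the driver's real value */

/* Hypothetical transfer model: done_after < 0 means it never completes. */
struct fake_pio {
	int done_after; /* number of polls that must fail before completion */
	int polls;      /* polls performed so far */
};

static bool fake_pio_done(struct fake_pio *pio)
{
	pio->polls++;
	return pio->done_after >= 0 && pio->polls > pio->done_after;
}

/* Returns 0 on completion, -1 once every retry has been used up. */
static int fake_wait_pio(struct fake_pio *pio)
{
	for (int i = 0; i < FAKE_RETRY_COUNT; i++) {
		if (fake_pio_done(pio))
			return 0;
		/* a real driver would delay here before polling again */
	}
	return -1;
}

/* With no card answering, the loop always runs to the full retry count. */
void demo_wait_pio(void)
{
	struct fake_pio absent = { .done_after = -1, .polls = 0 };

	assert(fake_wait_pio(&absent) == -1);
	assert(absent.polls == FAKE_RETRY_COUNT);
}
```

In this model a status check can only happen after fake_wait_pio()
returns, so an absent card always costs the full retry budget, which is
the delay the proposed link check is meant to skip.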
* Re: [PATCH] PCI: aardvark: Don't touch PCIe registers if no card connected 2020-05-29 8:30 ` Pali Rohár @ 2020-06-30 12:31 ` Pali Rohár 0 siblings, 0 replies; 32+ messages in thread From: Pali Rohár @ 2020-06-30 12:31 UTC (permalink / raw) To: Bjorn Helgaas Cc: Thomas Petazzoni, Lorenzo Pieralisi, Andrew Murray, Bjorn Helgaas, Marek Behún, Remi Pommarel, Tomasz Maciej Nowak, Xogium, linux-pci, linux-arm-kernel, linux-kernel Hello! On Friday 29 May 2020 10:30:13 Pali Rohár wrote: > On Thursday 28 May 2020 11:49:38 Bjorn Helgaas wrote: > > On Thu, May 28, 2020 at 06:38:09PM +0200, Pali Rohár wrote: > > > On Thursday 28 May 2020 11:26:04 Bjorn Helgaas wrote: > > > > On Thu, May 28, 2020 at 04:31:41PM +0200, Pali Rohár wrote: > > > > > When there is no PCIe card connected and advk_pcie_rd_conf() or > > > > > advk_pcie_wr_conf() is called for PCI bus which doesn't belong to emulated > > > > > root bridge, the aardvark driver throws the following error message: > > > > > > > > > > advk-pcie d0070000.pcie: config read/write timed out > > > > > > > > > > Obviously accessing PCIe registers of disconnected card is not possible. > > > > > > > > > > Extend check in advk_pcie_valid_device() function for validating > > > > > availability of PCIe bus. If PCIe link is down, then the device is marked > > > > > as Not Found and the driver does not try to access these registers. 
> > > > > > > > > > Signed-off-by: Pali Rohár <pali@kernel.org> > > > > > --- > > > > > drivers/pci/controller/pci-aardvark.c | 3 +++ > > > > > 1 file changed, 3 insertions(+) > > > > > > > > > > diff --git a/drivers/pci/controller/pci-aardvark.c b/drivers/pci/controller/pci-aardvark.c > > > > > index 90ff291c24f0..53a4cfd7d377 100644 > > > > > --- a/drivers/pci/controller/pci-aardvark.c > > > > > +++ b/drivers/pci/controller/pci-aardvark.c > > > > > @@ -644,6 +644,9 @@ static bool advk_pcie_valid_device(struct advk_pcie *pcie, struct pci_bus *bus, > > > > > if ((bus->number == pcie->root_bus_nr) && PCI_SLOT(devfn) != 0) > > > > > return false; > > > > > > > > > > + if (bus->number != pcie->root_bus_nr && !advk_pcie_link_up(pcie)) > > > > > + return false; > > > > > > > > I don't think this is the right fix. This makes it racy because the > > > > link may go down after we call advk_pcie_valid_device() but before we > > > > perform the config read. > > > > > > Yes, it is racy, but I do not think it cause problems. Trying to read > > > PCIe registers when device is not connected cause just those timeouts, > > > printing error message and increased delay in advk_pcie_wait_pio() due > > > to polling loop. This patch reduce unnecessary access to PCIe registers > > > when advk_pcie_wait_pio() polling just fail. > > > > > > I think it is a good idea to not call blocking advk_pcie_wait_pio() when > > > it is not needed. We could have faster enumeration of PCIe buses when > > > card is not connected. > > > > Maybe advk_pcie_check_pio_status() and advk_pcie_wait_pio() could be > > combined so we could get the correct error status as soon as it's > > available, without waiting for a timeout? > > Any idea how to achieve it? > > First call is polling function advk_pcie_wait_pio() and second call is > advk_pcie_check_pio_status() which just reads status register and prints > error message to dmesg. 
> > So for me it looks like that combining these two functions into one does > not change anything. We always need to call polling code prior to > checking status register. And therefore need to wait for timeout. Unless > something like in this proposed patch is not used (to skip whole > register access if it would fail). So to answer your question, correct status is possible to retrieve only after waiting for timeout. As status would be available only after timeout expires. Therefore my proposed patch in this (or some other) form is needed if we want to prevent trying to read from registers and waiting for answer when card is disconnected. I would really like to see this issue fixed, so booting linux kernel on board without connected PCIe card would not be delayed. Thomas, Lorenzo, Bjorn: do you have any idea how to fix it differently? Or if not, could be my proposed patch accepted in some form? > > In any event, the "return PCIBIOS_SET_FAILED" needs to be fixed. Most > > callers of config read do not check for failure, but most of the ones > > that do, check for "val == ~0". Only a few check for a status of > > other than PCIBIOS_SUCCESSFUL. > > > > > > I have no objection to removing the "config read/write timed out" > > > > message. The "return PCIBIOS_SET_FAILED" in the read case probably > > > > should be augmented by setting "*val = 0xffffffff". > > Now I see, "*val = 0xffffffff" should be really set when function > advk_pcie_rd_conf() fails. I have already sent separate patch which fixes this issue. > > > > > return true; > > > > > } > > > > > > > > > > -- > > > > > 2.20.1 > > > > > ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH] PCI: aardvark: Don't touch PCIe registers if no card connected 2020-05-28 16:38 ` Pali Rohár 2020-05-28 16:49 ` Bjorn Helgaas @ 2020-06-30 13:51 ` Bjorn Helgaas 2020-06-30 14:04 ` Pali Rohár 1 sibling, 1 reply; 32+ messages in thread From: Bjorn Helgaas @ 2020-06-30 13:51 UTC (permalink / raw) To: Pali Rohár Cc: Thomas Petazzoni, Lorenzo Pieralisi, Andrew Murray, Bjorn Helgaas, Marek Behún, Remi Pommarel, Tomasz Maciej Nowak, Xogium, linux-pci, linux-arm-kernel, linux-kernel On Thu, May 28, 2020 at 06:38:09PM +0200, Pali Rohár wrote: > On Thursday 28 May 2020 11:26:04 Bjorn Helgaas wrote: > > On Thu, May 28, 2020 at 04:31:41PM +0200, Pali Rohár wrote: > > > When there is no PCIe card connected and advk_pcie_rd_conf() or > > > advk_pcie_wr_conf() is called for PCI bus which doesn't belong to emulated > > > root bridge, the aardvark driver throws the following error message: > > > > > > advk-pcie d0070000.pcie: config read/write timed out > > > > > > Obviously accessing PCIe registers of disconnected card is not possible. > > > > > > Extend check in advk_pcie_valid_device() function for validating > > > availability of PCIe bus. If PCIe link is down, then the device is marked > > > as Not Found and the driver does not try to access these registers. 
> > > > > > Signed-off-by: Pali Rohár <pali@kernel.org> > > > --- > > > drivers/pci/controller/pci-aardvark.c | 3 +++ > > > 1 file changed, 3 insertions(+) > > > > > > diff --git a/drivers/pci/controller/pci-aardvark.c b/drivers/pci/controller/pci-aardvark.c > > > index 90ff291c24f0..53a4cfd7d377 100644 > > > --- a/drivers/pci/controller/pci-aardvark.c > > > +++ b/drivers/pci/controller/pci-aardvark.c > > > @@ -644,6 +644,9 @@ static bool advk_pcie_valid_device(struct advk_pcie *pcie, struct pci_bus *bus, > > > if ((bus->number == pcie->root_bus_nr) && PCI_SLOT(devfn) != 0) > > > return false; > > > > > > + if (bus->number != pcie->root_bus_nr && !advk_pcie_link_up(pcie)) > > > + return false; > > > > I don't think this is the right fix. This makes it racy because the > > link may go down after we call advk_pcie_valid_device() but before we > > perform the config read. > > Yes, it is racy, but I do not think it cause problems. Trying to read > PCIe registers when device is not connected cause just those timeouts, > printing error message and increased delay in advk_pcie_wait_pio() due > to polling loop. This patch reduce unnecessary access to PCIe registers > when advk_pcie_wait_pio() polling just fail. What happens when the device is removed after advk_pcie_link_up() returns true, but before we actually do the config access? ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH] PCI: aardvark: Don't touch PCIe registers if no card connected 2020-06-30 13:51 ` Bjorn Helgaas @ 2020-06-30 14:04 ` Pali Rohár 2020-06-30 14:58 ` Bjorn Helgaas 0 siblings, 1 reply; 32+ messages in thread From: Pali Rohár @ 2020-06-30 14:04 UTC (permalink / raw) To: Bjorn Helgaas Cc: Thomas Petazzoni, Lorenzo Pieralisi, Andrew Murray, Bjorn Helgaas, Marek Behún, Remi Pommarel, Tomasz Maciej Nowak, Xogium, linux-pci, linux-arm-kernel, linux-kernel On Tuesday 30 June 2020 08:51:27 Bjorn Helgaas wrote: > On Thu, May 28, 2020 at 06:38:09PM +0200, Pali Rohár wrote: > > On Thursday 28 May 2020 11:26:04 Bjorn Helgaas wrote: > > > On Thu, May 28, 2020 at 04:31:41PM +0200, Pali Rohár wrote: > > > > When there is no PCIe card connected and advk_pcie_rd_conf() or > > > > advk_pcie_wr_conf() is called for PCI bus which doesn't belong to emulated > > > > root bridge, the aardvark driver throws the following error message: > > > > > > > > advk-pcie d0070000.pcie: config read/write timed out > > > > > > > > Obviously accessing PCIe registers of disconnected card is not possible. > > > > > > > > Extend check in advk_pcie_valid_device() function for validating > > > > availability of PCIe bus. If PCIe link is down, then the device is marked > > > > as Not Found and the driver does not try to access these registers. 
> > > > > > > > Signed-off-by: Pali Rohár <pali@kernel.org> > > > > --- > > > > drivers/pci/controller/pci-aardvark.c | 3 +++ > > > > 1 file changed, 3 insertions(+) > > > > > > > > diff --git a/drivers/pci/controller/pci-aardvark.c b/drivers/pci/controller/pci-aardvark.c > > > > index 90ff291c24f0..53a4cfd7d377 100644 > > > > --- a/drivers/pci/controller/pci-aardvark.c > > > > +++ b/drivers/pci/controller/pci-aardvark.c > > > > @@ -644,6 +644,9 @@ static bool advk_pcie_valid_device(struct advk_pcie *pcie, struct pci_bus *bus, > > > > if ((bus->number == pcie->root_bus_nr) && PCI_SLOT(devfn) != 0) > > > > return false; > > > > > > > > + if (bus->number != pcie->root_bus_nr && !advk_pcie_link_up(pcie)) > > > > + return false; > > > > > > I don't think this is the right fix. This makes it racy because the > > > link may go down after we call advk_pcie_valid_device() but before we > > > perform the config read. > > > > Yes, it is racy, but I do not think it cause problems. Trying to read > > PCIe registers when device is not connected cause just those timeouts, > > printing error message and increased delay in advk_pcie_wait_pio() due > > to polling loop. This patch reduce unnecessary access to PCIe registers > > when advk_pcie_wait_pio() polling just fail. > > What happens when the device is removed after advk_pcie_link_up() > returns true, but before we actually do the config access? Do you mean to remove device physically at runtime? I was told that our board would crash or issue reset. Removing device from mini PCIe slot without power off is not supported. Anyway, currently we are trying to read from device registers even when no device is connected. So when advk_pcie_link_up() returns true and after that device is not connected (somehow board and kernel would be still alive) I guess that it would behave as without applying this patch. So kernel starts reading from register and would wait until timeout expires. 
As the device is not connected there would be no answer, so the kernel
prints the error message to dmesg (same as in the commit message) and
returns an error that the read failed.
* Re: [PATCH] PCI: aardvark: Don't touch PCIe registers if no card connected 2020-06-30 14:04 ` Pali Rohár @ 2020-06-30 14:58 ` Bjorn Helgaas 2020-07-01 8:08 ` Pali Rohár 0 siblings, 1 reply; 32+ messages in thread From: Bjorn Helgaas @ 2020-06-30 14:58 UTC (permalink / raw) To: Pali Rohár Cc: Thomas Petazzoni, Lorenzo Pieralisi, Andrew Murray, Bjorn Helgaas, Marek Behún, Remi Pommarel, Tomasz Maciej Nowak, Xogium, linux-pci, linux-arm-kernel, linux-kernel On Tue, Jun 30, 2020 at 04:04:20PM +0200, Pali Rohár wrote: > On Tuesday 30 June 2020 08:51:27 Bjorn Helgaas wrote: > > On Thu, May 28, 2020 at 06:38:09PM +0200, Pali Rohár wrote: > > > On Thursday 28 May 2020 11:26:04 Bjorn Helgaas wrote: > > > > On Thu, May 28, 2020 at 04:31:41PM +0200, Pali Rohár wrote: > > > > > When there is no PCIe card connected and advk_pcie_rd_conf() or > > > > > advk_pcie_wr_conf() is called for PCI bus which doesn't belong to emulated > > > > > root bridge, the aardvark driver throws the following error message: > > > > > > > > > > advk-pcie d0070000.pcie: config read/write timed out > > > > > > > > > > Obviously accessing PCIe registers of disconnected card is not possible. > > > > > > > > > > Extend check in advk_pcie_valid_device() function for validating > > > > > availability of PCIe bus. If PCIe link is down, then the device is marked > > > > > as Not Found and the driver does not try to access these registers. 
> > > > > > > > > > Signed-off-by: Pali Rohár <pali@kernel.org> > > > > > --- > > > > > drivers/pci/controller/pci-aardvark.c | 3 +++ > > > > > 1 file changed, 3 insertions(+) > > > > > > > > > > diff --git a/drivers/pci/controller/pci-aardvark.c b/drivers/pci/controller/pci-aardvark.c > > > > > index 90ff291c24f0..53a4cfd7d377 100644 > > > > > --- a/drivers/pci/controller/pci-aardvark.c > > > > > +++ b/drivers/pci/controller/pci-aardvark.c > > > > > @@ -644,6 +644,9 @@ static bool advk_pcie_valid_device(struct advk_pcie *pcie, struct pci_bus *bus, > > > > > if ((bus->number == pcie->root_bus_nr) && PCI_SLOT(devfn) != 0) > > > > > return false; > > > > > > > > > > + if (bus->number != pcie->root_bus_nr && !advk_pcie_link_up(pcie)) > > > > > + return false; > > > > > > > > I don't think this is the right fix. This makes it racy because the > > > > link may go down after we call advk_pcie_valid_device() but before we > > > > perform the config read. > > > > > > Yes, it is racy, but I do not think it cause problems. Trying to read > > > PCIe registers when device is not connected cause just those timeouts, > > > printing error message and increased delay in advk_pcie_wait_pio() due > > > to polling loop. This patch reduce unnecessary access to PCIe registers > > > when advk_pcie_wait_pio() polling just fail. > > > > What happens when the device is removed after advk_pcie_link_up() > > returns true, but before we actually do the config access? > > Do you mean to remove device physically at runtime? I was told that our > board would crash or issue reset. Removing device from mini PCIe slot > without power off is not supported. Right, I don't think PCIe mini cards support hotplug. > Anyway, currently we are trying to read from device registers even when > no device is connected. So when advk_pcie_link_up() returns true and > after that device is not connected (somehow board and kernel would be > still alive) I guess that it would behave as without applying this > patch. 
So kernel starts reading from register and would wait until
> timeout expires. As device is not connected there would be no answer,
> so kernel print error message to dmesg (same as in commit message) and
> returns error that read failed.

OK, so if I understand correctly, checking advk_pcie_link_up() is
strictly an optimization.  If we guess wrong (e.g., after calling
advk_pcie_link_up(), the link went down because the card was removed,
DPC triggered, etc), the only bad thing is that we wait for a timeout;
it never causes a crash.

If that's the case, I'm fine with this.  But please add a comment to
that effect.

I think several other drivers check for the link being up because we
actually crash if we try to read config space when the link is down.
That's what I was trying to avoid here.

Bjorn
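The "strictly an optimization" reading can be sketched as a user-space
model. Everything here is an assumption made for illustration: struct
fake_port, its drop_before_access flag, fake_valid_device(),
fake_rd_conf() and demo_race() are hypothetical, the model covers only
accesses behind the root bridge, and the PCIBIOS_* values are redefined
locally to mirror the kernel's constants.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Redefined for this sketch; values mirror the kernel's PCIBIOS_* codes. */
#define PCIBIOS_SUCCESSFUL       0x00
#define PCIBIOS_DEVICE_NOT_FOUND 0x86
#define PCIBIOS_SET_FAILED       0x88

/* Hypothetical port model for a device behind the root bridge. */
struct fake_port {
	bool link_up;
	bool drop_before_access; /* simulate link loss right after the check */
	int timeouts;            /* how often the slow timeout path was taken */
};

/* Fast path only: a stale "link up" answer is tolerated by the caller. */
static bool fake_valid_device(const struct fake_port *p)
{
	return p->link_up;
}

static int fake_rd_conf(struct fake_port *p, uint32_t *val)
{
	if (!fake_valid_device(p)) {
		*val = 0xffffffff;
		return PCIBIOS_DEVICE_NOT_FOUND; /* no register access, no wait */
	}

	if (p->drop_before_access)
		p->link_up = false; /* the race: link drops after the check */

	if (!p->link_up) {
		p->timeouts++; /* same outcome as without the check: a timeout */
		*val = 0xffffffff;
		return PCIBIOS_SET_FAILED;
	}

	*val = 0x12345678;
	return PCIBIOS_SUCCESSFUL;
}

void demo_race(void)
{
	struct fake_port down = { .link_up = false };
	struct fake_port racy = { .link_up = true, .drop_before_access = true };
	uint32_t val = 0;

	/* Link known to be down: rejected immediately, no timeout paid. */
	assert(fake_rd_conf(&down, &val) == PCIBIOS_DEVICE_NOT_FOUND);
	assert(down.timeouts == 0);

	/* Link lost after the check: one timeout, error returned, no crash. */
	assert(fake_rd_conf(&racy, &val) == PCIBIOS_SET_FAILED);
	assert(racy.timeouts == 1 && val == 0xffffffff);
}
```

Losing the race in this model only falls back to the slow path that
exists today, so the check changes how long a failed access takes, not
whether it is safe.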
* Re: [PATCH] PCI: aardvark: Don't touch PCIe registers if no card connected 2020-06-30 14:58 ` Bjorn Helgaas @ 2020-07-01 8:08 ` Pali Rohár 0 siblings, 0 replies; 32+ messages in thread From: Pali Rohár @ 2020-07-01 8:08 UTC (permalink / raw) To: Bjorn Helgaas Cc: Thomas Petazzoni, Lorenzo Pieralisi, Andrew Murray, Bjorn Helgaas, Marek Behún, Remi Pommarel, Tomasz Maciej Nowak, Xogium, linux-pci, linux-arm-kernel, linux-kernel On Tuesday 30 June 2020 09:58:48 Bjorn Helgaas wrote: > On Tue, Jun 30, 2020 at 04:04:20PM +0200, Pali Rohár wrote: > > On Tuesday 30 June 2020 08:51:27 Bjorn Helgaas wrote: > > > On Thu, May 28, 2020 at 06:38:09PM +0200, Pali Rohár wrote: > > > > On Thursday 28 May 2020 11:26:04 Bjorn Helgaas wrote: > > > > > On Thu, May 28, 2020 at 04:31:41PM +0200, Pali Rohár wrote: > > > > > > When there is no PCIe card connected and advk_pcie_rd_conf() or > > > > > > advk_pcie_wr_conf() is called for PCI bus which doesn't belong to emulated > > > > > > root bridge, the aardvark driver throws the following error message: > > > > > > > > > > > > advk-pcie d0070000.pcie: config read/write timed out > > > > > > > > > > > > Obviously accessing PCIe registers of disconnected card is not possible. > > > > > > > > > > > > Extend check in advk_pcie_valid_device() function for validating > > > > > > availability of PCIe bus. If PCIe link is down, then the device is marked > > > > > > as Not Found and the driver does not try to access these registers. 
> > > > > > > > > > > > Signed-off-by: Pali Rohár <pali@kernel.org> > > > > > > --- > > > > > > drivers/pci/controller/pci-aardvark.c | 3 +++ > > > > > > 1 file changed, 3 insertions(+) > > > > > > > > > > > > diff --git a/drivers/pci/controller/pci-aardvark.c b/drivers/pci/controller/pci-aardvark.c > > > > > > index 90ff291c24f0..53a4cfd7d377 100644 > > > > > > --- a/drivers/pci/controller/pci-aardvark.c > > > > > > +++ b/drivers/pci/controller/pci-aardvark.c > > > > > > @@ -644,6 +644,9 @@ static bool advk_pcie_valid_device(struct advk_pcie *pcie, struct pci_bus *bus, > > > > > > if ((bus->number == pcie->root_bus_nr) && PCI_SLOT(devfn) != 0) > > > > > > return false; > > > > > > > > > > > > + if (bus->number != pcie->root_bus_nr && !advk_pcie_link_up(pcie)) > > > > > > + return false; > > > > > > > > > > I don't think this is the right fix. This makes it racy because the > > > > > link may go down after we call advk_pcie_valid_device() but before we > > > > > perform the config read. > > > > > > > > Yes, it is racy, but I do not think it cause problems. Trying to read > > > > PCIe registers when device is not connected cause just those timeouts, > > > > printing error message and increased delay in advk_pcie_wait_pio() due > > > > to polling loop. This patch reduce unnecessary access to PCIe registers > > > > when advk_pcie_wait_pio() polling just fail. > > > > > > What happens when the device is removed after advk_pcie_link_up() > > > returns true, but before we actually do the config access? > > > > Do you mean to remove device physically at runtime? I was told that our > > board would crash or issue reset. Removing device from mini PCIe slot > > without power off is not supported. > > Right, I don't think PCIe mini cards support hotplug. > > > Anyway, currently we are trying to read from device registers even when > > no device is connected. 
> > So when advk_pcie_link_up() returns true and
> > after that device is not connected (somehow board and kernel would be
> > still alive) I guess that it would behave as without applying this
> > patch. So kernel starts reading from register and would wait until
> > timeout expires. As device is not connected there would be no answer,
> > so kernel print error message to dmesg (same as in commit message) and
> > returns error that read failed.
>
> OK, so if I understand correctly, checking advk_pcie_link_up() is
> strictly an optimization. If we guess wrong (e.g., after calling
> advk_pcie_link_up(), the link went down because the card was removed,
> DPC triggered, etc), the only bad thing is that we wait for a timeout;
> it never causes a crash.

Yes.

> If that's the case, I'm fine with this. But please add a comment to
> that effect.

Ok, I will send V2 with updated commit message.

> I think several other drivers check for the link being up because we
> actually crash if we try to read config space when the link is down.
> That's what I was trying to avoid here.
>
> Bjorn
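The point agreed on above (the link-up check is only a fast path, and losing the race costs a timeout rather than a crash) can be modelled in a few lines of user-space C. This is a minimal sketch, not the driver code: struct mock_pcie, mock_valid_device(), mock_rd_conf() and the link_drops_after_check flag are hypothetical stand-ins invented for illustration, and the all-ones value on failure is an assumption about how a failed read is reported.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the controller state; NOT the real struct advk_pcie. */
struct mock_pcie {
	int root_bus_nr;
	bool link_up;
	unsigned int timeouts;	/* config accesses that had to wait for a timeout */
};

enum conf_result { CONF_OK, CONF_DEVICE_NOT_FOUND, CONF_TIMED_OUT };

/* Mirrors the shape of the check discussed in this thread. */
static bool mock_valid_device(const struct mock_pcie *pcie, int bus, int slot)
{
	/* Only slot 0 exists on the emulated root bus. */
	if (bus == pcie->root_bus_nr && slot != 0)
		return false;

	/* Fast path: nothing can answer below the root bus while the link is down. */
	if (bus != pcie->root_bus_nr && !pcie->link_up)
		return false;

	return true;
}

/*
 * Simulated config read. link_drops_after_check models the race: the link
 * goes down after the validity check but before the access itself.
 */
static enum conf_result mock_rd_conf(struct mock_pcie *pcie, int bus, int slot,
				     bool link_drops_after_check, uint32_t *val)
{
	if (!mock_valid_device(pcie, bus, slot)) {
		*val = 0xffffffff;
		return CONF_DEVICE_NOT_FOUND;	/* no access attempted, no delay */
	}

	if (link_drops_after_check)
		pcie->link_up = false;

	if (bus != pcie->root_bus_nr && !pcie->link_up) {
		pcie->timeouts++;		/* slow path: wait, then report failure */
		*val = 0xffffffff;
		return CONF_TIMED_OUT;
	}

	*val = 0x12345678;			/* arbitrary placeholder register value */
	return CONF_OK;
}
```

With the link down from the start, a read below the root bus returns CONF_DEVICE_NOT_FOUND and the timeout counter stays at zero. With the link dropping between the check and the access, the same read returns CONF_TIMED_OUT and the counter goes to one, which is the behaviour the driver already had before the patch.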
* [PATCH v2] PCI: aardvark: Don't touch PCIe registers if no card connected 2020-05-28 14:31 [PATCH] PCI: aardvark: Don't touch PCIe registers if no card connected Pali Rohár 2020-05-28 16:26 ` Bjorn Helgaas @ 2020-07-01 8:20 ` Pali Rohár 2020-07-01 21:34 ` Bjorn Helgaas 2020-07-02 8:30 ` [PATCH v3] " Pali Rohár 2 siblings, 1 reply; 32+ messages in thread From: Pali Rohár @ 2020-07-01 8:20 UTC (permalink / raw) To: Thomas Petazzoni, Lorenzo Pieralisi, Andrew Murray, Bjorn Helgaas, Marek Behún, Remi Pommarel, Tomasz Maciej Nowak, Xogium Cc: linux-pci, linux-arm-kernel, linux-kernel When there is no PCIe card connected and advk_pcie_rd_conf() or advk_pcie_wr_conf() is called for PCI bus which doesn't belong to emulated root bridge, the aardvark driver throws the following error message: advk-pcie d0070000.pcie: config read/write timed out Obviously accessing PCIe registers of disconnected card is not possible. Extend check in advk_pcie_valid_device() function for validating availability of PCIe bus. If PCIe link is down, then the device is marked as Not Found and the driver does not try to access these registers. This is just an optimization to prevent accessing PCIe registers when card is disconnected. Trying to access PCIe registers of disconnected card does not cause any crash, kernel just needs to wait for a timeout. So if card disappear immediately after checking for PCIe link (before accessing PCIe registers), it does not cause any problems. 
Signed-off-by: Pali Rohár <pali@kernel.org> --- Changes in V2: * Update commit message, mention that this is optimization --- drivers/pci/controller/pci-aardvark.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/drivers/pci/controller/pci-aardvark.c b/drivers/pci/controller/pci-aardvark.c index 90ff291c24f0..53a4cfd7d377 100644 --- a/drivers/pci/controller/pci-aardvark.c +++ b/drivers/pci/controller/pci-aardvark.c @@ -644,6 +644,9 @@ static bool advk_pcie_valid_device(struct advk_pcie *pcie, struct pci_bus *bus, if ((bus->number == pcie->root_bus_nr) && PCI_SLOT(devfn) != 0) return false; + if (bus->number != pcie->root_bus_nr && !advk_pcie_link_up(pcie)) + return false; + return true; } -- 2.20.1 ^ permalink raw reply related [flat|nested] 32+ messages in thread
* Re: [PATCH v2] PCI: aardvark: Don't touch PCIe registers if no card connected 2020-07-01 8:20 ` [PATCH v2] " Pali Rohár @ 2020-07-01 21:34 ` Bjorn Helgaas 2020-07-02 8:23 ` Pali Rohár 0 siblings, 1 reply; 32+ messages in thread From: Bjorn Helgaas @ 2020-07-01 21:34 UTC (permalink / raw) To: Pali Rohár Cc: Thomas Petazzoni, Lorenzo Pieralisi, Andrew Murray, Bjorn Helgaas, Marek Behún, Remi Pommarel, Tomasz Maciej Nowak, Xogium, linux-pci, linux-arm-kernel, linux-kernel On Wed, Jul 01, 2020 at 10:20:44AM +0200, Pali Rohár wrote: > When there is no PCIe card connected and advk_pcie_rd_conf() or > advk_pcie_wr_conf() is called for PCI bus which doesn't belong to emulated > root bridge, the aardvark driver throws the following error message: > > advk-pcie d0070000.pcie: config read/write timed out > > Obviously accessing PCIe registers of disconnected card is not possible. > > Extend check in advk_pcie_valid_device() function for validating > availability of PCIe bus. If PCIe link is down, then the device is marked > as Not Found and the driver does not try to access these registers. > > This is just an optimization to prevent accessing PCIe registers when card > is disconnected. Trying to access PCIe registers of disconnected card does > not cause any crash, kernel just needs to wait for a timeout. So if card > disappear immediately after checking for PCIe link (before accessing PCIe > registers), it does not cause any problems. Thanks, this is good. I'd really like a short comment in the code as well, because this sort of link-up check tends to get copied to new drivers where it shouldn't be used, e.g., something like this: /* * If the link goes down after we check for link-up, nothing bad * happens but the config access times out. 
*/ > Signed-off-by: Pali Rohár <pali@kernel.org> > > --- > Changes in V2: > * Update commit message, mention that this is optimization > --- > drivers/pci/controller/pci-aardvark.c | 3 +++ > 1 file changed, 3 insertions(+) > > diff --git a/drivers/pci/controller/pci-aardvark.c b/drivers/pci/controller/pci-aardvark.c > index 90ff291c24f0..53a4cfd7d377 100644 > --- a/drivers/pci/controller/pci-aardvark.c > +++ b/drivers/pci/controller/pci-aardvark.c > @@ -644,6 +644,9 @@ static bool advk_pcie_valid_device(struct advk_pcie *pcie, struct pci_bus *bus, > if ((bus->number == pcie->root_bus_nr) && PCI_SLOT(devfn) != 0) > return false; > > + if (bus->number != pcie->root_bus_nr && !advk_pcie_link_up(pcie)) > + return false; > + > return true; > } > > -- > 2.20.1 > ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH v2] PCI: aardvark: Don't touch PCIe registers if no card connected 2020-07-01 21:34 ` Bjorn Helgaas @ 2020-07-02 8:23 ` Pali Rohár 0 siblings, 0 replies; 32+ messages in thread From: Pali Rohár @ 2020-07-02 8:23 UTC (permalink / raw) To: Bjorn Helgaas Cc: Thomas Petazzoni, Lorenzo Pieralisi, Andrew Murray, Bjorn Helgaas, Marek Behún, Remi Pommarel, Tomasz Maciej Nowak, Xogium, linux-pci, linux-arm-kernel, linux-kernel On Wednesday 01 July 2020 16:34:42 Bjorn Helgaas wrote: > On Wed, Jul 01, 2020 at 10:20:44AM +0200, Pali Rohár wrote: > > When there is no PCIe card connected and advk_pcie_rd_conf() or > > advk_pcie_wr_conf() is called for PCI bus which doesn't belong to emulated > > root bridge, the aardvark driver throws the following error message: > > > > advk-pcie d0070000.pcie: config read/write timed out > > > > Obviously accessing PCIe registers of disconnected card is not possible. > > > > Extend check in advk_pcie_valid_device() function for validating > > availability of PCIe bus. If PCIe link is down, then the device is marked > > as Not Found and the driver does not try to access these registers. > > > > This is just an optimization to prevent accessing PCIe registers when card > > is disconnected. Trying to access PCIe registers of disconnected card does > > not cause any crash, kernel just needs to wait for a timeout. So if card > > disappear immediately after checking for PCIe link (before accessing PCIe > > registers), it does not cause any problems. > > Thanks, this is good. I'd really like a short comment in the code as > well, because this sort of link-up check tends to get copied to new > drivers where it shouldn't be used, e.g., something like this: > > /* > * If the link goes down after we check for link-up, nothing bad > * happens but the config access times out. > */ Ok, it makes sense! I will send a new patch version. 
> > Signed-off-by: Pali Rohár <pali@kernel.org> > > > > --- > > Changes in V2: > > * Update commit message, mention that this is optimization > > --- > > drivers/pci/controller/pci-aardvark.c | 3 +++ > > 1 file changed, 3 insertions(+) > > > > diff --git a/drivers/pci/controller/pci-aardvark.c b/drivers/pci/controller/pci-aardvark.c > > index 90ff291c24f0..53a4cfd7d377 100644 > > --- a/drivers/pci/controller/pci-aardvark.c > > +++ b/drivers/pci/controller/pci-aardvark.c > > @@ -644,6 +644,9 @@ static bool advk_pcie_valid_device(struct advk_pcie *pcie, struct pci_bus *bus, > > if ((bus->number == pcie->root_bus_nr) && PCI_SLOT(devfn) != 0) > > return false; > > > > + if (bus->number != pcie->root_bus_nr && !advk_pcie_link_up(pcie)) > > + return false; > > + > > return true; > > } > > > > -- > > 2.20.1 > > ^ permalink raw reply [flat|nested] 32+ messages in thread
* [PATCH v3] PCI: aardvark: Don't touch PCIe registers if no card connected
From: Pali Rohár @ 2020-07-02 8:30 UTC
To: Thomas Petazzoni, Lorenzo Pieralisi, Andrew Murray, Bjorn Helgaas,
	Marek Behún, Remi Pommarel, Tomasz Maciej Nowak, Xogium
Cc: linux-pci, linux-arm-kernel, linux-kernel

When there is no PCIe card connected and advk_pcie_rd_conf() or
advk_pcie_wr_conf() is called for PCI bus which doesn't belong to emulated
root bridge, the aardvark driver throws the following error message:

  advk-pcie d0070000.pcie: config read/write timed out

Obviously accessing PCIe registers of disconnected card is not possible.

Extend check in advk_pcie_valid_device() function for validating
availability of PCIe bus. If PCIe link is down, then the device is marked
as Not Found and the driver does not try to access these registers.

This is just an optimization to prevent accessing PCIe registers when card
is disconnected. Trying to access PCIe registers of disconnected card does
not cause any crash, kernel just needs to wait for a timeout. So if card
disappear immediately after checking for PCIe link (before accessing PCIe
registers), it does not cause any problems.

Signed-off-by: Pali Rohár <pali@kernel.org>

---
Changes in V3:
* Add comment to the code
Changes in V2:
* Update commit message, mention that this is optimization
---
 drivers/pci/controller/pci-aardvark.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/drivers/pci/controller/pci-aardvark.c b/drivers/pci/controller/pci-aardvark.c
index 90ff291c24f0..d18f389b36a1 100644
--- a/drivers/pci/controller/pci-aardvark.c
+++ b/drivers/pci/controller/pci-aardvark.c
@@ -644,6 +644,13 @@ static bool advk_pcie_valid_device(struct advk_pcie *pcie, struct pci_bus *bus,
 	if ((bus->number == pcie->root_bus_nr) && PCI_SLOT(devfn) != 0)
 		return false;
 
+	/*
+	 * If the link goes down after we check for link-up, nothing bad
+	 * happens but the config access times out.
+	 */
+	if (bus->number != pcie->root_bus_nr && !advk_pcie_link_up(pcie))
+		return false;
+
 	return true;
 }
 
-- 
2.20.1
* Re: [PATCH v3] PCI: aardvark: Don't touch PCIe registers if no card connected 2020-07-02 8:30 ` [PATCH v3] " Pali Rohár @ 2020-07-09 11:35 ` Lorenzo Pieralisi 2020-07-09 12:22 ` Pali Rohár 0 siblings, 1 reply; 32+ messages in thread From: Lorenzo Pieralisi @ 2020-07-09 11:35 UTC (permalink / raw) To: Pali Rohár Cc: Thomas Petazzoni, Andrew Murray, Bjorn Helgaas, Marek Behún, Remi Pommarel, Tomasz Maciej Nowak, Xogium, linux-pci, linux-arm-kernel, linux-kernel On Thu, Jul 02, 2020 at 10:30:36AM +0200, Pali Rohár wrote: > When there is no PCIe card connected and advk_pcie_rd_conf() or > advk_pcie_wr_conf() is called for PCI bus which doesn't belong to emulated > root bridge, the aardvark driver throws the following error message: > > advk-pcie d0070000.pcie: config read/write timed out > > Obviously accessing PCIe registers of disconnected card is not possible. > > Extend check in advk_pcie_valid_device() function for validating > availability of PCIe bus. If PCIe link is down, then the device is marked > as Not Found and the driver does not try to access these registers. > > This is just an optimization to prevent accessing PCIe registers when card > is disconnected. Trying to access PCIe registers of disconnected card does > not cause any crash, kernel just needs to wait for a timeout. So if card > disappear immediately after checking for PCIe link (before accessing PCIe > registers), it does not cause any problems. 
> > Signed-off-by: Pali Rohár <pali@kernel.org> > > --- > Changes in V3: > * Add comment to the code > Changes in V2: > * Update commit message, mention that this is optimization > --- > drivers/pci/controller/pci-aardvark.c | 7 +++++++ > 1 file changed, 7 insertions(+) > > diff --git a/drivers/pci/controller/pci-aardvark.c b/drivers/pci/controller/pci-aardvark.c > index 90ff291c24f0..d18f389b36a1 100644 > --- a/drivers/pci/controller/pci-aardvark.c > +++ b/drivers/pci/controller/pci-aardvark.c > @@ -644,6 +644,13 @@ static bool advk_pcie_valid_device(struct advk_pcie *pcie, struct pci_bus *bus, > if ((bus->number == pcie->root_bus_nr) && PCI_SLOT(devfn) != 0) > return false; > > + /* > + * If the link goes down after we check for link-up, nothing bad > + * happens but the config access times out. > + */ > + if (bus->number != pcie->root_bus_nr && !advk_pcie_link_up(pcie)) > + return false; > + > return true; > } Question: this basically means that you can only effectively enumerate bus number == root_bus_nr and AFAICS if at probe the link did not come up it will never do, will it ? Isn't this equivalent to limiting the bus numbers the bridge is capable of handling ? Reworded: if in advk_pcie_setup_hw() the link does not come up, what's the point of trying to enumerate the bus hierarchy below the root bus ? Thanks, Lorenzo ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH v3] PCI: aardvark: Don't touch PCIe registers if no card connected 2020-07-09 11:35 ` Lorenzo Pieralisi @ 2020-07-09 12:22 ` Pali Rohár 2020-07-09 14:47 ` Lorenzo Pieralisi 0 siblings, 1 reply; 32+ messages in thread From: Pali Rohár @ 2020-07-09 12:22 UTC (permalink / raw) To: Lorenzo Pieralisi Cc: Thomas Petazzoni, Andrew Murray, Bjorn Helgaas, Marek Behún, Remi Pommarel, Tomasz Maciej Nowak, Xogium, linux-pci, linux-arm-kernel, linux-kernel On Thursday 09 July 2020 12:35:09 Lorenzo Pieralisi wrote: > On Thu, Jul 02, 2020 at 10:30:36AM +0200, Pali Rohár wrote: > > When there is no PCIe card connected and advk_pcie_rd_conf() or > > advk_pcie_wr_conf() is called for PCI bus which doesn't belong to emulated > > root bridge, the aardvark driver throws the following error message: > > > > advk-pcie d0070000.pcie: config read/write timed out > > > > Obviously accessing PCIe registers of disconnected card is not possible. > > > > Extend check in advk_pcie_valid_device() function for validating > > availability of PCIe bus. If PCIe link is down, then the device is marked > > as Not Found and the driver does not try to access these registers. > > > > This is just an optimization to prevent accessing PCIe registers when card > > is disconnected. Trying to access PCIe registers of disconnected card does > > not cause any crash, kernel just needs to wait for a timeout. So if card > > disappear immediately after checking for PCIe link (before accessing PCIe > > registers), it does not cause any problems. 
> > > > Signed-off-by: Pali Rohár <pali@kernel.org> > > > > --- > > Changes in V3: > > * Add comment to the code > > Changes in V2: > > * Update commit message, mention that this is optimization > > --- > > drivers/pci/controller/pci-aardvark.c | 7 +++++++ > > 1 file changed, 7 insertions(+) > > > > diff --git a/drivers/pci/controller/pci-aardvark.c b/drivers/pci/controller/pci-aardvark.c > > index 90ff291c24f0..d18f389b36a1 100644 > > --- a/drivers/pci/controller/pci-aardvark.c > > +++ b/drivers/pci/controller/pci-aardvark.c > > @@ -644,6 +644,13 @@ static bool advk_pcie_valid_device(struct advk_pcie *pcie, struct pci_bus *bus, > > if ((bus->number == pcie->root_bus_nr) && PCI_SLOT(devfn) != 0) > > return false; > > > > + /* > > + * If the link goes down after we check for link-up, nothing bad > > + * happens but the config access times out. > > + */ > > + if (bus->number != pcie->root_bus_nr && !advk_pcie_link_up(pcie)) > > + return false; > > + > > return true; > > } > > Question: this basically means that you can only effectively enumerate > bus number == root_bus_nr and AFAICS if at probe the link did not > come up it will never do, will it ? > > Isn't this equivalent to limiting the bus numbers the bridge is capable > of handling ? > > Reworded: if in advk_pcie_setup_hw() the link does not come up, what's > the point of trying to enumerate the bus hierarchy below the root bus ? Hello Lorenzo! PCIe link can theoretically come up even after boot, but aardvark driver currently does not support link detection at runtime. So it checks and enumerate device only at probe time. I do not know if hardware has some mechanism to inform kernel that PCIe link come up (or down) and re-enumeration is required. Or the only option is polling via advk_pcie_link_up(). So if device is not visible at the probe time then it would not appear in system and cannot be used. This is current state. Just to note that our hardware does not support physical hotplug of mPCIe cards. 
You need to connect the card while the board is powered off.

So if the PCIe link is not up at aardvark probe time, then trying to
enumerate devices under the (software) root bridge is not needed. But it
is still necessary to register/enumerate the software root bridge device,
and currently both are done by one (recursive) call to pci_host_probe().
* Re: [PATCH v3] PCI: aardvark: Don't touch PCIe registers if no card connected 2020-07-09 12:22 ` Pali Rohár @ 2020-07-09 14:47 ` Lorenzo Pieralisi 2020-07-09 15:09 ` Pali Rohár 0 siblings, 1 reply; 32+ messages in thread From: Lorenzo Pieralisi @ 2020-07-09 14:47 UTC (permalink / raw) To: Pali Rohár Cc: Thomas Petazzoni, Andrew Murray, Bjorn Helgaas, Marek Behún, Remi Pommarel, Tomasz Maciej Nowak, Xogium, linux-pci, linux-arm-kernel, linux-kernel On Thu, Jul 09, 2020 at 02:22:08PM +0200, Pali Rohár wrote: > On Thursday 09 July 2020 12:35:09 Lorenzo Pieralisi wrote: > > On Thu, Jul 02, 2020 at 10:30:36AM +0200, Pali Rohár wrote: > > > When there is no PCIe card connected and advk_pcie_rd_conf() or > > > advk_pcie_wr_conf() is called for PCI bus which doesn't belong to emulated > > > root bridge, the aardvark driver throws the following error message: > > > > > > advk-pcie d0070000.pcie: config read/write timed out > > > > > > Obviously accessing PCIe registers of disconnected card is not possible. > > > > > > Extend check in advk_pcie_valid_device() function for validating > > > availability of PCIe bus. If PCIe link is down, then the device is marked > > > as Not Found and the driver does not try to access these registers. > > > > > > This is just an optimization to prevent accessing PCIe registers when card > > > is disconnected. Trying to access PCIe registers of disconnected card does > > > not cause any crash, kernel just needs to wait for a timeout. So if card > > > disappear immediately after checking for PCIe link (before accessing PCIe > > > registers), it does not cause any problems. 
> > > > > > Signed-off-by: Pali Rohár <pali@kernel.org> > > > > > > --- > > > Changes in V3: > > > * Add comment to the code > > > Changes in V2: > > > * Update commit message, mention that this is optimization > > > --- > > > drivers/pci/controller/pci-aardvark.c | 7 +++++++ > > > 1 file changed, 7 insertions(+) > > > > > > diff --git a/drivers/pci/controller/pci-aardvark.c b/drivers/pci/controller/pci-aardvark.c > > > index 90ff291c24f0..d18f389b36a1 100644 > > > --- a/drivers/pci/controller/pci-aardvark.c > > > +++ b/drivers/pci/controller/pci-aardvark.c > > > @@ -644,6 +644,13 @@ static bool advk_pcie_valid_device(struct advk_pcie *pcie, struct pci_bus *bus, > > > if ((bus->number == pcie->root_bus_nr) && PCI_SLOT(devfn) != 0) > > > return false; > > > > > > + /* > > > + * If the link goes down after we check for link-up, nothing bad > > > + * happens but the config access times out. > > > + */ > > > + if (bus->number != pcie->root_bus_nr && !advk_pcie_link_up(pcie)) > > > + return false; > > > + > > > return true; > > > } > > > > Question: this basically means that you can only effectively enumerate > > bus number == root_bus_nr and AFAICS if at probe the link did not > > come up it will never do, will it ? > > > > Isn't this equivalent to limiting the bus numbers the bridge is capable > > of handling ? > > > > Reworded: if in advk_pcie_setup_hw() the link does not come up, what's > > the point of trying to enumerate the bus hierarchy below the root bus ? > > Hello Lorenzo! > > PCIe link can theoretically come up even after boot, but aardvark driver > currently does not support link detection at runtime. So it checks and > enumerate device only at probe time. If the link is not up at probe enumerating devices below the root bus is basically useless and that's actually what is causing the delays you are fixing. Is this correct ? > I do not know if hardware has some mechanism to inform kernel that PCIe > link come up (or down) and re-enumeration is required. 
> Or the only
> option is polling via advk_pcie_link_up().
>
> So if device is not visible at the probe time then it would not appear
> in system and cannot be used. This is current state.
>
> Just to note that our hardware does not support physical hotplug of
> mPCIe cards. You need to connect card when board is powered off.
>
> So if at the aardvark probe time PCIe link is not up then trying to
> enumerate devices under (software) root bridge is not needed. But it is
> needed to register/enumerate software root bridge device and currently
> both is done by one (recursive) call pci_host_probe().

I understand that, but the bridge bus resource can be trimmed to just
contain the root bus because that's the only one where there is a
chance you can enumerate a device.

I would like to get Bjorn's opinion on this. I don't like these "link is
up" checks in config accessors (they are racy and honestly it is a
run-time check that does not make much sense: either it is always
true/false or it is inevitably racy). I was wondering if we can find an
alternative solution, but I am not sure the one I suggested above is
better than this patch.

Lorenzo
* Re: [PATCH v3] PCI: aardvark: Don't touch PCIe registers if no card connected 2020-07-09 14:47 ` Lorenzo Pieralisi @ 2020-07-09 15:09 ` Pali Rohár 2020-07-10 9:18 ` Lorenzo Pieralisi 0 siblings, 1 reply; 32+ messages in thread From: Pali Rohár @ 2020-07-09 15:09 UTC (permalink / raw) To: Lorenzo Pieralisi Cc: Thomas Petazzoni, Andrew Murray, Bjorn Helgaas, Marek Behún, Remi Pommarel, Tomasz Maciej Nowak, Xogium, linux-pci, linux-arm-kernel, linux-kernel On Thursday 09 July 2020 15:47:01 Lorenzo Pieralisi wrote: > On Thu, Jul 09, 2020 at 02:22:08PM +0200, Pali Rohár wrote: > > On Thursday 09 July 2020 12:35:09 Lorenzo Pieralisi wrote: > > > On Thu, Jul 02, 2020 at 10:30:36AM +0200, Pali Rohár wrote: > > > > When there is no PCIe card connected and advk_pcie_rd_conf() or > > > > advk_pcie_wr_conf() is called for PCI bus which doesn't belong to emulated > > > > root bridge, the aardvark driver throws the following error message: > > > > > > > > advk-pcie d0070000.pcie: config read/write timed out > > > > > > > > Obviously accessing PCIe registers of disconnected card is not possible. > > > > > > > > Extend check in advk_pcie_valid_device() function for validating > > > > availability of PCIe bus. If PCIe link is down, then the device is marked > > > > as Not Found and the driver does not try to access these registers. > > > > > > > > This is just an optimization to prevent accessing PCIe registers when card > > > > is disconnected. Trying to access PCIe registers of disconnected card does > > > > not cause any crash, kernel just needs to wait for a timeout. So if card > > > > disappear immediately after checking for PCIe link (before accessing PCIe > > > > registers), it does not cause any problems. 
> > > > > > > > Signed-off-by: Pali Rohár <pali@kernel.org> > > > > > > > > --- > > > > Changes in V3: > > > > * Add comment to the code > > > > Changes in V2: > > > > * Update commit message, mention that this is optimization > > > > --- > > > > drivers/pci/controller/pci-aardvark.c | 7 +++++++ > > > > 1 file changed, 7 insertions(+) > > > > > > > > diff --git a/drivers/pci/controller/pci-aardvark.c b/drivers/pci/controller/pci-aardvark.c > > > > index 90ff291c24f0..d18f389b36a1 100644 > > > > --- a/drivers/pci/controller/pci-aardvark.c > > > > +++ b/drivers/pci/controller/pci-aardvark.c > > > > @@ -644,6 +644,13 @@ static bool advk_pcie_valid_device(struct advk_pcie *pcie, struct pci_bus *bus, > > > > if ((bus->number == pcie->root_bus_nr) && PCI_SLOT(devfn) != 0) > > > > return false; > > > > > > > > + /* > > > > + * If the link goes down after we check for link-up, nothing bad > > > > + * happens but the config access times out. > > > > + */ > > > > + if (bus->number != pcie->root_bus_nr && !advk_pcie_link_up(pcie)) > > > > + return false; > > > > + > > > > return true; > > > > } > > > > > > Question: this basically means that you can only effectively enumerate > > > bus number == root_bus_nr and AFAICS if at probe the link did not > > > come up it will never do, will it ? > > > > > > Isn't this equivalent to limiting the bus numbers the bridge is capable > > > of handling ? > > > > > > Reworded: if in advk_pcie_setup_hw() the link does not come up, what's > > > the point of trying to enumerate the bus hierarchy below the root bus ? > > > > Hello Lorenzo! > > > > PCIe link can theoretically come up even after boot, but aardvark driver > > currently does not support link detection at runtime. So it checks and > > enumerate device only at probe time. > > If the link is not up at probe enumerating devices below the root > bus is basically useless and that's actually what is causing the > delays you are fixing. Is this correct ? 
Yes, this is one delay (but not the only one).

> > I do not know if hardware has some mechanism to inform kernel that PCIe
> > link come up (or down) and re-enumeration is required. Or the only
> > option is polling via advk_pcie_link_up().
> >
> > So if device is not visible at the probe time then it would not appear
> > in system and cannot be used. This is current state.
> >
> > Just to note that our hardware does not support physical hotplug of
> > mPCIe cards. You need to connect card when board is powered off.
> >
> > So if at the aardvark probe time PCIe link is not up then trying to
> > enumerate devices under (software) root bridge is not needed. But it is
> > needed to register/enumerate software root bridge device and currently
> > both is done by one (recursive) call pci_host_probe().
>
> I understand that but the bridge bus resource can be trimmed to just
> contain the root bus because that's the only one where there is a
> chance you can enumerate a device.

Is it possible to register only the root bridge without an endpoint?

> I would like to get Bjorn's opinion on this, I don't like these "link is
> up" checks in config accessors (they are racy and honestly it is a
> run-time check that does not make much sense, either it is always
> true/false or it is inevitably racy)

It is a runtime check, but it does not have to be always true/false. I
have tested more Compex wifi cards and under certain conditions they
"disappear" from the bus during usage.

So I think it still makes sense to do this "fast" check, as it is only
an optimization.

> I was wondering if we can find an
> alternative solution but I am not sure the one I suggested above is
> better than this patch.

I do not know if it helps in the situation when a card disappears from
the bus at runtime...
* Re: [PATCH v3] PCI: aardvark: Don't touch PCIe registers if no card connected
From: Lorenzo Pieralisi @ 2020-07-10 9:18 UTC
To: Pali Rohár
Cc: Thomas Petazzoni, Andrew Murray, Bjorn Helgaas, Marek Behún,
	Remi Pommarel, Tomasz Maciej Nowak, Xogium, linux-pci,
	linux-arm-kernel, linux-kernel

On Thu, Jul 09, 2020 at 05:09:59PM +0200, Pali Rohár wrote:

[...]

> > I understand that but the bridge bus resource can be trimmed to just
> > contain the root bus because that's the only one where there is a
> > chance you can enumerate a device.
>
> It is possible to register only root bridge without endpoint?

It is possible to register the root bridge with a trimmed IORESOURCE_BUS
so that you don't enumerate anything other than the root port.

> > I would like to get Bjorn's opinion on this, I don't like these "link is
> > up" checks in config accessors (they are racy and honestly it is a
> > run-time check that does not make much sense, either it is always
> > true/false or it is inevitably racy)
>
> It is runtime check, but does not have to be always true/false. I have
> tested more Compex wifi cards and under certain conditions they
> "disappear" from the bus during usage.

I would be very grateful if you could describe what happens in HW
when these conditions trigger - I would like to understand if this
issue is aardvark specific or it isn't.

> So I think it still make sense to do this "fast" check as it is only
> optimization.

I will merge this patch but I'd also like to understand the underlying
issue better.

Thanks,
Lorenzo
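The alternative mentioned above, trimming the bus range at probe time instead of checking the link in the config accessors, can be pictured with a small user-space model. This is only a sketch under stated assumptions: struct mock_bus_range, mock_trim_bus_range() and mock_bus_in_range() are hypothetical names that stand in for an IORESOURCE_BUS window, and no real kernel API is used or implied.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a bus-number window; NOT the kernel's struct resource. */
struct mock_bus_range {
	int start;
	int end;
};

/*
 * If the link did not come up at probe time, shrink the window so that it
 * contains only the root bus. Otherwise leave it untouched.
 */
static struct mock_bus_range mock_trim_bus_range(struct mock_bus_range full,
						 bool link_up_at_probe)
{
	struct mock_bus_range trimmed = full;

	if (!link_up_at_probe)
		trimmed.end = trimmed.start;

	return trimmed;
}

/* Would enumeration ever try this bus number? */
static bool mock_bus_in_range(struct mock_bus_range range, int bus)
{
	return bus >= range.start && bus <= range.end;
}
```

In this model the decision is taken once, at probe. That is why, as noted earlier in the thread, it would not help with a card that disappears from the bus at runtime, whereas the per-access link check does.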
* Re: [PATCH v3] PCI: aardvark: Don't touch PCIe registers if no card connected
From: Pali Rohár @ 2020-07-10 15:44 UTC
To: Lorenzo Pieralisi
Cc: Thomas Petazzoni, Andrew Murray, Bjorn Helgaas, Marek Behún,
	Remi Pommarel, Tomasz Maciej Nowak, Xogium, linux-pci,
	linux-arm-kernel, linux-kernel

On Friday 10 July 2020 10:18:00 Lorenzo Pieralisi wrote:
> I would be very grateful if you could describe what happens in HW
> when these conditions trigger - I would like to understand if this
> issue is aardvark specific or it isn't.

Hello Lorenzo!

We are not sure what the problem is or where it happens. There are
several issues which happen randomly or under some specific conditions.

I can reproduce the following issue: connect a Compex WLE900VX card and
configure aardvark to gen2 mode. The card is then detected only after
the first link training. If the kernel tries to retrain the link again
(e.g. via ASPM code), the card is not detected anymore. To detect it
again, the card must be reset via the PERST# signal (assert PERST#,
wait, de-assert PERST#). A PCI warm, hot or function reset does not
help. When aardvark is configured in gen1 mode, the card is detected
fine even after multiple link trainings.

The above problem does not happen with Compex WLE200VX (ath9k) or Compex
WLE1216V5-20 cards.

Sometimes the WLE900VX card disappears from the bus during usage. It
just stops communicating with the ath10k driver and aardvark does not
see the link.

Another issue happens for WLE900VX, WLE600VX and WLE1216V5-20 (but not
for WLE200VX): the Linux kernel can detect these cards only if it issues
a card reset via the PERST# signal and starts link training (via the
standard pcie endpoint register PCI_EXP_LNKCTL/PCI_EXP_LNKCTL_RL)
immediately after enabling link training in aardvark (via the
aardvark-specific LINK_TRAINING_EN bit). If there is e.g. a 100ms delay
between enabling link training and setting the PCI_EXP_LNKCTL_RL bit,
these cards are not detected. Issuing a reset via the PERST# signal is
also required to detect these cards if either the board was rebooted
(not started from a cold power-off state) or U-Boot
touched/initialized PCIe aardvark.

WLE200VX works fine even after a second or third link training, and
also works without needing a reset via the PERST# signal.

And the WLE900VX card is not detected even after resetting it via the
PERST# signal if aardvark link training (LINK_TRAINING_EN bit) was
enabled prior to toggling PERST#.

The PERST# signal is controlled via GPIO.

When I put the WLE900VX card into a board which uses the mvebu PCI
driver (not aardvark), the card works fine: there is no need to issue a
card reset via PERST#, no need to explicitly set gen mode, and the card
also works after more link trainings.

So basically I have no idea why it happens or where the problem is,
whether in aardvark, in the cards, or in both. As you can see, each of
the tested cards has a different set of problems.

Today I tested a card from a different vendor but with the same Qualcomm
chip as in the WLE900VX, and I observe the same behavior as with the
Compex WLE900VX. So it looks like the card vendor does not matter; what
matters is the wifi chip inside.

I read in the kernel bugzilla that WLE600VX and WLE900VX cards are buggy
and more people have problems with them. But I'm not observing the
issues described in the kernel bugzilla (like the card reporting an
incorrect PCI device ID).

If you have any idea how to debug these problems or where the problem
could be, please let me know.
* Re: [PATCH v3] PCI: aardvark: Don't touch PCIe registers if no card connected
  2020-07-10 15:44 ` Pali Rohár
@ 2020-07-10 16:08 ` Bjorn Helgaas
  2020-07-10 19:30   ` Pali Rohár
  0 siblings, 1 reply; 32+ messages in thread
From: Bjorn Helgaas @ 2020-07-10 16:08 UTC
To: Pali Rohár
Cc: Lorenzo Pieralisi, Thomas Petazzoni, Andrew Murray, Bjorn Helgaas, Marek Behún, Remi Pommarel, Tomasz Maciej Nowak, Xogium, linux-pci, linux-arm-kernel, linux-kernel

On Fri, Jul 10, 2020 at 05:44:58PM +0200, Pali Rohár wrote:
> I can reproduce following issue: Connect Compex WLE900VX card, configure
> aardvark to gen2 mode. And then card is detected only after the first
> link training. If kernel tries to retrain link again (e.g. via ASPM
> code) then card is not detected anymore.

Somebody should go over the ASPM retrain link code and the PCIe spec
with a fine-toothed comb. Maybe we're doing something wrong there.

Or maybe aardvark has some hardware issue and we need some sort of
quirk to work around it.

> Another issue which happens for WLE900VX, WLE600VX and WLE1216VS-20 (but
> not for WLE200VX): Linux kernel can detect these cards only if it issues
> card reset via PERST# signal and start link training (via standard pcie
> endpoint register PCI_EXP_LNKCTL/PCI_EXP_LNKCTL_RL)

I think you mean "downstream port" (not "endpoint") register?
PCI_EXP_LNKCTL_RL is only applicable to *downstream ports* (root ports
or switch downstream ports) and is reserved for endpoints.

> immediately after
> enable link training in aardvark (via aardvark specific LINK_TRAINING_EN
> bit). If there is e.g. 100ms delay between enabling link training and
> setting PCI_EXP_LNKCTL_RL bit then these cards are not detected.

This sounds problematic. Hardware should not be dependent on the
software being "fast enough". In general we should be able to insert
arbitrary delays at any point without breaking anything.

But I have the impression that aardvark requires more software
hand-holding than most hardware does. If it imposes timing
requirements on the software, that *should* be documented in the
aardvark spec.

> I read in kernel bugzilla that WLE600VX and WLE900VX cards are buggy and
> more people have problems with them. But issues described in kernel
> bugzilla (like card is reporting incorrect PCI device id) I'm not
> observing.

Pointer? Is the incorrect device ID 0xffff? That could be a symptom
of a PCIe error. If we read a device ID that's something other than
0, 0xffff, or the correct ID, that would be really weird. Even 0
would be really strange.

I suspect these wifi cards are a little special because they probably
play unusual games with power for airplane mode and the like.

Bjorn
* Re: [PATCH v3] PCI: aardvark: Don't touch PCIe registers if no card connected 2020-07-10 16:08 ` Bjorn Helgaas @ 2020-07-10 19:30 ` Pali Rohár 2020-07-10 20:08 ` Bjorn Helgaas 0 siblings, 1 reply; 32+ messages in thread From: Pali Rohár @ 2020-07-10 19:30 UTC (permalink / raw) To: Bjorn Helgaas Cc: Lorenzo Pieralisi, Thomas Petazzoni, Andrew Murray, Bjorn Helgaas, Marek Behún, Remi Pommarel, Tomasz Maciej Nowak, Xogium, linux-pci, linux-arm-kernel, linux-kernel On Friday 10 July 2020 11:08:28 Bjorn Helgaas wrote: > On Fri, Jul 10, 2020 at 05:44:58PM +0200, Pali Rohár wrote: > > I can reproduce following issue: Connect Compex WLE900VX card, configure > > aardvark to gen2 mode. And then card is detected only after the first > > link training. If kernel tries to retrain link again (e.g. via ASPM > > code) then card is not detected anymore. > > Somebody should go over the ASPM retrain link code and the PCIe spec > with a fine-toothed comb. Maybe we're doing something wrong there. I think this is not ASPM related as card simply disappear just after flipping PCI_EXP_LNKCTL_RL bit second time without changing ASPM bits. > Or maybe aardvark has some hardware issue and we need some sort of > quirk to work around it. It is possible that this is aardvark issue. As I said I really do not know. In aardvark driver there is already merged workaround for this issue: driver force gen1 aardvark mode for gen1 card. > > Another issue which happens for WLE900VX, WLE600VX and WLE1216VS-20 (but > > not for WLE200VX): Linux kernel can detect these cards only if it issues > > card reset via PERST# signal and start link training (via standard pcie > > endpoint register PCI_EXP_LNKCTL/PCI_EXP_LNKCTL_RL) > > I think you mean "downstream port" (not "endpoint") register? Yes. > PCI_EXP_LNKCTL_RL is only applicable to *downstream ports* (root ports > or switch downstream ports) and is reserved for endpoints. 
> > > immediately after > > enable link training in aardvark (via aardvark specific LINK_TRAINING_EN > > bit). If there is e.g. 100ms delay between enabling link training and > > setting PCI_EXP_LNKCTL_RL bit then these cards are not detected. > > This sounds problematic. Hardware should not be dependent on the > software being "fast enough". In general we should be able to insert > arbitrary delays at any point without breaking anything. Yes, it is problematic. For example following commit broke those cards: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=f4c7d053d7f77cd5c1a1ba7c7ce085ddba13d1d7 And this commit fixed it (just msleep was moved to different stage): https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=6964494582f56a3882c2c53b0edbfe99eb32b2e1 But we somehow need to deal with it until we find root cause. Basically additional sleep in aardvark init phase can break WLE900VX cards, but not WLE200VX. And because WLE900VX works fine with pci-mvebu and WLE200VX works fine with pci-aardvark we cannot deduce from it if problem for combination of WLE900VX and aardvark is in WLE900VX or in aardvark. > But I have the impression that aardvark requires more software > hand-holding that most hardware does. If it imposes timing > requirements on the software, that *should* be documented in the > aardvark spec. There is absolutely nothing regarding to timings in documentation which I saw. In documentation are just instructions/steps how to init PCI subsystem and it is basically advk_pcie_setup_hw() function. > > I read in kernel bugzilla that WLE600VX and WLE900VX cards are buggy and > > more people have problems with them. But issues described in kernel > > bugzilla (like card is reporting incorrect PCI device id) I'm not > > observing. > > Pointer? Hm... 
I cannot find right now pointer to bugzilla, but I have pointer to ath9k-devel mailing list with that incorrect device id: https://www.mail-archive.com/ath9k-devel@lists.ath9k.org/msg07529.html > Is the incorrect device ID 0xffff? No, incorrect device ID in that case is 0xabcd and vendor ID is correct (Qualcomm). > That could be a symptom > of a PCIe error. If we read a device ID that's something other than > 0, 0xffff, or the correct ID, that would be really weird. Even 0 > would be really strange. It is strange and also reason why discussion on that list is long. As I said, I'm not seeing that problem with wrong device ID. But I know people who are observing same problem on different boards (which do not use aardvark) as described in above mailing list thread with Compex ath10k cards. > I suspect these wifi cards are a little special because they probably > play unusual games with power for airplane mode and the like. This is another/different problem and is already "documented" in kernel bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=84821#c52 ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH v3] PCI: aardvark: Don't touch PCIe registers if no card connected 2020-07-10 19:30 ` Pali Rohár @ 2020-07-10 20:08 ` Bjorn Helgaas 0 siblings, 0 replies; 32+ messages in thread From: Bjorn Helgaas @ 2020-07-10 20:08 UTC (permalink / raw) To: Pali Rohár Cc: Lorenzo Pieralisi, Thomas Petazzoni, Andrew Murray, Bjorn Helgaas, Marek Behún, Remi Pommarel, Tomasz Maciej Nowak, Xogium, linux-pci, linux-arm-kernel, linux-kernel On Fri, Jul 10, 2020 at 09:30:03PM +0200, Pali Rohár wrote: > On Friday 10 July 2020 11:08:28 Bjorn Helgaas wrote: > > On Fri, Jul 10, 2020 at 05:44:58PM +0200, Pali Rohár wrote: > > > I can reproduce following issue: Connect Compex WLE900VX card, configure > > > aardvark to gen2 mode. And then card is detected only after the first > > > link training. If kernel tries to retrain link again (e.g. via ASPM > > > code) then card is not detected anymore. > > > > Somebody should go over the ASPM retrain link code and the PCIe spec > > with a fine-toothed comb. Maybe we're doing something wrong there. > > I think this is not ASPM related as card simply disappear just after > flipping PCI_EXP_LNKCTL_RL bit second time without changing ASPM bits. Right. The retrain code in aspm.c doesn't really have anything in particular to do with ASPM and it should probably be moved elsewhere. So I think the problem may be related to retrain and the delays after it in general, not to ASPM. > There is absolutely nothing regarding to timings in documentation which > I saw. In documentation are just instructions/steps how to init PCI > subsystem and it is basically advk_pcie_setup_hw() function. > > > > I read in kernel bugzilla that WLE600VX and WLE900VX cards are buggy and > > > more people have problems with them. But issues described in kernel > > > bugzilla (like card is reporting incorrect PCI device id) I'm not > > > observing. > > Hm... 
I cannot find right now pointer to bugzilla, but I have pointer to > ath9k-devel mailing list with that incorrect device id: > > https://www.mail-archive.com/ath9k-devel@lists.ath9k.org/msg07529.html > > > Is the incorrect device ID 0xffff? > > No, incorrect device ID in that case is 0xabcd and vendor ID is correct > (Qualcomm). From a quick look at that thread, it sounds like the device isn't quite ready yet. In that case, it's supposed to respond with Config Request Retry Status, and Linux is supposed to wait longer and retry. But I don't think Linux does that quite correctly, so it could be either a hardware problem or Linux being broken. But I guess that's not the current problem so I don't want to go down that rathole right now. ^ permalink raw reply [flat|nested] 32+ messages in thread
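The retry behaviour Bjorn describes for Config Request Retry Status can be sketched as a small userspace model. This is only an illustration, not the kernel's code: `model_read_id_fn`, `model_fake_dev`, `model_fake_read` and `model_read_vendor_id` are hypothetical names invented here. It assumes CRS Software Visibility is enabled, in which case a Vendor ID read that hits a CRS completion returns the reserved vendor ID 0x0001, and it treats the usual all-ones/all-zeros patterns as "no device answered".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a config accessor: returns the 32-bit
 * Vendor ID / Device ID dword of one function. Not a kernel API. */
typedef uint32_t (*model_read_id_fn)(void *ctx);

/* Hypothetical fake device: answers with CRS a few times, then its ID. */
struct model_fake_dev {
	unsigned int crs_left;	/* number of reads that still return CRS */
	uint32_t id;		/* value returned once the device is ready */
};

static uint32_t model_fake_read(void *ctx)
{
	struct model_fake_dev *dev = ctx;

	if (dev->crs_left > 0) {
		dev->crs_left--;
		return 0xffff0001;	/* vendor 0x0001: "not ready, retry" */
	}
	return dev->id;
}

/* Vendor ID 0x0001 is the reserved value reported for a CRS completion. */
static bool model_id_is_crs(uint32_t id)
{
	return (id & 0xffff) == 0x0001;
}

/* All-ones or all-zeros halves mean nothing usable answered. */
static bool model_id_is_absent(uint32_t id)
{
	return id == 0xffffffff || id == 0x00000000 ||
	       id == 0x0000ffff || id == 0xffff0000;
}

/* Returns true and stores the ID if a device answered within max_tries. */
static bool model_read_vendor_id(model_read_id_fn read_id, void *ctx,
				 unsigned int max_tries, uint32_t *id_out)
{
	assert(read_id != NULL && id_out != NULL);

	for (unsigned int i = 0; i < max_tries; i++) {
		uint32_t id = read_id(ctx);

		if (model_id_is_absent(id))
			return false;
		if (!model_id_is_crs(id)) {
			*id_out = id;
			return true;
		}
		/* CRS seen: a real implementation would sleep here,
		 * typically with an increasing delay, before retrying. */
	}
	return false;
}
```

If the retry budget is too small, a slow but healthy device is reported as missing, which is one way a "not quite ready" card could look broken to software.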
* Re: [PATCH v3] PCI: aardvark: Don't touch PCIe registers if no card connected 2020-07-10 9:18 ` Lorenzo Pieralisi 2020-07-10 15:44 ` Pali Rohár @ 2020-07-13 8:27 ` Pali Rohár 2020-07-13 11:23 ` Lorenzo Pieralisi 1 sibling, 1 reply; 32+ messages in thread From: Pali Rohár @ 2020-07-13 8:27 UTC (permalink / raw) To: Lorenzo Pieralisi Cc: Thomas Petazzoni, Andrew Murray, Bjorn Helgaas, Marek Behún, Remi Pommarel, Tomasz Maciej Nowak, Xogium, linux-pci, linux-arm-kernel, linux-kernel On Friday 10 July 2020 10:18:00 Lorenzo Pieralisi wrote: > On Thu, Jul 09, 2020 at 05:09:59PM +0200, Pali Rohár wrote: > > > I understand that but the bridge bus resource can be trimmed to just > > > contain the root bus because that's the only one where there is a > > > chance you can enumerate a device. > > > > It is possible to register only root bridge without endpoint? > > It is possible to register the root bridge with a trimmed IORESOURCE_BUS > so that you don't enumerate anything other than the root port. Hello Lorenzo! I really do not know how to achieve it. From code it looks like that pci/probe.c scans child buses unconditionally. pci-aardvark.c calls pci_host_probe() which calls functions pci_scan_root_bus_bridge() which calls pci_scan_child_bus() which calls pci_scan_child_bus_extend() which calls pci_scan_bridge_extend() (bridge needs to be reconfigured) which then try to probe child bus via pci_scan_child_bus_extend() because bridge is not card bus. In function pci_scan_bridge_extend() I do not see a way how to skip probing for child buses which would avoid enumerating aardvark root bridge when PCIe device is not connected. 
dmesg output contains: advk-pcie d0070000.pcie: link never came up advk-pcie d0070000.pcie: PCI host bridge to bus 0000:00 pci_bus 0000:00: root bus resource [bus 00-ff] pci_bus 0000:00: root bus resource [mem 0xe8000000-0xe8ffffff] pci_bus 0000:00: root bus resource [io 0x0000-0xffff] (bus address [0xe9000000-0xe900ffff]) pci_bus 0000:00: scanning bus pci 0000:00:00.0: [1b4b:0100] type 01 class 0x060400 pci 0000:00:00.0: reg 0x38: [mem 0x00000000-0x000007ff pref] pci_bus 0000:00: fixups for bus pci 0000:00:00.0: scanning [bus 00-00] behind bridge, pass 0 pci 0000:00:00.0: bridge configuration invalid ([bus 00-00]), reconfiguring pci 0000:00:00.0: scanning [bus 00-00] behind bridge, pass 1 pci_bus 0000:01: scanning bus advk-pcie d0070000.pcie: advk_pcie_valid_device ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH v3] PCI: aardvark: Don't touch PCIe registers if no card connected 2020-07-13 8:27 ` Pali Rohár @ 2020-07-13 11:23 ` Lorenzo Pieralisi 2020-07-13 14:50 ` Pali Rohár 2020-07-15 12:17 ` Pali Rohár 0 siblings, 2 replies; 32+ messages in thread From: Lorenzo Pieralisi @ 2020-07-13 11:23 UTC (permalink / raw) To: Pali Rohár Cc: Thomas Petazzoni, Andrew Murray, Bjorn Helgaas, Marek Behún, Remi Pommarel, Tomasz Maciej Nowak, Xogium, linux-pci, linux-arm-kernel, linux-kernel On Mon, Jul 13, 2020 at 10:27:47AM +0200, Pali Rohár wrote: > On Friday 10 July 2020 10:18:00 Lorenzo Pieralisi wrote: > > On Thu, Jul 09, 2020 at 05:09:59PM +0200, Pali Rohár wrote: > > > > I understand that but the bridge bus resource can be trimmed to just > > > > contain the root bus because that's the only one where there is a > > > > chance you can enumerate a device. > > > > > > It is possible to register only root bridge without endpoint? > > > > It is possible to register the root bridge with a trimmed IORESOURCE_BUS > > so that you don't enumerate anything other than the root port. > > Hello Lorenzo! I really do not know how to achieve it. From code it > looks like that pci/probe.c scans child buses unconditionally. > > pci-aardvark.c calls pci_host_probe() which calls functions > pci_scan_root_bus_bridge() which calls pci_scan_child_bus() which calls > pci_scan_child_bus_extend() which calls pci_scan_bridge_extend() (bridge > needs to be reconfigured) which then try to probe child bus via > pci_scan_child_bus_extend() because bridge is not card bus. > > In function pci_scan_bridge_extend() I do not see a way how to skip > probing for child buses which would avoid enumerating aardvark root > bridge when PCIe device is not connected. 
> > dmesg output contains: > > advk-pcie d0070000.pcie: link never came up > advk-pcie d0070000.pcie: PCI host bridge to bus 0000:00 > pci_bus 0000:00: root bus resource [bus 00-ff] This resource can be limited to the root bus number only before calling pci_host_probe() (ie see pci_parse_request_of_pci_ranges() and code in pci_scan_bridge_extend() that programs primary/secondary/subordinate busses) but I think that only papers over the issue, it does not fix it. I will go over the thread again but I suspect I can merge the patch even though I still believe there is work to be done to understand the issue we are facing. Lorenzo > pci_bus 0000:00: root bus resource [mem 0xe8000000-0xe8ffffff] > pci_bus 0000:00: root bus resource [io 0x0000-0xffff] (bus address [0xe9000000-0xe900ffff]) > pci_bus 0000:00: scanning bus > pci 0000:00:00.0: [1b4b:0100] type 01 class 0x060400 > pci 0000:00:00.0: reg 0x38: [mem 0x00000000-0x000007ff pref] > pci_bus 0000:00: fixups for bus > pci 0000:00:00.0: scanning [bus 00-00] behind bridge, pass 0 > pci 0000:00:00.0: bridge configuration invalid ([bus 00-00]), reconfiguring > pci 0000:00:00.0: scanning [bus 00-00] behind bridge, pass 1 > pci_bus 0000:01: scanning bus > advk-pcie d0070000.pcie: advk_pcie_valid_device ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH v3] PCI: aardvark: Don't touch PCIe registers if no card connected 2020-07-13 11:23 ` Lorenzo Pieralisi @ 2020-07-13 14:50 ` Pali Rohár 2020-07-13 16:41 ` Lorenzo Pieralisi 2020-07-15 12:17 ` Pali Rohár 1 sibling, 1 reply; 32+ messages in thread From: Pali Rohár @ 2020-07-13 14:50 UTC (permalink / raw) To: Lorenzo Pieralisi Cc: Thomas Petazzoni, Andrew Murray, Bjorn Helgaas, Marek Behún, Remi Pommarel, Tomasz Maciej Nowak, Xogium, linux-pci, linux-arm-kernel, linux-kernel On Monday 13 July 2020 12:23:25 Lorenzo Pieralisi wrote: > I will go over the thread again but I suspect I can merge the patch even > though I still believe there is work to be done to understand the issue > we are facing. Just to note that pci-mvebu.c also checks if pcie link is up before trying to access the real PCIe interface registers, similarly as in my patch. ^ permalink raw reply [flat|nested] 32+ messages in thread
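The check under discussion can be modelled outside the kernel as follows. This is a minimal sketch that mirrors the logic of the patch at the start of the thread; `struct model_advk` and `model_valid_device` are hypothetical names, and the `link_up` field stands in for whatever the real driver reads from hardware in advk_pcie_link_up(). It does not address the race Lorenzo points out: the link state can change between the check and the access.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Same bit layout as the kernel's PCI_SLOT(): device number in bits 7:3. */
#define MODEL_PCI_SLOT(devfn)	(((devfn) >> 3) & 0x1f)

/* Hypothetical, reduced view of the controller state. */
struct model_advk {
	uint8_t root_bus_nr;	/* bus number of the emulated root bridge */
	bool link_up;		/* snapshot of the PCIe link state */
};

/* Decide whether a config access should be attempted at all. */
static bool model_valid_device(const struct model_advk *pcie,
			       uint8_t busnr, unsigned int devfn)
{
	assert(pcie != NULL);

	/* Only device 0 exists on the root bus (the root port itself). */
	if (busnr == pcie->root_bus_nr && MODEL_PCI_SLOT(devfn) != 0)
		return false;

	/* Anything behind the root port needs the link to be up. */
	if (busnr != pcie->root_bus_nr && !pcie->link_up)
		return false;

	return true;
}
```

With the link down, only the root port on the root bus remains accessible, so config reads for bus 1 return "not found" immediately instead of timing out.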
* Re: [PATCH v3] PCI: aardvark: Don't touch PCIe registers if no card connected
  2020-07-13 14:50 ` Pali Rohár
@ 2020-07-13 16:41 ` Lorenzo Pieralisi
  2020-07-14  7:38   ` Pali Rohár
  0 siblings, 1 reply; 32+ messages in thread
From: Lorenzo Pieralisi @ 2020-07-13 16:41 UTC
To: Pali Rohár
Cc: Thomas Petazzoni, Andrew Murray, Bjorn Helgaas, Marek Behún, Remi Pommarel, Tomasz Maciej Nowak, Xogium, linux-pci, linux-arm-kernel, linux-kernel

On Mon, Jul 13, 2020 at 04:50:03PM +0200, Pali Rohár wrote:
> On Monday 13 July 2020 12:23:25 Lorenzo Pieralisi wrote:
> > I will go over the thread again but I suspect I can merge the patch even
> > though I still believe there is work to be done to understand the issue
> > we are facing.
>
> Just to note that pci-mvebu.c also checks if pcie link is up before
> trying to access the real PCIe interface registers, similarly as in my
> patch.

I understand - that does not change my opinion though, the link check
is just a workaround, it'd be best if we pinpoint the real issue which
is likely to be a HW one.

Lorenzo
* Re: [PATCH v3] PCI: aardvark: Don't touch PCIe registers if no card connected
  2020-07-13 16:41 ` Lorenzo Pieralisi
@ 2020-07-14  7:38 ` Pali Rohár
  0 siblings, 0 replies; 32+ messages in thread
From: Pali Rohár @ 2020-07-14 7:38 UTC
To: Lorenzo Pieralisi
Cc: Thomas Petazzoni, Andrew Murray, Bjorn Helgaas, Marek Behún, Remi Pommarel, Tomasz Maciej Nowak, Xogium, linux-pci, linux-arm-kernel, linux-kernel

On Monday 13 July 2020 17:41:40 Lorenzo Pieralisi wrote:
> On Mon, Jul 13, 2020 at 04:50:03PM +0200, Pali Rohár wrote:
> > On Monday 13 July 2020 12:23:25 Lorenzo Pieralisi wrote:
> > > I will go over the thread again but I suspect I can merge the patch even
> > > though I still believe there is work to be done to understand the issue
> > > we are facing.
> >
> > Just to note that pci-mvebu.c also checks if pcie link is up before
> > trying to access the real PCIe interface registers, similarly as in my
> > patch.
>
> I understand - that does not change my opinion though, the link check
> is just a workaround, it'd be best if we pinpoint the real issue which
> is likely to a HW one.

Lorenzo, if you have an idea how to debug this issue or if you would
like to see some test results, let me know. I can do some tests, but
currently I really do not know more than what I wrote in previous
emails.

In my opinion, the problem is in the HW, and Marvell has neither
documented it nor confirmed that it exists. The other option is that
the problem is in the Compex card and can be triggered only by Marvell
aardvark HW.
* Re: [PATCH v3] PCI: aardvark: Don't touch PCIe registers if no card connected 2020-07-13 11:23 ` Lorenzo Pieralisi 2020-07-13 14:50 ` Pali Rohár @ 2020-07-15 12:17 ` Pali Rohár 2020-07-15 16:21 ` Lorenzo Pieralisi 1 sibling, 1 reply; 32+ messages in thread From: Pali Rohár @ 2020-07-15 12:17 UTC (permalink / raw) To: Lorenzo Pieralisi Cc: Thomas Petazzoni, Andrew Murray, Bjorn Helgaas, Marek Behún, Remi Pommarel, Tomasz Maciej Nowak, Xogium, linux-pci, linux-arm-kernel, linux-kernel On Monday 13 July 2020 12:23:25 Lorenzo Pieralisi wrote: > On Mon, Jul 13, 2020 at 10:27:47AM +0200, Pali Rohár wrote: > > On Friday 10 July 2020 10:18:00 Lorenzo Pieralisi wrote: > > > On Thu, Jul 09, 2020 at 05:09:59PM +0200, Pali Rohár wrote: > > > > > I understand that but the bridge bus resource can be trimmed to just > > > > > contain the root bus because that's the only one where there is a > > > > > chance you can enumerate a device. > > > > > > > > It is possible to register only root bridge without endpoint? > > > > > > It is possible to register the root bridge with a trimmed IORESOURCE_BUS > > > so that you don't enumerate anything other than the root port. > > > > Hello Lorenzo! I really do not know how to achieve it. From code it > > looks like that pci/probe.c scans child buses unconditionally. > > > > pci-aardvark.c calls pci_host_probe() which calls functions > > pci_scan_root_bus_bridge() which calls pci_scan_child_bus() which calls > > pci_scan_child_bus_extend() which calls pci_scan_bridge_extend() (bridge > > needs to be reconfigured) which then try to probe child bus via > > pci_scan_child_bus_extend() because bridge is not card bus. > > > > In function pci_scan_bridge_extend() I do not see a way how to skip > > probing for child buses which would avoid enumerating aardvark root > > bridge when PCIe device is not connected. 
> > > > dmesg output contains: > > > > advk-pcie d0070000.pcie: link never came up > > advk-pcie d0070000.pcie: PCI host bridge to bus 0000:00 > > pci_bus 0000:00: root bus resource [bus 00-ff] > > This resource can be limited to the root bus number only before calling > pci_host_probe() (ie see pci_parse_request_of_pci_ranges() and code in > pci_scan_bridge_extend() that programs primary/secondary/subordinate > busses) but I think that only papers over the issue, it does not fix it. I looked at the code in pci/probe.c again and I do not think it is possible to avoid scanning devices. pci_scan_child_bus_extend() is unconditionally calling pci_scan_slot() for devfn=0 as the first thing. And this function unconditionally calls pci_scan_device() which is directly trying to read vendor id from config register. So for me it looks like that kernel expects that can read vendor id and device id from config register for device which is not connected. And trying to read config register would cause those timeouts in aardvark. ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH v3] PCI: aardvark: Don't touch PCIe registers if no card connected 2020-07-15 12:17 ` Pali Rohár @ 2020-07-15 16:21 ` Lorenzo Pieralisi 2020-07-21 8:57 ` Pali Rohár 0 siblings, 1 reply; 32+ messages in thread From: Lorenzo Pieralisi @ 2020-07-15 16:21 UTC (permalink / raw) To: Pali Rohár Cc: Thomas Petazzoni, Andrew Murray, Bjorn Helgaas, Marek Behún, Remi Pommarel, Tomasz Maciej Nowak, Xogium, linux-pci, linux-arm-kernel, linux-kernel On Wed, Jul 15, 2020 at 02:17:26PM +0200, Pali Rohár wrote: > On Monday 13 July 2020 12:23:25 Lorenzo Pieralisi wrote: > > On Mon, Jul 13, 2020 at 10:27:47AM +0200, Pali Rohár wrote: > > > On Friday 10 July 2020 10:18:00 Lorenzo Pieralisi wrote: > > > > On Thu, Jul 09, 2020 at 05:09:59PM +0200, Pali Rohár wrote: > > > > > > I understand that but the bridge bus resource can be trimmed to just > > > > > > contain the root bus because that's the only one where there is a > > > > > > chance you can enumerate a device. > > > > > > > > > > It is possible to register only root bridge without endpoint? > > > > > > > > It is possible to register the root bridge with a trimmed IORESOURCE_BUS > > > > so that you don't enumerate anything other than the root port. > > > > > > Hello Lorenzo! I really do not know how to achieve it. From code it > > > looks like that pci/probe.c scans child buses unconditionally. > > > > > > pci-aardvark.c calls pci_host_probe() which calls functions > > > pci_scan_root_bus_bridge() which calls pci_scan_child_bus() which calls > > > pci_scan_child_bus_extend() which calls pci_scan_bridge_extend() (bridge > > > needs to be reconfigured) which then try to probe child bus via > > > pci_scan_child_bus_extend() because bridge is not card bus. > > > > > > In function pci_scan_bridge_extend() I do not see a way how to skip > > > probing for child buses which would avoid enumerating aardvark root > > > bridge when PCIe device is not connected. 
> > > > > > dmesg output contains: > > > > > > advk-pcie d0070000.pcie: link never came up > > > advk-pcie d0070000.pcie: PCI host bridge to bus 0000:00 > > > pci_bus 0000:00: root bus resource [bus 00-ff] > > > > This resource can be limited to the root bus number only before calling > > pci_host_probe() (ie see pci_parse_request_of_pci_ranges() and code in > > pci_scan_bridge_extend() that programs primary/secondary/subordinate > > busses) but I think that only papers over the issue, it does not fix it. > > I looked at the code in pci/probe.c again and I do not think it is > possible to avoid scanning devices. pci_scan_child_bus_extend() is > unconditionally calling pci_scan_slot() for devfn=0 as the first thing. > And this function unconditionally calls pci_scan_device() which is > directly trying to read vendor id from config register. > > So for me it looks like that kernel expects that can read vendor id and > device id from config register for device which is not connected. Not if it is connected to a bus that the root port does not decode, that's what I am saying. > And trying to read config register would cause those timeouts in > aardvark. The root port (which effectively works as PCI bridge from this standpoint) does not issue config cycles for busses that aren't within its decoded bus range, which in turn is determined by the firmware IORESOURCE_BUS resource. This issue is caused by devices that are connected downstream to the root port. Anyway - patch merged but I would be happy to keep this discussion going, somehow. If the LPC20 VFIO/IOMMU/PCI microconference is approved it can be a good venue for this to happen. Lorenzo ^ permalink raw reply [flat|nested] 32+ messages in thread
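The general bridge rule behind Lorenzo's point can be sketched as a small model. This is an illustration under stated assumptions, not aardvark or kernel code: `struct model_bridge`, `enum model_cfg_route` and `model_bridge_route` are hypothetical names, and the model only covers the standard PCI-to-PCI bridge decode of the secondary/subordinate bus number registers. How the aardvark root port actually behaves is exactly what the thread leaves open.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical view of a bridge's bus number registers. */
struct model_bridge {
	uint8_t primary;	/* bus on the upstream side */
	uint8_t secondary;	/* first bus on the downstream side */
	uint8_t subordinate;	/* highest bus reachable downstream */
};

enum model_cfg_route {
	MODEL_CFG_NOT_FORWARDED,	/* bus outside the decoded range */
	MODEL_CFG_TYPE0,		/* target is on the secondary bus */
	MODEL_CFG_TYPE1			/* target is further downstream */
};

/* Decide what a bridge does with a config request for a given bus. */
static enum model_cfg_route model_bridge_route(const struct model_bridge *br,
						uint8_t bus)
{
	assert(br != NULL);

	if (bus < br->secondary || bus > br->subordinate)
		return MODEL_CFG_NOT_FORWARDED;
	if (bus == br->secondary)
		return MODEL_CFG_TYPE0;
	return MODEL_CFG_TYPE1;
}
```

In this model, a request for a bus outside the decoded range never reaches the link, so it cannot time out waiting for a device that is not there.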
* Re: [PATCH v3] PCI: aardvark: Don't touch PCIe registers if no card connected 2020-07-15 16:21 ` Lorenzo Pieralisi @ 2020-07-21 8:57 ` Pali Rohár 2020-07-21 10:48 ` Lorenzo Pieralisi 0 siblings, 1 reply; 32+ messages in thread From: Pali Rohár @ 2020-07-21 8:57 UTC (permalink / raw) To: Lorenzo Pieralisi Cc: Thomas Petazzoni, Andrew Murray, Bjorn Helgaas, Marek Behún, Remi Pommarel, Tomasz Maciej Nowak, Xogium, linux-pci, linux-arm-kernel, linux-kernel On Wednesday 15 July 2020 17:21:08 Lorenzo Pieralisi wrote: > On Wed, Jul 15, 2020 at 02:17:26PM +0200, Pali Rohár wrote: > > On Monday 13 July 2020 12:23:25 Lorenzo Pieralisi wrote: > > > On Mon, Jul 13, 2020 at 10:27:47AM +0200, Pali Rohár wrote: > > > > On Friday 10 July 2020 10:18:00 Lorenzo Pieralisi wrote: > > > > > On Thu, Jul 09, 2020 at 05:09:59PM +0200, Pali Rohár wrote: > > > > > > > I understand that but the bridge bus resource can be trimmed to just > > > > > > > contain the root bus because that's the only one where there is a > > > > > > > chance you can enumerate a device. > > > > > > > > > > > > It is possible to register only root bridge without endpoint? > > > > > > > > > > It is possible to register the root bridge with a trimmed IORESOURCE_BUS > > > > > so that you don't enumerate anything other than the root port. > > > > > > > > Hello Lorenzo! I really do not know how to achieve it. From code it > > > > looks like that pci/probe.c scans child buses unconditionally. > > > > > > > > pci-aardvark.c calls pci_host_probe() which calls functions > > > > pci_scan_root_bus_bridge() which calls pci_scan_child_bus() which calls > > > > pci_scan_child_bus_extend() which calls pci_scan_bridge_extend() (bridge > > > > needs to be reconfigured) which then try to probe child bus via > > > > pci_scan_child_bus_extend() because bridge is not card bus. 
> > > > > > > > In function pci_scan_bridge_extend() I do not see a way how to skip > > > > probing for child buses which would avoid enumerating aardvark root > > > > bridge when PCIe device is not connected. > > > > > > > > dmesg output contains: > > > > > > > > advk-pcie d0070000.pcie: link never came up > > > > advk-pcie d0070000.pcie: PCI host bridge to bus 0000:00 > > > > pci_bus 0000:00: root bus resource [bus 00-ff] > > > > > > This resource can be limited to the root bus number only before calling > > > pci_host_probe() (ie see pci_parse_request_of_pci_ranges() and code in > > > pci_scan_bridge_extend() that programs primary/secondary/subordinate > > > busses) but I think that only papers over the issue, it does not fix it. > > > > I looked at the code in pci/probe.c again and I do not think it is > > possible to avoid scanning devices. pci_scan_child_bus_extend() is > > unconditionally calling pci_scan_slot() for devfn=0 as the first thing. > > And this function unconditionally calls pci_scan_device() which is > > directly trying to read vendor id from config register. > > > > So for me it looks like that kernel expects that can read vendor id and > > device id from config register for device which is not connected. > > Not if it is connected to a bus that the root port does not decode, > that's what I am saying. > > > And trying to read config register would cause those timeouts in > > aardvark. > > The root port (which effectively works as PCI bridge from this > standpoint) does not issue config cycles for busses that aren't within > its decoded bus range, which in turn is determined by the firmware > IORESOURCE_BUS resource. > > This issue is caused by devices that are connected downstream to > the root port. > > Anyway - patch merged Could you send me a link to git commit? I have looked into lpieralisi/pci.git repository, but I do not see it here. > but I would be happy to keep this discussion going, somehow. Ok, no problem. 
As I said if anybody has any idea or would like to see some tests from me, I can do it and provide results. > If the LPC20 VFIO/IOMMU/PCI microconference is approved it can be a > good venue for this to happen. > > Lorenzo ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH v3] PCI: aardvark: Don't touch PCIe registers if no card connected
  2020-07-21  8:57 ` Pali Rohár
@ 2020-07-21 10:48 ` Lorenzo Pieralisi
  0 siblings, 0 replies; 32+ messages in thread
From: Lorenzo Pieralisi @ 2020-07-21 10:48 UTC
To: Pali Rohár
Cc: Thomas Petazzoni, Andrew Murray, Bjorn Helgaas, Marek Behún,
    Remi Pommarel, Tomasz Maciej Nowak, Xogium, linux-pci,
    linux-arm-kernel, linux-kernel

On Tue, Jul 21, 2020 at 10:57:13AM +0200, Pali Rohár wrote:
> On Wednesday 15 July 2020 17:21:08 Lorenzo Pieralisi wrote:
> > On Wed, Jul 15, 2020 at 02:17:26PM +0200, Pali Rohár wrote:
> > > On Monday 13 July 2020 12:23:25 Lorenzo Pieralisi wrote:
> > > > On Mon, Jul 13, 2020 at 10:27:47AM +0200, Pali Rohár wrote:
> > > > > On Friday 10 July 2020 10:18:00 Lorenzo Pieralisi wrote:
> > > > > > On Thu, Jul 09, 2020 at 05:09:59PM +0200, Pali Rohár wrote:
> > > > > > > > I understand that but the bridge bus resource can be trimmed to just
> > > > > > > > contain the root bus because that's the only one where there is a
> > > > > > > > chance you can enumerate a device.
> > > > > > >
> > > > > > > It is possible to register only root bridge without endpoint?
> > > > > >
> > > > > > It is possible to register the root bridge with a trimmed IORESOURCE_BUS
> > > > > > so that you don't enumerate anything other than the root port.
> > > > >
> > > > > Hello Lorenzo! I really do not know how to achieve it. From code it
> > > > > looks like that pci/probe.c scans child buses unconditionally.
> > > > >
> > > > > pci-aardvark.c calls pci_host_probe() which calls functions
> > > > > pci_scan_root_bus_bridge() which calls pci_scan_child_bus() which calls
> > > > > pci_scan_child_bus_extend() which calls pci_scan_bridge_extend() (bridge
> > > > > needs to be reconfigured) which then try to probe child bus via
> > > > > pci_scan_child_bus_extend() because bridge is not card bus.
> > > > >
> > > > > In function pci_scan_bridge_extend() I do not see a way how to skip
> > > > > probing for child buses which would avoid enumerating aardvark root
> > > > > bridge when PCIe device is not connected.
> > > > >
> > > > > dmesg output contains:
> > > > >
> > > > > advk-pcie d0070000.pcie: link never came up
> > > > > advk-pcie d0070000.pcie: PCI host bridge to bus 0000:00
> > > > > pci_bus 0000:00: root bus resource [bus 00-ff]
> > > >
> > > > This resource can be limited to the root bus number only before calling
> > > > pci_host_probe() (ie see pci_parse_request_of_pci_ranges() and code in
> > > > pci_scan_bridge_extend() that programs primary/secondary/subordinate
> > > > busses) but I think that only papers over the issue, it does not fix it.
> > >
> > > I looked at the code in pci/probe.c again and I do not think it is
> > > possible to avoid scanning devices. pci_scan_child_bus_extend() is
> > > unconditionally calling pci_scan_slot() for devfn=0 as the first thing.
> > > And this function unconditionally calls pci_scan_device() which is
> > > directly trying to read vendor id from config register.
> > >
> > > So for me it looks like that kernel expects that can read vendor id and
> > > device id from config register for device which is not connected.
> >
> > Not if it is connected to a bus that the root port does not decode,
> > that's what I am saying.
> >
> > > And trying to read config register would cause those timeouts in
> > > aardvark.
> >
> > The root port (which effectively works as PCI bridge from this
> > standpoint) does not issue config cycles for busses that aren't within
> > its decoded bus range, which in turn is determined by the firmware
> > IORESOURCE_BUS resource.
> >
> > This issue is caused by devices that are connected downstream to
> > the root port.
> >
> > Anyway - patch merged
>
> Could you send me a link to git commit? I have looked into
> lpieralisi/pci.git repository, but I do not see it here.

Apologies - I did not push it out, I have pushed it out on
pci/aardvark now.

> > but I would be happy to keep this discussion going, somehow.
>
> Ok, no problem. As I said if anybody has any idea or would like to see
> some tests from me, I can do it and provide results.

Sounds good, I will let you know, thanks.

Lorenzo

> > If the LPC20 VFIO/IOMMU/PCI microconference is approved it can be a
> > good venue for this to happen.
> >
> > Lorenzo

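Note on the two approaches discussed in this thread: the posted patch rejects config accesses behind the root port while the link is down, whereas the alternative Lorenzo describes trims the bus resource so that child buses are outside the decoded range in the first place. The difference can be modelled in a few lines of userspace C. This is only a sketch under stated assumptions: `struct model_pcie`, `model_valid_device()` and `model_in_bus_range()` are hypothetical names made up for illustration, not kernel APIs. Only the two conditions inside `model_valid_device()` mirror the checks shown in the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for driver state; NOT the kernel's struct advk_pcie. */
struct model_pcie {
	uint8_t root_bus_nr;  /* bus number of the emulated root bridge */
	uint8_t last_bus_nr;  /* highest bus in the (possibly trimmed) bus resource */
	bool link_up;         /* what a link-state query would report */
};

/* Same bit extraction as the kernel's PCI_SLOT(): bits 7:3 of devfn. */
#define MODEL_PCI_SLOT(devfn) (((devfn) >> 3) & 0x1f)

/* Mirrors the two conditions from the posted patch. */
static bool model_valid_device(const struct model_pcie *pcie,
			       uint8_t bus, uint8_t devfn)
{
	/* Only slot 0 is accepted on the root bus. */
	if (bus == pcie->root_bus_nr && MODEL_PCI_SLOT(devfn) != 0)
		return false;

	/* Nothing behind the root port is reachable while the link is down. */
	if (bus != pcie->root_bus_nr && !pcie->link_up)
		return false;

	return true;
}

/*
 * Models the alternative: a bus outside the decoded range
 * [root_bus_nr, last_bus_nr] never sees a config cycle at all.
 */
static bool model_in_bus_range(const struct model_pcie *pcie, uint8_t bus)
{
	assert(pcie->root_bus_nr <= pcie->last_bus_nr);
	return bus >= pcie->root_bus_nr && bus <= pcie->last_bus_nr;
}
```

With `last_bus_nr` trimmed to the root bus number, `model_in_bus_range()` rejects bus 1 regardless of link state, while `model_valid_device()` rejects it only as long as `link_up` is false.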
end of thread, other threads:[~2020-07-21 10:48 UTC | newest]

Thread overview: 32+ messages
2020-05-28 14:31 [PATCH] PCI: aardvark: Don't touch PCIe registers if no card connected Pali Rohár
2020-05-28 16:26 ` Bjorn Helgaas
2020-05-28 16:38 ` Pali Rohár
2020-05-28 16:49 ` Bjorn Helgaas
2020-05-29 8:30 ` Pali Rohár
2020-06-30 12:31 ` Pali Rohár
2020-06-30 13:51 ` Bjorn Helgaas
2020-06-30 14:04 ` Pali Rohár
2020-06-30 14:58 ` Bjorn Helgaas
2020-07-01 8:08 ` Pali Rohár
2020-07-01 8:20 ` [PATCH v2] " Pali Rohár
2020-07-01 21:34 ` Bjorn Helgaas
2020-07-02 8:23 ` Pali Rohár
2020-07-02 8:30 ` [PATCH v3] " Pali Rohár
2020-07-09 11:35 ` Lorenzo Pieralisi
2020-07-09 12:22 ` Pali Rohár
2020-07-09 14:47 ` Lorenzo Pieralisi
2020-07-09 15:09 ` Pali Rohár
2020-07-10 9:18 ` Lorenzo Pieralisi
2020-07-10 15:44 ` Pali Rohár
2020-07-10 16:08 ` Bjorn Helgaas
2020-07-10 19:30 ` Pali Rohár
2020-07-10 20:08 ` Bjorn Helgaas
2020-07-13 8:27 ` Pali Rohár
2020-07-13 11:23 ` Lorenzo Pieralisi
2020-07-13 14:50 ` Pali Rohár
2020-07-13 16:41 ` Lorenzo Pieralisi
2020-07-14 7:38 ` Pali Rohár
2020-07-15 12:17 ` Pali Rohár
2020-07-15 16:21 ` Lorenzo Pieralisi
2020-07-21 8:57 ` Pali Rohár
2020-07-21 10:48 ` Lorenzo Pieralisi