* [PATCH] PCI: xgene: Fix IB window setup
@ 2021-11-29 17:36 ` Rob Herring
From: Rob Herring @ 2021-11-29 17:36 UTC
To: Toan Le, Lorenzo Pieralisi, Krzysztof Wilczyński, Bjorn Helgaas, Andrew Murray
Cc: Stéphane Graber, stable, linux-pci, linux-arm-kernel, linux-kernel

Commit 6dce5aa59e0b ("PCI: xgene: Use inbound resources for setup")
broke PCI support on XGene. The cause is the IB resources are now sorted
in address order instead of being in DT dma-ranges order. The result is
which inbound registers are used for each region are swapped. I don't
know the details about this h/w, but it appears that IB region 0
registers can't handle a size greater than 4GB. In any case, limiting
the size for region 0 is enough to get back to the original assignment
of dma-ranges to regions.

Reported-by: Stéphane Graber <stgraber@ubuntu.com>
Fixes: 6dce5aa59e0b ("PCI: xgene: Use inbound resources for setup")
Link: https://lore.kernel.org/all/CA+enf=v9rY_xnZML01oEgKLmvY1NGBUUhnSJaETmXtDtXfaczA@mail.gmail.com/
Cc: stable@vger.kernel.org # v5.5+
Signed-off-by: Rob Herring <robh@kernel.org>
---
 drivers/pci/controller/pci-xgene.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/pci/controller/pci-xgene.c b/drivers/pci/controller/pci-xgene.c
index 56d0d50338c8..d83dbd977418 100644
--- a/drivers/pci/controller/pci-xgene.c
+++ b/drivers/pci/controller/pci-xgene.c
@@ -465,7 +465,7 @@ static int xgene_pcie_select_ib_reg(u8 *ib_reg_mask, u64 size)
 		return 1;
 	}
 
-	if ((size > SZ_1K) && (size < SZ_1T) && !(*ib_reg_mask & (1 << 0))) {
+	if ((size > SZ_1K) && (size < SZ_4G) && !(*ib_reg_mask & (1 << 0))) {
 		*ib_reg_mask |= (1 << 0);
 		return 0;
 	}
-- 
2.32.0
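The effect of the one-line change can be checked with a small user-space model of the selection logic. This is only a sketch: the middle branch is the one shown in the diff, while the region 1 and region 2 branches, the `region0_limit` parameter, and the helper `assign_two()` are assumptions added for illustration and are not quoted anywhere in this thread. The 512 GiB window matches the upstream range cited later in the thread; the 2 GiB window is a hypothetical second entry.

```c
#include <assert.h>
#include <stdint.h>
#include <stddef.h>

#define SZ_1K  (1ULL << 10)
#define SZ_1M  (1ULL << 20)
#define SZ_16M (1ULL << 24)
#define SZ_4G  (1ULL << 32)
#define SZ_1T  (1ULL << 40)

/*
 * User-space model of xgene_pcie_select_ib_reg(). region0_limit is the
 * upper bound that the patch changes from SZ_1T to SZ_4G.
 */
static int select_ib_reg(uint8_t *ib_reg_mask, uint64_t size,
			 uint64_t region0_limit)
{
	assert(ib_reg_mask != NULL);

	/* Assumed (not in the diff): region 1 takes small windows. */
	if (size > 4 && size < SZ_16M && !(*ib_reg_mask & (1 << 1))) {
		*ib_reg_mask |= (1 << 1);
		return 1;
	}

	/* The branch touched by the patch. */
	if (size > SZ_1K && size < region0_limit && !(*ib_reg_mask & (1 << 0))) {
		*ib_reg_mask |= (1 << 0);
		return 0;
	}

	/* Assumed (not in the diff): region 2 takes the remaining large windows. */
	if (size > SZ_1M && size < SZ_1T && !(*ib_reg_mask & (1 << 2))) {
		*ib_reg_mask |= (1 << 2);
		return 2;
	}

	return -1; /* stands in for -EINVAL */
}

/* Hypothetical helper: assign two windows in the order given. */
static void assign_two(const uint64_t sizes[2], uint64_t region0_limit,
		       int out[2])
{
	uint8_t mask = 0;

	out[0] = select_ib_reg(&mask, sizes[0], region0_limit);
	out[1] = select_ib_reg(&mask, sizes[1], region0_limit);
}
```

Under these assumptions, feeding the windows in address order (512 GiB first, then 2 GiB) with the old `SZ_1T` bound puts the 512 GiB window in region 0 and the 2 GiB window in region 2. With the `SZ_4G` bound the 512 GiB window no longer qualifies for region 0 and falls through to region 2, so the 2 GiB window gets region 0 again, which is the assignment the DT-order walk produced before the regression.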
* Re: [PATCH] PCI: xgene: Fix IB window setup
  2021-11-29 17:36 ` Rob Herring
@ 2021-11-29 19:14 ` Stéphane Graber
From: Stéphane Graber @ 2021-11-29 19:14 UTC
To: Rob Herring
Cc: Toan Le, Lorenzo Pieralisi, Krzysztof Wilczyński, Bjorn Helgaas, Andrew Murray, stable, linux-pci, linux-arm-kernel, linux-kernel

On Mon, Nov 29, 2021 at 12:36 PM Rob Herring <robh@kernel.org> wrote:
>
> Commit 6dce5aa59e0b ("PCI: xgene: Use inbound resources for setup")
> broke PCI support on XGene. The cause is the IB resources are now sorted
> in address order instead of being in DT dma-ranges order. The result is
> which inbound registers are used for each region are swapped. I don't
> know the details about this h/w, but it appears that IB region 0
> registers can't handle a size greater than 4GB. In any case, limiting
> the size for region 0 is enough to get back to the original assignment
> of dma-ranges to regions.
>
> Reported-by: Stéphane Graber <stgraber@ubuntu.com>
> Fixes: 6dce5aa59e0b ("PCI: xgene: Use inbound resources for setup")
> Link: https://lore.kernel.org/all/CA+enf=v9rY_xnZML01oEgKLmvY1NGBUUhnSJaETmXtDtXfaczA@mail.gmail.com/
> Cc: stable@vger.kernel.org # v5.5+
> Signed-off-by: Rob Herring <robh@kernel.org>

I've been running with this exact change on top of the latest 5.12
stable release for a few days now, so can confirm that on my hardware
it's behaving perfectly (on 4 different servers).

Tested-by: Stéphane Graber <stgraber@ubuntu.com>

> ---
>  drivers/pci/controller/pci-xgene.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/pci/controller/pci-xgene.c b/drivers/pci/controller/pci-xgene.c
> index 56d0d50338c8..d83dbd977418 100644
> --- a/drivers/pci/controller/pci-xgene.c
> +++ b/drivers/pci/controller/pci-xgene.c
> @@ -465,7 +465,7 @@ static int xgene_pcie_select_ib_reg(u8 *ib_reg_mask, u64 size)
>  		return 1;
>  	}
>
> -	if ((size > SZ_1K) && (size < SZ_1T) && !(*ib_reg_mask & (1 << 0))) {
> +	if ((size > SZ_1K) && (size < SZ_4G) && !(*ib_reg_mask & (1 << 0))) {
>  		*ib_reg_mask |= (1 << 0);
>  		return 0;
>  	}
> --
> 2.32.0
>
* Re: [PATCH] PCI: xgene: Fix IB window setup
  2021-11-29 17:36 ` Rob Herring
@ 2021-11-30 7:55 ` Krzysztof Wilczyński
From: Krzysztof Wilczyński @ 2021-11-30 7:55 UTC
To: Rob Herring
Cc: Toan Le, Lorenzo Pieralisi, Bjorn Helgaas, Andrew Murray, Stéphane Graber, stable, linux-pci, linux-arm-kernel, linux-kernel

Hi,

> Commit 6dce5aa59e0b ("PCI: xgene: Use inbound resources for setup")
> broke PCI support on XGene. The cause is the IB resources are now sorted
> in address order instead of being in DT dma-ranges order. The result is
> which inbound registers are used for each region are swapped. I don't
> know the details about this h/w, but it appears that IB region 0
> registers can't handle a size greater than 4GB. In any case, limiting
> the size for region 0 is enough to get back to the original assignment
> of dma-ranges to regions.

A small nitpick: it would be "X-Gene" in the above as per Applied
Micro's (or rather MACOM Technology Solutions these days, I suppose)
product line naming.

> @@ -465,7 +465,7 @@ static int xgene_pcie_select_ib_reg(u8 *ib_reg_mask, u64 size)
>  		return 1;
>  	}
>
> -	if ((size > SZ_1K) && (size < SZ_1T) && !(*ib_reg_mask & (1 << 0))) {
> +	if ((size > SZ_1K) && (size < SZ_4G) && !(*ib_reg_mask & (1 << 0))) {
>  		*ib_reg_mask |= (1 << 0);
>  		return 0;
>  	}

Thank you!

Reviewed-by: Krzysztof Wilczyński <kw@linux.com>

Also, thank you Stéphane for testing! Much appreciated!

	Krzysztof
* Re: [PATCH] PCI: xgene: Fix IB window setup
  2021-11-29 17:36 ` Rob Herring
@ 2021-11-30 14:30 ` Lorenzo Pieralisi
From: Lorenzo Pieralisi @ 2021-11-30 14:30 UTC
To: Krzysztof Wilczyński, Andrew Murray, Rob Herring, Toan Le, Bjorn Helgaas
Cc: Lorenzo Pieralisi, linux-kernel, linux-arm-kernel, Stéphane Graber, stable, linux-pci

On Mon, 29 Nov 2021 11:36:37 -0600, Rob Herring wrote:
> Commit 6dce5aa59e0b ("PCI: xgene: Use inbound resources for setup")
> broke PCI support on XGene. The cause is the IB resources are now sorted
> in address order instead of being in DT dma-ranges order. The result is
> which inbound registers are used for each region are swapped. I don't
> know the details about this h/w, but it appears that IB region 0
> registers can't handle a size greater than 4GB. In any case, limiting
> the size for region 0 is enough to get back to the original assignment
> of dma-ranges to regions.
>
> [...]

Applied to pci/xgene, thanks!

[1/1] PCI: xgene: Fix IB window setup
      https://git.kernel.org/lpieralisi/pci/c/c7a75d0782

Thanks,
Lorenzo
* Re: [PATCH] PCI: xgene: Fix IB window setup
  2021-11-29 17:36 ` Rob Herring
@ 2022-02-04 23:01 ` dann frazier
From: dann frazier @ 2022-02-04 23:01 UTC
To: Rob Herring
Cc: Toan Le, Lorenzo Pieralisi, Krzysztof Wilczyński, Bjorn Helgaas, Andrew Murray, Stéphane Graber, stable, linux-pci, linux-arm-kernel, linux-kernel

On Mon, Nov 29, 2021 at 11:36:37AM -0600, Rob Herring wrote:
> Commit 6dce5aa59e0b ("PCI: xgene: Use inbound resources for setup")
> broke PCI support on XGene. The cause is the IB resources are now sorted
> in address order instead of being in DT dma-ranges order. The result is
> which inbound registers are used for each region are swapped. I don't
> know the details about this h/w, but it appears that IB region 0
> registers can't handle a size greater than 4GB. In any case, limiting
> the size for region 0 is enough to get back to the original assignment
> of dma-ranges to regions.

hey Rob!

I've been seeing a panic on HP Moonshot m400 cartridges (X-Gene1) -
only during network installs - that I also bisected down to commit
6dce5aa59e0b ("PCI: xgene: Use inbound resources for setup"). I was
hoping that this patch that fixed the issue on Stéphane's X-Gene2
system would also fix my issue, but no luck. In fact, it seems to just
make it fail differently. Reverting both patches is required to get a
v5.17-rc kernel to boot.

I've collected the following logs - let me know if anything else would
be useful.

1) v5.17-rc2+ (unmodified):
   http://dannf.org/bugs/m400-no-reverts.log
   Note that the mlx4 driver fails initialization.

2) v5.17-rc2+, w/o the commit that fixed Stéphane's system:
   http://dannf.org/bugs/m400-xgene2-fix-reverted.log
   Note the mlx4 MSI-X timeout, and later panic.

3) v5.17-rc2+, w/ both commits reverted (works)
   http://dannf.org/bugs/m400-both-reverted.log

  -dann
* Re: [PATCH] PCI: xgene: Fix IB window setup
  2022-02-04 23:01 ` dann frazier
@ 2022-02-05 16:05 ` Rob Herring
From: Rob Herring @ 2022-02-05 16:05 UTC
To: dann frazier
Cc: Toan Le, Lorenzo Pieralisi, Krzysztof Wilczyński, Bjorn Helgaas, Andrew Murray, Stéphane Graber, stable, PCI, linux-arm-kernel, linux-kernel

On Fri, Feb 4, 2022 at 5:01 PM dann frazier <dann.frazier@canonical.com> wrote:
>
> On Mon, Nov 29, 2021 at 11:36:37AM -0600, Rob Herring wrote:
> > Commit 6dce5aa59e0b ("PCI: xgene: Use inbound resources for setup")
> > broke PCI support on XGene. The cause is the IB resources are now sorted
> > in address order instead of being in DT dma-ranges order. The result is
> > which inbound registers are used for each region are swapped. I don't
> > know the details about this h/w, but it appears that IB region 0
> > registers can't handle a size greater than 4GB. In any case, limiting
> > the size for region 0 is enough to get back to the original assignment
> > of dma-ranges to regions.
>
> hey Rob!
>
> I've been seeing a panic on HP Moonshot m400 cartridges (X-Gene1) -
> only during network installs - that I also bisected down to commit
> 6dce5aa59e0b ("PCI: xgene: Use inbound resources for setup"). I was
> hoping that this patch that fixed the issue on Stéphane's X-Gene2
> system would also fix my issue, but no luck. In fact, it seems to just
> make it fail differently. Reverting both patches is required to get a
> v5.17-rc kernel to boot.
>
> I've collected the following logs - let me know if anything else would
> be useful.
>
> 1) v5.17-rc2+ (unmodified):
>    http://dannf.org/bugs/m400-no-reverts.log
>    Note that the mlx4 driver fails initialization.
>
> 2) v5.17-rc2+, w/o the commit that fixed Stéphane's system:
>    http://dannf.org/bugs/m400-xgene2-fix-reverted.log
>    Note the mlx4 MSI-X timeout, and later panic.
>
> 3) v5.17-rc2+, w/ both commits reverted (works)
>    http://dannf.org/bugs/m400-both-reverted.log

The ranges and dma-ranges addresses don't appear to match up with any
upstream dts files. Can you send me the DT?

Otherwise, we're going to need some debugging added to
xgene_pcie_setup_ib_reg() to see if the register setup changed. I can
come up with something next week.

Rob
* Re: [PATCH] PCI: xgene: Fix IB window setup
  2022-02-05 16:05 ` Rob Herring
@ 2022-02-05 21:12 ` dann frazier
From: dann frazier @ 2022-02-05 21:12 UTC
To: Rob Herring
Cc: Toan Le, Lorenzo Pieralisi, Krzysztof Wilczyński, Bjorn Helgaas, Andrew Murray, Stéphane Graber, stable, PCI, linux-arm-kernel, linux-kernel

On Sat, Feb 5, 2022 at 9:05 AM Rob Herring <robh@kernel.org> wrote:
>
> On Fri, Feb 4, 2022 at 5:01 PM dann frazier <dann.frazier@canonical.com> wrote:
> >
> > On Mon, Nov 29, 2021 at 11:36:37AM -0600, Rob Herring wrote:
> > > Commit 6dce5aa59e0b ("PCI: xgene: Use inbound resources for setup")
> > > broke PCI support on XGene. The cause is the IB resources are now sorted
> > > in address order instead of being in DT dma-ranges order. The result is
> > > which inbound registers are used for each region are swapped. I don't
> > > know the details about this h/w, but it appears that IB region 0
> > > registers can't handle a size greater than 4GB. In any case, limiting
> > > the size for region 0 is enough to get back to the original assignment
> > > of dma-ranges to regions.
> >
> > hey Rob!
> >
> > I've been seeing a panic on HP Moonshot m400 cartridges (X-Gene1) -
> > only during network installs - that I also bisected down to commit
> > 6dce5aa59e0b ("PCI: xgene: Use inbound resources for setup"). I was
> > hoping that this patch that fixed the issue on Stéphane's X-Gene2
> > system would also fix my issue, but no luck. In fact, it seems to just
> > make it fail differently. Reverting both patches is required to get a
> > v5.17-rc kernel to boot.
> >
> > I've collected the following logs - let me know if anything else would
> > be useful.
> >
> > 1) v5.17-rc2+ (unmodified):
> >    http://dannf.org/bugs/m400-no-reverts.log
> >    Note that the mlx4 driver fails initialization.
> >
> > 2) v5.17-rc2+, w/o the commit that fixed Stéphane's system:
> >    http://dannf.org/bugs/m400-xgene2-fix-reverted.log
> >    Note the mlx4 MSI-X timeout, and later panic.
> >
> > 3) v5.17-rc2+, w/ both commits reverted (works)
> >    http://dannf.org/bugs/m400-both-reverted.log
>
> The ranges and dma-ranges addresses don't appear to match up with any
> upstream dts files. Can you send me the DT?

Sure: http://dannf.org/bugs/fdt

  -dann

> Otherwise, we're going to need some debugging added to
> xgene_pcie_setup_ib_reg() to see if the register setup changed. I can
> come up with something next week.
>
> Rob
* Re: [PATCH] PCI: xgene: Fix IB window setup
  2022-02-05 21:12 ` dann frazier
@ 2022-02-07 16:09 ` Rob Herring
From: Rob Herring @ 2022-02-07 16:09 UTC
To: dann frazier
Cc: Toan Le, Lorenzo Pieralisi, Krzysztof Wilczyński, Bjorn Helgaas, Andrew Murray, Stéphane Graber, stable, PCI, linux-arm-kernel, linux-kernel

On Sat, Feb 5, 2022 at 3:13 PM dann frazier <dann.frazier@canonical.com> wrote:
>
> On Sat, Feb 5, 2022 at 9:05 AM Rob Herring <robh@kernel.org> wrote:
> >
> > On Fri, Feb 4, 2022 at 5:01 PM dann frazier <dann.frazier@canonical.com> wrote:
> > >
> > > On Mon, Nov 29, 2021 at 11:36:37AM -0600, Rob Herring wrote:
> > > > Commit 6dce5aa59e0b ("PCI: xgene: Use inbound resources for setup")
> > > > broke PCI support on XGene. The cause is the IB resources are now sorted
> > > > in address order instead of being in DT dma-ranges order. The result is
> > > > which inbound registers are used for each region are swapped. I don't
> > > > know the details about this h/w, but it appears that IB region 0
> > > > registers can't handle a size greater than 4GB. In any case, limiting
> > > > the size for region 0 is enough to get back to the original assignment
> > > > of dma-ranges to regions.
> > >
> > > hey Rob!
> > >
> > > I've been seeing a panic on HP Moonshot m400 cartridges (X-Gene1) -
> > > only during network installs - that I also bisected down to commit
> > > 6dce5aa59e0b ("PCI: xgene: Use inbound resources for setup"). I was
> > > hoping that this patch that fixed the issue on Stéphane's X-Gene2
> > > system would also fix my issue, but no luck. In fact, it seems to just
> > > make it fail differently. Reverting both patches is required to get a
> > > v5.17-rc kernel to boot.
> > >
> > > I've collected the following logs - let me know if anything else would
> > > be useful.
> > >
> > > 1) v5.17-rc2+ (unmodified):
> > >    http://dannf.org/bugs/m400-no-reverts.log
> > >    Note that the mlx4 driver fails initialization.
> > >
> > > 2) v5.17-rc2+, w/o the commit that fixed Stéphane's system:
> > >    http://dannf.org/bugs/m400-xgene2-fix-reverted.log
> > >    Note the mlx4 MSI-X timeout, and later panic.
> > >
> > > 3) v5.17-rc2+, w/ both commits reverted (works)
> > >    http://dannf.org/bugs/m400-both-reverted.log
> >
> > The ranges and dma-ranges addresses don't appear to match up with any
> > upstream dts files. Can you send me the DT?
>
> Sure: http://dannf.org/bugs/fdt

The first fix certainly is a problem. It's going to need something
besides size to key off of (originally it was dependent on order of
dma-ranges entries).

The 2nd issue is the 'dma-ranges' has a second entry that is now ignored:

    dma-ranges = <0x42000000 0x40 0x00 0x40 0x00 0x40 0x00 0x00>,
                 <0x00 0x79000000 0x00 0x79000000 0x00 0x800000>;

Based on the flags (3rd addr cell: 0x0), we have an inbound config
space which the kernel now ignores because inbound config space
accesses make no sense. But clearly some setup is needed. Upstream, in
contrast, sets up a memory range that includes this region, so the
setup does happen:

    <0x42000000 0x00 0x00000000 0x00 0x00000000 0x80 0x00000000>

Minimally, I suspect it will work if you change dma-ranges 2nd entry to:

    <0x42000000 0x79000000 0x00 0x79000000 0x00 0x800000>

While we shouldn't break existing DTs, the moonshot DT doesn't use
what's documented upstream. There are multiple differences compared to
what's documented. Is upstream supposed to support upstream DTs,
downstream DTs, and ACPI for XGene which is an abandoned platform with
only a handful of users?

Rob
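The "flags" discussed above are carried in the high cell of each PCI address in `dma-ranges`. As a rough sketch of how that cell is read, the following decodes the space code and the prefetchable bit following the layout of the standard PCI bus binding (`npt000ss` in the top byte). The enum, the function names, and the self-check helper are hypothetical and are not taken from the driver.

```c
#include <assert.h>
#include <stdint.h>

/* Space codes from the "ss" field (bits 25:24) of the high address cell. */
enum pci_space {
	PCI_SPACE_CONFIG = 0, /* configuration space */
	PCI_SPACE_IO     = 1, /* I/O space */
	PCI_SPACE_MEM32  = 2, /* 32-bit memory space */
	PCI_SPACE_MEM64  = 3, /* 64-bit memory space */
};

/* Extract the two-bit space code. */
static enum pci_space pci_space(uint32_t phys_hi)
{
	return (enum pci_space)((phys_hi >> 24) & 0x3);
}

/* Bit 30 ("p") marks the region as prefetchable. */
static int pci_prefetchable(uint32_t phys_hi)
{
	return (phys_hi >> 30) & 0x1;
}

/* The two flag values quoted in the message above. */
static void check_thread_values(void)
{
	/* 0x42000000: prefetchable 32-bit memory space. */
	assert(pci_space(0x42000000u) == PCI_SPACE_MEM32);
	assert(pci_prefetchable(0x42000000u));

	/* 0x00000000: space code 0, i.e. configuration space. */
	assert(pci_space(0x00000000u) == PCI_SPACE_CONFIG);
	assert(!pci_prefetchable(0x00000000u));
}
```

Read this way, an entry whose flags cell is 0x0 describes configuration space, which is consistent with the kernel skipping it as an inbound window, while 0x42000000 describes a memory window that it does set up.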
* Re: [PATCH] PCI: xgene: Fix IB window setup
  2022-02-07 16:09 ` Rob Herring
@ 2022-02-08  1:19 ` dann frazier
From: dann frazier @ 2022-02-08  1:19 UTC
To: Rob Herring
Cc: Toan Le, Lorenzo Pieralisi, Krzysztof Wilczyński, Bjorn Helgaas,
    Andrew Murray, Stéphane Graber, stable, PCI, linux-arm-kernel,
    linux-kernel

On Mon, Feb 07, 2022 at 10:09:31AM -0600, Rob Herring wrote:
> On Sat, Feb 5, 2022 at 3:13 PM dann frazier <dann.frazier@canonical.com> wrote:
> >
> > On Sat, Feb 5, 2022 at 9:05 AM Rob Herring <robh@kernel.org> wrote:
> > >
> > > On Fri, Feb 4, 2022 at 5:01 PM dann frazier <dann.frazier@canonical.com> wrote:
> > > >
> > > > On Mon, Nov 29, 2021 at 11:36:37AM -0600, Rob Herring wrote:
> > > > > Commit 6dce5aa59e0b ("PCI: xgene: Use inbound resources for setup")
> > > > > broke PCI support on XGene. The cause is the IB resources are now sorted
> > > > > in address order instead of being in DT dma-ranges order. The result is
> > > > > which inbound registers are used for each region are swapped. I don't
> > > > > know the details about this h/w, but it appears that IB region 0
> > > > > registers can't handle a size greater than 4GB. In any case, limiting
> > > > > the size for region 0 is enough to get back to the original assignment
> > > > > of dma-ranges to regions.
> > > >
> > > > hey Rob!
> > > >
> > > > I've been seeing a panic on HP Moonshot m400 cartridges (X-Gene1) -
> > > > only during network installs - that I also bisected down to commit
> > > > 6dce5aa59e0b ("PCI: xgene: Use inbound resources for setup"). I was
> > > > hoping that this patch that fixed the issue on Stéphane's X-Gene2
> > > > system would also fix my issue, but no luck. In fact, it seems to just
> > > > make it fail differently. Reverting both patches is required to get a
> > > > v5.17-rc kernel to boot.
> > > >
> > > > I've collected the following logs - let me know if anything else would
> > > > be useful.
> > > >
> > > > 1) v5.17-rc2+ (unmodified):
> > > >    http://dannf.org/bugs/m400-no-reverts.log
> > > >    Note that the mlx4 driver fails initialization.
> > > >
> > > > 2) v5.17-rc2+, w/o the commit that fixed Stéphane's system:
> > > >    http://dannf.org/bugs/m400-xgene2-fix-reverted.log
> > > >    Note the mlx4 MSI-X timeout, and later panic.
> > > >
> > > > 3) v5.17-rc2+, w/ both commits reverted (works)
> > > >    http://dannf.org/bugs/m400-both-reverted.log
> > >
> > > The ranges and dma-ranges addresses don't appear to match up with any
> > > upstream dts files. Can you send me the DT?
> >
> > Sure: http://dannf.org/bugs/fdt
>
> The first fix certainly is a problem. It's going to need something
> besides size to key off of (originally it was dependent on order of
> dma-ranges entries).
>
> The 2nd issue is the 'dma-ranges' has a second entry that is now ignored:
>
> dma-ranges = <0x42000000 0x40 0x00 0x40 0x00 0x40 0x00 0x00>, <0x00
> 0x79000000 0x00 0x79000000 0x00 0x800000>;
>
> Based on the flags (3rd addr cell: 0x0), we have an inbound config
> space which the kernel now ignores because inbound config space
> accesses make no sense. But clearly some setup is needed. Upstream, in
> contrast, sets up a memory range that includes this region, so the
> setup does happen:
>
> <0x42000000 0x00 0x00000000 0x00 0x00000000 0x80 0x00000000>
>
> Minimally, I suspect it will work if you change dma-ranges 2nd entry to:
>
> <0x42000000 0x79000000 0x00 0x79000000 0x00 0x800000>

Thanks for looking into this Rob. I tried to test that theory, but it
didn't seem to work. This is what I tried:

--- m400.dts 2022-02-07 20:16:44.840475323 +0000
+++ m400.dts.dmaonly 2022-02-08 00:17:54.097132000 +0000
@@ -446,7 +446,7 @@
    reg = <0x00 0x1f2b0000 0x00 0x10000 0xe0 0xd0000000 0x00 0x200000 0x00 0x79e00000 0x00 0x2000000 0x00 0x79000000 0x00 0x800000>;
    reg-names = "csr\0cfg\0msi_gen\0msi_term";
    ranges = <0x1000000 0x00 0x00 0xe0 0x10000000 0x00 0x10000 0x2000000 0x00 0x30000000 0xe1 0x30000000 0x00 0x80000000>;
-   dma-ranges = <0x42000000 0x40 0x00 0x40 0x00 0x40 0x00 0x00 0x00 0x79000000 0x00 0x79000000 0x00 0x800000>;
+   dma-ranges = <0x42000000 0x40 0x00 0x40 0x00 0x40 0x00 0x00 0x42000000 0x79000000 0x00 0x79000000 0x00 0x800000>;
    ib-ranges = <0x42000000 0x40 0x00 0x40 0x00 0x40 0x00 0x00 0x00 0x79000000 0x00 0x79000000 0x00 0x800000>;
    ib-ranges-ep = <0x2000000 0x00 0x00 0x00 0x00 0x00 0x400000 0x2000000 0x00 0x00 0x00 0x00 0x00 0x400000 0x2000000 0x00 0x79000000 0x00 0x79000000 0x00 0x100000>;
    interrupts = <0x00 0x10 0x04>;
@@ -471,7 +471,7 @@
    reg = <0x00 0x1f2c0000 0x00 0x10000 0xd0 0xd0000000 0x00 0x200000 0x00 0x79e00000 0x00 0x2000000 0x00 0x79000000 0x00 0x800000>;
    reg-names = "csr\0cfg\0msi_gen\0msi_term";
    ranges = <0x1000000 0x00 0x00 0xd0 0x10000000 0x00 0x10000 0x2000000 0x00 0x30000000 0xd1 0x30000000 0x00 0x80000000>;
-   dma-ranges = <0x42000000 0x40 0x00 0x40 0x00 0x40 0x00 0x00 0x00 0x79000000 0x00 0x79000000 0x00 0x800000>;
+   dma-ranges = <0x42000000 0x40 0x00 0x40 0x00 0x40 0x00 0x00 0x42000000 0x79000000 0x00 0x79000000 0x00 0x800000>;
    ib-ranges = <0x42000000 0x40 0x00 0x40 0x00 0x40 0x00 0x00 0x00 0x79000000 0x00 0x79000000 0x00 0x800000>;
    ib-ranges-ep = <0x2000000 0x00 0x00 0x00 0x00 0x00 0x400000 0x2000000 0x00 0x00 0x00 0x00 0x00 0x400000 0x2000000 0x00 0x79000000 0x00 0x79000000 0x00 0x100000>;
    interrupts = <0x00 0x10 0x04>;
@@ -496,7 +496,7 @@
    reg = <0x00 0x1f2d0000 0x00 0x10000 0x90 0xd0000000 0x00 0x200000 0x00 0x79e00000 0x00 0x2000000 0x00 0x79000000 0x00 0x800000>;
    reg-names = "csr\0cfg\0msi_gen\0msi_term";
    ranges = <0x1000000 0x00 0x00 0x90 0x10000000 0x00 0x10000 0x2000000 0x00 0x30000000 0x91 0x30000000 0x00 0x80000000>;
-   dma-ranges = <0x42000000 0x40 0x00 0x40 0x00 0x40 0x00 0x00 0x00 0x79000000 0x00 0x79000000 0x00 0x800000>;
+   dma-ranges = <0x42000000 0x40 0x00 0x40 0x00 0x40 0x00 0x00 0x42000000 0x79000000 0x00 0x79000000 0x00 0x800000>;
    ib-ranges = <0x42000000 0x40 0x00 0x40 0x00 0x40 0x00 0x00 0x00 0x79000000 0x00 0x79000000 0x00 0x800000>;
    ib-ranges-ep = <0x2000000 0x00 0x00 0x00 0x00 0x00 0x400000 0x2000000 0x00 0x00 0x00 0x00 0x00 0x400000 0x2000000 0x00 0x79000000 0x00 0x79000000 0x00 0x100000>;
    interrupts = <0x00 0x10 0x04>;
@@ -522,7 +522,7 @@
    reg = <0x00 0x1f500000 0x00 0x10000 0xa0 0xd0000000 0x00 0x200000 0x00 0x79e00000 0x00 0x2000000 0x00 0x79000000 0x00 0x800000>;
    reg-names = "csr\0cfg\0msi_gen\0msi_term";
    ranges = <0x2000000 0x00 0x30000000 0xa1 0x30000000 0x00 0x80000000>;
-   dma-ranges = <0x42000000 0x40 0x00 0x40 0x00 0x40 0x00 0x00 0x00 0x79000000 0x00 0x79000000 0x00 0x800000>;
+   dma-ranges = <0x42000000 0x40 0x00 0x40 0x00 0x40 0x00 0x00 0x42000000 0x79000000 0x00 0x79000000 0x00 0x800000>;
    ib-ranges = <0x42000000 0x40 0x00 0x40 0x00 0x40 0x00 0x00 0x00 0x79000000 0x00 0x79000000 0x00 0x800000>;
    ib-ranges-ep = <0x2000000 0x00 0x00 0x00 0x00 0x00 0x400000 0x2000000 0x00 0x00 0x00 0x00 0x00 0x400000 0x2000000 0x00 0x79000000 0x00 0x79000000 0x00 0x100000>;
    interrupts = <0x00 0x10 0x04>;
@@ -547,7 +547,7 @@
    reg = <0x00 0x1f510000 0x00 0x10000 0xc0 0xd0000000 0x00 0x200000 0x00 0x79e00000 0x00 0x2000000 0x00 0x79000000 0x00 0x800000>;
    reg-names = "csr\0cfg\0msi_gen\0msi_term";
    ranges = <0x1000000 0x00 0x00 0xc0 0x10000000 0x00 0x10000 0x2000000 0x00 0x30000000 0xc1 0x30000000 0x00 0x80000000>;
-   dma-ranges = <0x42000000 0x40 0x00 0x40 0x00 0x40 0x00 0x00 0x00 0x79000000 0x00 0x79000000 0x00 0x800000>;
+   dma-ranges = <0x42000000 0x40 0x00 0x40 0x00 0x40 0x00 0x00 0x42000000 0x79000000 0x00 0x79000000 0x00 0x800000>;
    ib-ranges = <0x42000000 0x40 0x00 0x40 0x00 0x40 0x00 0x00 0x00 0x79000000 0x00 0x79000000 0x00 0x800000>;
    ib-ranges-ep = <0x2000000 0x00 0x00 0x00 0x00 0x00 0x400000 0x2000000 0x00 0x00 0x00 0x00 0x00 0x400000 0x2000000 0x00 0x79000000 0x00 0x79000000 0x00 0x100000>;
    interrupts = <0x00 0x10 0x04>;

And that failed to boot with a 5.17-rc3. Since dma-ranges was
previously identical to ib-ranges, I also tried making the same change
to ib-ranges, but with no success.

> While we shouldn't break existing DTs, the moonshot DT doesn't use
> what's documented upstream. There are multiple differences compared to
> what's documented. Is upstream supposed to support upstream DTs,
> downstream DTs, and ACPI for XGene which is an abandoned platform with
> only a handful of users?

That's a fair question, though it's one of policy, and I feel I'd be
overstepping by weighing in. I suppose one option I have is to try
and create and upstream a dts for these systems and modify our
boot.scr to always load that over the one provided by firmware. While
we do have some of these systems in production, they are being retired
and replaced with newer kit over time, and it's possible we'll never
need to upgrade them to a modern kernel.

  -dann
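To make the flat cell lists in the diff above easier to read, here is a minimal sketch that splits a dma-ranges property into entries. It assumes the usual PCI host bridge layout of 3 child address cells (phys.hi plus a 64-bit PCI address), 2 parent address cells and 2 size cells, so 7 cells per entry. That layout is an assumption, since the thread does not show the node's #address-cells and #size-cells, and the struct and function names are invented for this example.

```c
#include <stddef.h>
#include <stdint.h>

/* One decoded dma-ranges entry (hypothetical type for this sketch). */
struct dma_range {
    uint32_t phys_hi;   /* flags cell: space code, prefetchable bit, ... */
    uint64_t pci_addr;  /* PCI bus address */
    uint64_t cpu_addr;  /* parent (CPU) address */
    uint64_t size;      /* window size in bytes */
};

/* Assumed layout: 3 child address cells + 2 parent address cells + 2 size cells. */
#define CELLS_PER_ENTRY 7

/* Split a flat array of 32-bit cells into entries; returns the entry count. */
size_t split_dma_ranges(const uint32_t *cells, size_t ncells,
                        struct dma_range *out, size_t max_out)
{
    size_t n = 0;

    for (size_t i = 0; i + CELLS_PER_ENTRY <= ncells && n < max_out;
         i += CELLS_PER_ENTRY, n++) {
        out[n].phys_hi  = cells[i];
        out[n].pci_addr = ((uint64_t)cells[i + 1] << 32) | cells[i + 2];
        out[n].cpu_addr = ((uint64_t)cells[i + 3] << 32) | cells[i + 4];
        out[n].size     = ((uint64_t)cells[i + 5] << 32) | cells[i + 6];
    }
    return n;
}
```

Under that 7-cell assumption, the original property splits into a 256 GiB window at 0x40_00000000 with flags 0x42000000 and an 8 MiB window at 0x79000000 with flags 0x00. The modified property would still have 0x00 in the flags cell of its second entry, with 0x42000000 landing in the upper half of the PCI address instead. Whether that is how the firmware-provided tree is really laid out is not confirmed in this thread.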
* Re: [PATCH] PCI: xgene: Fix IB window setup
  2022-02-08  1:19 ` dann frazier
@ 2022-02-08 14:34 ` Rob Herring
From: Rob Herring @ 2022-02-08 14:34 UTC
To: dann frazier
Cc: Toan Le, Lorenzo Pieralisi, Krzysztof Wilczyński, Bjorn Helgaas,
    Andrew Murray, Stéphane Graber, stable, PCI, linux-arm-kernel,
    linux-kernel

On Mon, Feb 7, 2022 at 7:19 PM dann frazier <dann.frazier@canonical.com> wrote:
>
> On Mon, Feb 07, 2022 at 10:09:31AM -0600, Rob Herring wrote:
> > On Sat, Feb 5, 2022 at 3:13 PM dann frazier <dann.frazier@canonical.com> wrote:
> > >
> > > On Sat, Feb 5, 2022 at 9:05 AM Rob Herring <robh@kernel.org> wrote:
> > > >
> > > > On Fri, Feb 4, 2022 at 5:01 PM dann frazier <dann.frazier@canonical.com> wrote:
> > > > >
> > > > > On Mon, Nov 29, 2021 at 11:36:37AM -0600, Rob Herring wrote:
> > > > > > Commit 6dce5aa59e0b ("PCI: xgene: Use inbound resources for setup")
> > > > > > broke PCI support on XGene. The cause is the IB resources are now sorted
> > > > > > in address order instead of being in DT dma-ranges order. The result is
> > > > > > which inbound registers are used for each region are swapped. I don't
> > > > > > know the details about this h/w, but it appears that IB region 0
> > > > > > registers can't handle a size greater than 4GB. In any case, limiting
> > > > > > the size for region 0 is enough to get back to the original assignment
> > > > > > of dma-ranges to regions.
> > > > >
> > > > > hey Rob!
> > > > >
> > > > > I've been seeing a panic on HP Moonshot m400 cartridges (X-Gene1) -
> > > > > only during network installs - that I also bisected down to commit
> > > > > 6dce5aa59e0b ("PCI: xgene: Use inbound resources for setup"). I was
> > > > > hoping that this patch that fixed the issue on Stéphane's X-Gene2
> > > > > system would also fix my issue, but no luck. In fact, it seems to just
> > > > > make it fail differently. Reverting both patches is required to get a
> > > > > v5.17-rc kernel to boot.
> > > > >
> > > > > I've collected the following logs - let me know if anything else would
> > > > > be useful.
> > > > >
> > > > > 1) v5.17-rc2+ (unmodified):
> > > > >    http://dannf.org/bugs/m400-no-reverts.log
> > > > >    Note that the mlx4 driver fails initialization.
> > > > >
> > > > > 2) v5.17-rc2+, w/o the commit that fixed Stéphane's system:
> > > > >    http://dannf.org/bugs/m400-xgene2-fix-reverted.log
> > > > >    Note the mlx4 MSI-X timeout, and later panic.
> > > > >
> > > > > 3) v5.17-rc2+, w/ both commits reverted (works)
> > > > >    http://dannf.org/bugs/m400-both-reverted.log
> > > >
> > > > The ranges and dma-ranges addresses don't appear to match up with any
> > > > upstream dts files. Can you send me the DT?
> > >
> > > Sure: http://dannf.org/bugs/fdt
> >
> > The first fix certainly is a problem. It's going to need something
> > besides size to key off of (originally it was dependent on order of
> > dma-ranges entries).
> >
> > The 2nd issue is the 'dma-ranges' has a second entry that is now ignored:
> >
> > dma-ranges = <0x42000000 0x40 0x00 0x40 0x00 0x40 0x00 0x00>, <0x00
> > 0x79000000 0x00 0x79000000 0x00 0x800000>;
> >
> > Based on the flags (3rd addr cell: 0x0), we have an inbound config
> > space which the kernel now ignores because inbound config space
> > accesses make no sense. But clearly some setup is needed. Upstream, in
> > contrast, sets up a memory range that includes this region, so the
> > setup does happen:
> >
> > <0x42000000 0x00 0x00000000 0x00 0x00000000 0x80 0x00000000>
> >
> > Minimally, I suspect it will work if you change dma-ranges 2nd entry to:
> >
> > <0x42000000 0x79000000 0x00 0x79000000 0x00 0x800000>
>
> Thanks for looking into this Rob. I tried to test that theory, but it
> didn't seem to work. This is what I tried:
>
> --- m400.dts 2022-02-07 20:16:44.840475323 +0000
> +++ m400.dts.dmaonly 2022-02-08 00:17:54.097132000 +0000
> @@ -446,7 +446,7 @@
>     reg = <0x00 0x1f2b0000 0x00 0x10000 0xe0 0xd0000000 0x00 0x200000 0x00 0x79e00000 0x00 0x2000000 0x00 0x79000000 0x00 0x800000>;
>     reg-names = "csr\0cfg\0msi_gen\0msi_term";
>     ranges = <0x1000000 0x00 0x00 0xe0 0x10000000 0x00 0x10000 0x2000000 0x00 0x30000000 0xe1 0x30000000 0x00 0x80000000>;
> -   dma-ranges = <0x42000000 0x40 0x00 0x40 0x00 0x40 0x00 0x00 0x00 0x79000000 0x00 0x79000000 0x00 0x800000>;
> +   dma-ranges = <0x42000000 0x40 0x00 0x40 0x00 0x40 0x00 0x00 0x42000000 0x79000000 0x00 0x79000000 0x00 0x800000>;
>     ib-ranges = <0x42000000 0x40 0x00 0x40 0x00 0x40 0x00 0x00 0x00 0x79000000 0x00 0x79000000 0x00 0x800000>;
>     ib-ranges-ep = <0x2000000 0x00 0x00 0x00 0x00 0x00 0x400000 0x2000000 0x00 0x00 0x00 0x00 0x00 0x400000 0x2000000 0x00 0x79000000 0x00 0x79000000 0x00 0x100000>;
>     interrupts = <0x00 0x10 0x04>;
> @@ -471,7 +471,7 @@
>     reg = <0x00 0x1f2c0000 0x00 0x10000 0xd0 0xd0000000 0x00 0x200000 0x00 0x79e00000 0x00 0x2000000 0x00 0x79000000 0x00 0x800000>;
>     reg-names = "csr\0cfg\0msi_gen\0msi_term";
>     ranges = <0x1000000 0x00 0x00 0xd0 0x10000000 0x00 0x10000 0x2000000 0x00 0x30000000 0xd1 0x30000000 0x00 0x80000000>;
> -   dma-ranges = <0x42000000 0x40 0x00 0x40 0x00 0x40 0x00 0x00 0x00 0x79000000 0x00 0x79000000 0x00 0x800000>;
> +   dma-ranges = <0x42000000 0x40 0x00 0x40 0x00 0x40 0x00 0x00 0x42000000 0x79000000 0x00 0x79000000 0x00 0x800000>;
>     ib-ranges = <0x42000000 0x40 0x00 0x40 0x00 0x40 0x00 0x00 0x00 0x79000000 0x00 0x79000000 0x00 0x800000>;
>     ib-ranges-ep = <0x2000000 0x00 0x00 0x00 0x00 0x00 0x400000 0x2000000 0x00 0x00 0x00 0x00 0x00 0x400000 0x2000000 0x00 0x79000000 0x00 0x79000000 0x00 0x100000>;
>     interrupts = <0x00 0x10 0x04>;
> @@ -496,7 +496,7 @@
>     reg = <0x00 0x1f2d0000 0x00 0x10000 0x90 0xd0000000 0x00 0x200000 0x00 0x79e00000 0x00 0x2000000 0x00 0x79000000 0x00 0x800000>;
>     reg-names = "csr\0cfg\0msi_gen\0msi_term";
>     ranges = <0x1000000 0x00 0x00 0x90 0x10000000 0x00 0x10000 0x2000000 0x00 0x30000000 0x91 0x30000000 0x00 0x80000000>;
> -   dma-ranges = <0x42000000 0x40 0x00 0x40 0x00 0x40 0x00 0x00 0x00 0x79000000 0x00 0x79000000 0x00 0x800000>;
> +   dma-ranges = <0x42000000 0x40 0x00 0x40 0x00 0x40 0x00 0x00 0x42000000 0x79000000 0x00 0x79000000 0x00 0x800000>;
>     ib-ranges = <0x42000000 0x40 0x00 0x40 0x00 0x40 0x00 0x00 0x00 0x79000000 0x00 0x79000000 0x00 0x800000>;
>     ib-ranges-ep = <0x2000000 0x00 0x00 0x00 0x00 0x00 0x400000 0x2000000 0x00 0x00 0x00 0x00 0x00 0x400000 0x2000000 0x00 0x79000000 0x00 0x79000000 0x00 0x100000>;
>     interrupts = <0x00 0x10 0x04>;
> @@ -522,7 +522,7 @@
>     reg = <0x00 0x1f500000 0x00 0x10000 0xa0 0xd0000000 0x00 0x200000 0x00 0x79e00000 0x00 0x2000000 0x00 0x79000000 0x00 0x800000>;
>     reg-names = "csr\0cfg\0msi_gen\0msi_term";
>     ranges = <0x2000000 0x00 0x30000000 0xa1 0x30000000 0x00 0x80000000>;
> -   dma-ranges = <0x42000000 0x40 0x00 0x40 0x00 0x40 0x00 0x00 0x00 0x79000000 0x00 0x79000000 0x00 0x800000>;
> +   dma-ranges = <0x42000000 0x40 0x00 0x40 0x00 0x40 0x00 0x00 0x42000000 0x79000000 0x00 0x79000000 0x00 0x800000>;
>     ib-ranges = <0x42000000 0x40 0x00 0x40 0x00 0x40 0x00 0x00 0x00 0x79000000 0x00 0x79000000 0x00 0x800000>;
>     ib-ranges-ep = <0x2000000 0x00 0x00 0x00 0x00 0x00 0x400000 0x2000000 0x00 0x00 0x00 0x00 0x00 0x400000 0x2000000 0x00 0x79000000 0x00 0x79000000 0x00 0x100000>;
>     interrupts = <0x00 0x10 0x04>;
> @@ -547,7 +547,7 @@
>     reg = <0x00 0x1f510000 0x00 0x10000 0xc0 0xd0000000 0x00 0x200000 0x00 0x79e00000 0x00 0x2000000 0x00 0x79000000 0x00 0x800000>;
>     reg-names = "csr\0cfg\0msi_gen\0msi_term";
>     ranges = <0x1000000 0x00 0x00 0xc0 0x10000000 0x00 0x10000 0x2000000 0x00 0x30000000 0xc1 0x30000000 0x00 0x80000000>;
> -   dma-ranges = <0x42000000 0x40 0x00 0x40 0x00 0x40 0x00 0x00 0x00 0x79000000 0x00 0x79000000 0x00 0x800000>;
> +   dma-ranges = <0x42000000 0x40 0x00 0x40 0x00 0x40 0x00 0x00 0x42000000 0x79000000 0x00 0x79000000 0x00 0x800000>;
>     ib-ranges = <0x42000000 0x40 0x00 0x40 0x00 0x40 0x00 0x00 0x00 0x79000000 0x00 0x79000000 0x00 0x800000>;
>     ib-ranges-ep = <0x2000000 0x00 0x00 0x00 0x00 0x00 0x400000 0x2000000 0x00 0x00 0x00 0x00 0x00 0x400000 0x2000000 0x00 0x79000000 0x00 0x79000000 0x00 0x100000>;
>     interrupts = <0x00 0x10 0x04>;
>
> And that failed to boot with a 5.17-rc3. Since dma-ranges was
> previously identical to ib-ranges, I also tried making the same change
> to ib-ranges, but with no success.

Failed to boot at all or just PCIe still didn't work causing boot to
eventually fail?

'ib-ranges' is unknown to the kernel, so the firmware is using it somehow?

You also need to revert the first fix for PCIe to work.

> > While we shouldn't break existing DTs, the moonshot DT doesn't use
> > what's documented upstream. There are multiple differences compared to
> > what's documented. Is upstream supposed to support upstream DTs,
> > downstream DTs, and ACPI for XGene which is an abandoned platform with
> > only a handful of users?
>
> That's a fair question, though it's one of policy, and I feel I'd be
> overstepping by weighing in. I suppose one option I have is to try
> and create and upstream a dts for these systems and modify our
> boot.scr to always load that over the one provided by firmware. While
> we do have some of these systems in production, they are being retired
> and replaced with newer kit over time, and it's possible we'll never
> need to upgrade them to a modern kernel.
>
>   -dann
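The point about the first fix keying off size can be made concrete with a small user-space model of the region selection. Only the region 0 condition (size between SZ_1K and the old SZ_1T or new SZ_4G limit) is quoted in the patch. The region 1 and region 2 thresholds, the function name and the `region0_limit` parameter below are assumptions made for illustration and may not match the driver exactly.

```c
#include <stdint.h>

#define SZ_1K  (1ULL << 10)
#define SZ_1M  (1ULL << 20)
#define SZ_16M (1ULL << 24)
#define SZ_4G  (1ULL << 32)
#define SZ_1T  (1ULL << 40)

/*
 * Simplified model of picking an inbound (IB) region purely by window size.
 * 'mask' tracks which regions are already taken. 'region0_limit' is SZ_1T
 * before the fix and SZ_4G after it. Returns the region index, or -1 if no
 * free region fits.
 */
int select_ib_reg(uint8_t *mask, uint64_t size, uint64_t region0_limit)
{
    /* Assumed: region 1 takes small windows (below 16 MiB). */
    if (size > 4 && size < SZ_16M && !(*mask & (1u << 1))) {
        *mask |= (1u << 1);
        return 1;
    }

    /* From the patch: region 0, with the upper bound under discussion. */
    if (size > SZ_1K && size < region0_limit && !(*mask & (1u << 0))) {
        *mask |= (1u << 0);
        return 0;
    }

    /* Assumed: region 2 takes the remaining large windows (below 1 TiB). */
    if (size > SZ_1M && size < SZ_1T && !(*mask & (1u << 2))) {
        *mask |= (1u << 2);
        return 2;
    }

    return -1;
}
```

In this model, the m400's 256 GiB window (size cells 0x40 0x00) lands in region 0 with the old SZ_1T limit but moves to region 2 once the limit is SZ_4G. That would be consistent with the fix changing how the m400 fails and with the advice to revert it when testing the DT change, though the model does not show what the hardware then does.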
* Re: [PATCH] PCI: xgene: Fix IB window setup @ 2022-02-08 14:34 ` Rob Herring 0 siblings, 0 replies; 27+ messages in thread From: Rob Herring @ 2022-02-08 14:34 UTC (permalink / raw) To: dann frazier Cc: Toan Le, Lorenzo Pieralisi, Krzysztof Wilczyński, Bjorn Helgaas, Andrew Murray, Stéphane Graber, stable, PCI, linux-arm-kernel, linux-kernel On Mon, Feb 7, 2022 at 7:19 PM dann frazier <dann.frazier@canonical.com> wrote: > > On Mon, Feb 07, 2022 at 10:09:31AM -0600, Rob Herring wrote: > > On Sat, Feb 5, 2022 at 3:13 PM dann frazier <dann.frazier@canonical.com> wrote: > > > > > > On Sat, Feb 5, 2022 at 9:05 AM Rob Herring <robh@kernel.org> wrote: > > > > > > > > On Fri, Feb 4, 2022 at 5:01 PM dann frazier <dann.frazier@canonical.com> wrote: > > > > > > > > > > On Mon, Nov 29, 2021 at 11:36:37AM -0600, Rob Herring wrote: > > > > > > Commit 6dce5aa59e0b ("PCI: xgene: Use inbound resources for setup") > > > > > > broke PCI support on XGene. The cause is the IB resources are now sorted > > > > > > in address order instead of being in DT dma-ranges order. The result is > > > > > > which inbound registers are used for each region are swapped. I don't > > > > > > know the details about this h/w, but it appears that IB region 0 > > > > > > registers can't handle a size greater than 4GB. In any case, limiting > > > > > > the size for region 0 is enough to get back to the original assignment > > > > > > of dma-ranges to regions. > > > > > > > > > > hey Rob! > > > > > > > > > > I've been seeing a panic on HP Moonshoot m400 cartridges (X-Gene1) - > > > > > only during network installs - that I also bisected down to commit > > > > > 6dce5aa59e0b ("PCI: xgene: Use inbound resources for setup"). I was > > > > > hoping that this patch that fixed the issue on Stéphane's X-Gene2 > > > > > system would also fix my issue, but no luck. In fact, it seems to just > > > > > makes it fail differently. Reverting both patches is required to get a > > > > > v5.17-rc kernel to boot. 
> > > > > > > > > > I've collected the following logs - let me know if anything else would > > > > > be useful. > > > > > > > > > > 1) v5.17-rc2+ (unmodified): > > > > > http://dannf.org/bugs/m400-no-reverts.log > > > > > Note that the mlx4 driver fails initialization. > > > > > > > > > > 2) v5.17-rc2+, w/o the commit that fixed Stéphane's system: > > > > > http://dannf.org/bugs/m400-xgene2-fix-reverted.log > > > > > Note the mlx4 MSI-X timeout, and later panic. > > > > > > > > > > 3) v5.17-rc2+, w/ both commits reverted (works) > > > > > http://dannf.org/bugs/m400-both-reverted.log > > > > > > > > The ranges and dma-ranges addresses don't appear to match up with any > > > > upstream dts files. Can you send me the DT? > > > > > > Sure: http://dannf.org/bugs/fdt > > > > The first fix certainly is a problem. It's going to need something > > besides size to key off of (originally it was dependent on order of > > dma-ranges entries). > > > > The 2nd issue is the 'dma-ranges' has a second entry that is now ignored: > > > > dma-ranges = <0x42000000 0x40 0x00 0x40 0x00 0x40 0x00 0x00>, <0x00 > > 0x79000000 0x00 0x79000000 0x00 0x800000>; > > > > Based on the flags (3rd addr cell: 0x0), we have an inbound config > > space which the kernel now ignores because inbound config space > > accesses make no sense. But clearly some setup is needed. Upstream, in > > contrast, sets up a memory range that includes this region, so the > > setup does happen: > > > > <0x42000000 0x00 0x00000000 0x00 0x00000000 0x80 0x00000000> > > > > Minimally, I suspect it will work if you change dma-ranges 2nd entry to: > > > > <0x42000000 0x79000000 0x00 0x79000000 0x00 0x800000> > > Thanks for looking into this Rob. I tried to test that theory, but it > didn't seem to work. 
This is what I tried: > > --- m400.dts 2022-02-07 20:16:44.840475323 +0000 > +++ m400.dts.dmaonly 2022-02-08 00:17:54.097132000 +0000 > @@ -446,7 +446,7 @@ > reg = <0x00 0x1f2b0000 0x00 0x10000 0xe0 0xd0000000 0x00 0x200000 0x00 0x79e00000 0x00 0x2000000 0x00 0x79000000 0x00 0x800000>; > reg-names = "csr\0cfg\0msi_gen\0msi_term"; > ranges = <0x1000000 0x00 0x00 0xe0 0x10000000 0x00 0x10000 0x2000000 0x00 0x30000000 0xe1 0x30000000 0x00 0x80000000>; > - dma-ranges = <0x42000000 0x40 0x00 0x40 0x00 0x40 0x00 0x00 0x00 0x79000000 0x00 0x79000000 0x00 0x800000>; > + dma-ranges = <0x42000000 0x40 0x00 0x40 0x00 0x40 0x00 0x00 0x42000000 0x79000000 0x00 0x79000000 0x00 0x800000>; > ib-ranges = <0x42000000 0x40 0x00 0x40 0x00 0x40 0x00 0x00 0x00 0x79000000 0x00 0x79000000 0x00 0x800000>; > ib-ranges-ep = <0x2000000 0x00 0x00 0x00 0x00 0x00 0x400000 0x2000000 0x00 0x00 0x00 0x00 0x00 0x400000 0x2000000 0x00 0x79000000 0x00 0x79000000 0x00 0x100000>; > interrupts = <0x00 0x10 0x04>; > @@ -471,7 +471,7 @@ > reg = <0x00 0x1f2c0000 0x00 0x10000 0xd0 0xd0000000 0x00 0x200000 0x00 0x79e00000 0x00 0x2000000 0x00 0x79000000 0x00 0x800000>; > reg-names = "csr\0cfg\0msi_gen\0msi_term"; > ranges = <0x1000000 0x00 0x00 0xd0 0x10000000 0x00 0x10000 0x2000000 0x00 0x30000000 0xd1 0x30000000 0x00 0x80000000>; > - dma-ranges = <0x42000000 0x40 0x00 0x40 0x00 0x40 0x00 0x00 0x00 0x79000000 0x00 0x79000000 0x00 0x800000>; > + dma-ranges = <0x42000000 0x40 0x00 0x40 0x00 0x40 0x00 0x00 0x42000000 0x79000000 0x00 0x79000000 0x00 0x800000>; > ib-ranges = <0x42000000 0x40 0x00 0x40 0x00 0x40 0x00 0x00 0x00 0x79000000 0x00 0x79000000 0x00 0x800000>; > ib-ranges-ep = <0x2000000 0x00 0x00 0x00 0x00 0x00 0x400000 0x2000000 0x00 0x00 0x00 0x00 0x00 0x400000 0x2000000 0x00 0x79000000 0x00 0x79000000 0x00 0x100000>; > interrupts = <0x00 0x10 0x04>; > @@ -496,7 +496,7 @@ > reg = <0x00 0x1f2d0000 0x00 0x10000 0x90 0xd0000000 0x00 0x200000 0x00 0x79e00000 0x00 0x2000000 0x00 0x79000000 0x00 0x800000>; > 
reg-names = "csr\0cfg\0msi_gen\0msi_term"; > ranges = <0x1000000 0x00 0x00 0x90 0x10000000 0x00 0x10000 0x2000000 0x00 0x30000000 0x91 0x30000000 0x00 0x80000000>; > - dma-ranges = <0x42000000 0x40 0x00 0x40 0x00 0x40 0x00 0x00 0x00 0x79000000 0x00 0x79000000 0x00 0x800000>; > + dma-ranges = <0x42000000 0x40 0x00 0x40 0x00 0x40 0x00 0x00 0x42000000 0x79000000 0x00 0x79000000 0x00 0x800000>; > ib-ranges = <0x42000000 0x40 0x00 0x40 0x00 0x40 0x00 0x00 0x00 0x79000000 0x00 0x79000000 0x00 0x800000>; > ib-ranges-ep = <0x2000000 0x00 0x00 0x00 0x00 0x00 0x400000 0x2000000 0x00 0x00 0x00 0x00 0x00 0x400000 0x2000000 0x00 0x79000000 0x00 0x79000000 0x00 0x100000>; > interrupts = <0x00 0x10 0x04>; > @@ -522,7 +522,7 @@ > reg = <0x00 0x1f500000 0x00 0x10000 0xa0 0xd0000000 0x00 0x200000 0x00 0x79e00000 0x00 0x2000000 0x00 0x79000000 0x00 0x800000>; > reg-names = "csr\0cfg\0msi_gen\0msi_term"; > ranges = <0x2000000 0x00 0x30000000 0xa1 0x30000000 0x00 0x80000000>; > - dma-ranges = <0x42000000 0x40 0x00 0x40 0x00 0x40 0x00 0x00 0x00 0x79000000 0x00 0x79000000 0x00 0x800000>; > + dma-ranges = <0x42000000 0x40 0x00 0x40 0x00 0x40 0x00 0x00 0x42000000 0x79000000 0x00 0x79000000 0x00 0x800000>; > ib-ranges = <0x42000000 0x40 0x00 0x40 0x00 0x40 0x00 0x00 0x00 0x79000000 0x00 0x79000000 0x00 0x800000>; > ib-ranges-ep = <0x2000000 0x00 0x00 0x00 0x00 0x00 0x400000 0x2000000 0x00 0x00 0x00 0x00 0x00 0x400000 0x2000000 0x00 0x79000000 0x00 0x79000000 0x00 0x100000>; > interrupts = <0x00 0x10 0x04>; > @@ -547,7 +547,7 @@ > reg = <0x00 0x1f510000 0x00 0x10000 0xc0 0xd0000000 0x00 0x200000 0x00 0x79e00000 0x00 0x2000000 0x00 0x79000000 0x00 0x800000>; > reg-names = "csr\0cfg\0msi_gen\0msi_term"; > ranges = <0x1000000 0x00 0x00 0xc0 0x10000000 0x00 0x10000 0x2000000 0x00 0x30000000 0xc1 0x30000000 0x00 0x80000000>; > - dma-ranges = <0x42000000 0x40 0x00 0x40 0x00 0x40 0x00 0x00 0x00 0x79000000 0x00 0x79000000 0x00 0x800000>; > + dma-ranges = <0x42000000 0x40 0x00 0x40 0x00 0x40 0x00 
0x00 0x42000000 0x79000000 0x00 0x79000000 0x00 0x800000>; > ib-ranges = <0x42000000 0x40 0x00 0x40 0x00 0x40 0x00 0x00 0x00 0x79000000 0x00 0x79000000 0x00 0x800000>; > ib-ranges-ep = <0x2000000 0x00 0x00 0x00 0x00 0x00 0x400000 0x2000000 0x00 0x00 0x00 0x00 0x00 0x400000 0x2000000 0x00 0x79000000 0x00 0x79000000 0x00 0x100000>; > interrupts = <0x00 0x10 0x04>; > > And that failed to boot with a 5.17-rc3. Since dma-ranges was > previously identical to ib-ranges, I also tried making the same change > to ib-ranges, but with no success. Failed to boot at all or just PCIe still didn't work causing boot to eventually fail? 'ib-ranges' is unknown to the kernel, so the firmware is using it somehow? You also need to revert the first fix for PCIe to work. > > While we shouldn't break existing DTs, the moonshot DT doesn't use > > what's documented upstream. There are multiple differences compared to > > what's documented. Is upstream supposed to support upstream DTs, > > downstream DTs, and ACPI for XGene which is an abandoned platform with > > only a handful of users? > > That's a fair question, though it's one of a policy, and I feel I'd be > overstepping by weighing in. I suppose one option I have is to try > and create and upstream a dts for these systems and modify our > boot.scr to always load that over the one provided by firmware. While > we do have some of these systems in production, they are being retired > and replaced with newer kit over time, and it's possible we'll never > need to upgrade them to a modern kernel. > > -dann ^ permalink raw reply [flat|nested] 27+ messages in thread
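The flags argument quoted above (0x42000000 for the first dma-ranges entry, 0x00 for the second) can be checked by hand. The sketch below is illustrative only and is not code from the driver: it assumes the usual PCI bus binding layout for the first cell of each entry (phys.hi), where bits 25:24 hold the address-space code and bit 30 is the prefetchable bit. The names pci_flags_space, pci_flags_prefetchable and check_thread_examples are made up for this note.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Address-space codes held in bits 25:24 of phys.hi.
 * Assumption: layout as described by the PCI bus binding for Open Firmware.
 */
enum pci_space {
	PCI_SPACE_CONFIG = 0,	/* configuration space */
	PCI_SPACE_IO     = 1,	/* I/O space */
	PCI_SPACE_MEM32  = 2,	/* 32-bit memory space */
	PCI_SPACE_MEM64  = 3,	/* 64-bit memory space */
};

/* Extract the address-space code from the flags cell. */
enum pci_space pci_flags_space(uint32_t phys_hi)
{
	return (enum pci_space)((phys_hi >> 24) & 0x3u);
}

/* Bit 30 marks the region as prefetchable. */
bool pci_flags_prefetchable(uint32_t phys_hi)
{
	return ((phys_hi >> 30) & 0x1u) != 0;
}

/* Self-check against the two flag values discussed in this thread. */
void check_thread_examples(void)
{
	/* First m400 dma-ranges entry: prefetchable 32-bit memory. */
	assert(pci_flags_space(0x42000000u) == PCI_SPACE_MEM32);
	assert(pci_flags_prefetchable(0x42000000u));

	/* Second m400 dma-ranges entry: decodes as config space. */
	assert(pci_flags_space(0x00u) == PCI_SPACE_CONFIG);
}
```

Under that assumption, changing the second entry's first cell from 0x00 to 0x42000000, as in the diff above, turns it from a config-space entry into a prefetchable memory entry, which is the only thing the suggested DT tweak alters.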
* Re: [PATCH] PCI: xgene: Fix IB window setup 2022-02-08 14:34 ` Rob Herring @ 2022-02-11 2:16 ` dann frazier -1 siblings, 0 replies; 27+ messages in thread From: dann frazier @ 2022-02-11 2:16 UTC (permalink / raw) To: Rob Herring Cc: Toan Le, Lorenzo Pieralisi, Krzysztof Wilczyński, Bjorn Helgaas, Andrew Murray, Stéphane Graber, stable, PCI, linux-arm-kernel, linux-kernel On Tue, Feb 08, 2022 at 08:34:45AM -0600, Rob Herring wrote: > On Mon, Feb 7, 2022 at 7:19 PM dann frazier <dann.frazier@canonical.com> wrote: > > > > On Mon, Feb 07, 2022 at 10:09:31AM -0600, Rob Herring wrote: > > > On Sat, Feb 5, 2022 at 3:13 PM dann frazier <dann.frazier@canonical.com> wrote: > > > > > > > > On Sat, Feb 5, 2022 at 9:05 AM Rob Herring <robh@kernel.org> wrote: > > > > > > > > > > On Fri, Feb 4, 2022 at 5:01 PM dann frazier <dann.frazier@canonical.com> wrote: > > > > > > > > > > > > On Mon, Nov 29, 2021 at 11:36:37AM -0600, Rob Herring wrote: > > > > > > > Commit 6dce5aa59e0b ("PCI: xgene: Use inbound resources for setup") > > > > > > > broke PCI support on XGene. The cause is the IB resources are now sorted > > > > > > > in address order instead of being in DT dma-ranges order. The result is > > > > > > > which inbound registers are used for each region are swapped. I don't > > > > > > > know the details about this h/w, but it appears that IB region 0 > > > > > > > registers can't handle a size greater than 4GB. In any case, limiting > > > > > > > the size for region 0 is enough to get back to the original assignment > > > > > > > of dma-ranges to regions. > > > > > > > > > > > > hey Rob! > > > > > > > > > > > > I've been seeing a panic on HP Moonshoot m400 cartridges (X-Gene1) - > > > > > > only during network installs - that I also bisected down to commit > > > > > > 6dce5aa59e0b ("PCI: xgene: Use inbound resources for setup"). I was > > > > > > hoping that this patch that fixed the issue on Stéphane's X-Gene2 > > > > > > system would also fix my issue, but no luck. 
In fact, it seems to just > > > > > > makes it fail differently. Reverting both patches is required to get a > > > > > > v5.17-rc kernel to boot. > > > > > > > > > > > > I've collected the following logs - let me know if anything else would > > > > > > be useful. > > > > > > > > > > > > 1) v5.17-rc2+ (unmodified): > > > > > > http://dannf.org/bugs/m400-no-reverts.log > > > > > > Note that the mlx4 driver fails initialization. > > > > > > > > > > > > 2) v5.17-rc2+, w/o the commit that fixed Stéphane's system: > > > > > > http://dannf.org/bugs/m400-xgene2-fix-reverted.log > > > > > > Note the mlx4 MSI-X timeout, and later panic. > > > > > > > > > > > > 3) v5.17-rc2+, w/ both commits reverted (works) > > > > > > http://dannf.org/bugs/m400-both-reverted.log > > > > > > > > > > The ranges and dma-ranges addresses don't appear to match up with any > > > > > upstream dts files. Can you send me the DT? > > > > > > > > Sure: http://dannf.org/bugs/fdt > > > > > > The first fix certainly is a problem. It's going to need something > > > besides size to key off of (originally it was dependent on order of > > > dma-ranges entries). > > > > > > The 2nd issue is the 'dma-ranges' has a second entry that is now ignored: > > > > > > dma-ranges = <0x42000000 0x40 0x00 0x40 0x00 0x40 0x00 0x00>, <0x00 > > > 0x79000000 0x00 0x79000000 0x00 0x800000>; > > > > > > Based on the flags (3rd addr cell: 0x0), we have an inbound config > > > space which the kernel now ignores because inbound config space > > > accesses make no sense. But clearly some setup is needed. Upstream, in > > > contrast, sets up a memory range that includes this region, so the > > > setup does happen: > > > > > > <0x42000000 0x00 0x00000000 0x00 0x00000000 0x80 0x00000000> > > > > > > Minimally, I suspect it will work if you change dma-ranges 2nd entry to: > > > > > > <0x42000000 0x79000000 0x00 0x79000000 0x00 0x800000> > > > > Thanks for looking into this Rob. 
I tried to test that theory, but it > > didn't seem to work. This is what I tried: > > > > --- m400.dts 2022-02-07 20:16:44.840475323 +0000 > > +++ m400.dts.dmaonly 2022-02-08 00:17:54.097132000 +0000 > > @@ -446,7 +446,7 @@ > > reg = <0x00 0x1f2b0000 0x00 0x10000 0xe0 0xd0000000 0x00 0x200000 0x00 0x79e00000 0x00 0x2000000 0x00 0x79000000 0x00 0x800000>; > > reg-names = "csr\0cfg\0msi_gen\0msi_term"; > > ranges = <0x1000000 0x00 0x00 0xe0 0x10000000 0x00 0x10000 0x2000000 0x00 0x30000000 0xe1 0x30000000 0x00 0x80000000>; > > - dma-ranges = <0x42000000 0x40 0x00 0x40 0x00 0x40 0x00 0x00 0x00 0x79000000 0x00 0x79000000 0x00 0x800000>; > > + dma-ranges = <0x42000000 0x40 0x00 0x40 0x00 0x40 0x00 0x00 0x42000000 0x79000000 0x00 0x79000000 0x00 0x800000>; > > ib-ranges = <0x42000000 0x40 0x00 0x40 0x00 0x40 0x00 0x00 0x00 0x79000000 0x00 0x79000000 0x00 0x800000>; > > ib-ranges-ep = <0x2000000 0x00 0x00 0x00 0x00 0x00 0x400000 0x2000000 0x00 0x00 0x00 0x00 0x00 0x400000 0x2000000 0x00 0x79000000 0x00 0x79000000 0x00 0x100000>; > > interrupts = <0x00 0x10 0x04>; > > @@ -471,7 +471,7 @@ > > reg = <0x00 0x1f2c0000 0x00 0x10000 0xd0 0xd0000000 0x00 0x200000 0x00 0x79e00000 0x00 0x2000000 0x00 0x79000000 0x00 0x800000>; > > reg-names = "csr\0cfg\0msi_gen\0msi_term"; > > ranges = <0x1000000 0x00 0x00 0xd0 0x10000000 0x00 0x10000 0x2000000 0x00 0x30000000 0xd1 0x30000000 0x00 0x80000000>; > > - dma-ranges = <0x42000000 0x40 0x00 0x40 0x00 0x40 0x00 0x00 0x00 0x79000000 0x00 0x79000000 0x00 0x800000>; > > + dma-ranges = <0x42000000 0x40 0x00 0x40 0x00 0x40 0x00 0x00 0x42000000 0x79000000 0x00 0x79000000 0x00 0x800000>; > > ib-ranges = <0x42000000 0x40 0x00 0x40 0x00 0x40 0x00 0x00 0x00 0x79000000 0x00 0x79000000 0x00 0x800000>; > > ib-ranges-ep = <0x2000000 0x00 0x00 0x00 0x00 0x00 0x400000 0x2000000 0x00 0x00 0x00 0x00 0x00 0x400000 0x2000000 0x00 0x79000000 0x00 0x79000000 0x00 0x100000>; > > interrupts = <0x00 0x10 0x04>; > > @@ -496,7 +496,7 @@ > > reg = <0x00 0x1f2d0000 
0x00 0x10000 0x90 0xd0000000 0x00 0x200000 0x00 0x79e00000 0x00 0x2000000 0x00 0x79000000 0x00 0x800000>; > > reg-names = "csr\0cfg\0msi_gen\0msi_term"; > > ranges = <0x1000000 0x00 0x00 0x90 0x10000000 0x00 0x10000 0x2000000 0x00 0x30000000 0x91 0x30000000 0x00 0x80000000>; > > - dma-ranges = <0x42000000 0x40 0x00 0x40 0x00 0x40 0x00 0x00 0x00 0x79000000 0x00 0x79000000 0x00 0x800000>; > > + dma-ranges = <0x42000000 0x40 0x00 0x40 0x00 0x40 0x00 0x00 0x42000000 0x79000000 0x00 0x79000000 0x00 0x800000>; > > ib-ranges = <0x42000000 0x40 0x00 0x40 0x00 0x40 0x00 0x00 0x00 0x79000000 0x00 0x79000000 0x00 0x800000>; > > ib-ranges-ep = <0x2000000 0x00 0x00 0x00 0x00 0x00 0x400000 0x2000000 0x00 0x00 0x00 0x00 0x00 0x400000 0x2000000 0x00 0x79000000 0x00 0x79000000 0x00 0x100000>; > > interrupts = <0x00 0x10 0x04>; > > @@ -522,7 +522,7 @@ > > reg = <0x00 0x1f500000 0x00 0x10000 0xa0 0xd0000000 0x00 0x200000 0x00 0x79e00000 0x00 0x2000000 0x00 0x79000000 0x00 0x800000>; > > reg-names = "csr\0cfg\0msi_gen\0msi_term"; > > ranges = <0x2000000 0x00 0x30000000 0xa1 0x30000000 0x00 0x80000000>; > > - dma-ranges = <0x42000000 0x40 0x00 0x40 0x00 0x40 0x00 0x00 0x00 0x79000000 0x00 0x79000000 0x00 0x800000>; > > + dma-ranges = <0x42000000 0x40 0x00 0x40 0x00 0x40 0x00 0x00 0x42000000 0x79000000 0x00 0x79000000 0x00 0x800000>; > > ib-ranges = <0x42000000 0x40 0x00 0x40 0x00 0x40 0x00 0x00 0x00 0x79000000 0x00 0x79000000 0x00 0x800000>; > > ib-ranges-ep = <0x2000000 0x00 0x00 0x00 0x00 0x00 0x400000 0x2000000 0x00 0x00 0x00 0x00 0x00 0x400000 0x2000000 0x00 0x79000000 0x00 0x79000000 0x00 0x100000>; > > interrupts = <0x00 0x10 0x04>; > > @@ -547,7 +547,7 @@ > > reg = <0x00 0x1f510000 0x00 0x10000 0xc0 0xd0000000 0x00 0x200000 0x00 0x79e00000 0x00 0x2000000 0x00 0x79000000 0x00 0x800000>; > > reg-names = "csr\0cfg\0msi_gen\0msi_term"; > > ranges = <0x1000000 0x00 0x00 0xc0 0x10000000 0x00 0x10000 0x2000000 0x00 0x30000000 0xc1 0x30000000 0x00 0x80000000>; > > - dma-ranges = 
<0x42000000 0x40 0x00 0x40 0x00 0x40 0x00 0x00 0x00 0x79000000 0x00 0x79000000 0x00 0x800000>; > > + dma-ranges = <0x42000000 0x40 0x00 0x40 0x00 0x40 0x00 0x00 0x42000000 0x79000000 0x00 0x79000000 0x00 0x800000>; > > ib-ranges = <0x42000000 0x40 0x00 0x40 0x00 0x40 0x00 0x00 0x00 0x79000000 0x00 0x79000000 0x00 0x800000>; > > ib-ranges-ep = <0x2000000 0x00 0x00 0x00 0x00 0x00 0x400000 0x2000000 0x00 0x00 0x00 0x00 0x00 0x400000 0x2000000 0x00 0x79000000 0x00 0x79000000 0x00 0x100000>; > > interrupts = <0x00 0x10 0x04>; > > > > And that failed to boot with a 5.17-rc3. Since dma-ranges was > > previously identical to ib-ranges, I also tried making the same change > > to ib-ranges, but with no success. > > Failed to boot at all or just PCIe still didn't work causing boot to > eventually fail? Sorry, I mean PCIe still didn't work, here's the log: http://dannf.org/bugs/m400-tweaked_dtb.log (unmodified kernel source w/ above dtb change) > 'ib-ranges' is unknown to the kernel, so the firmware > is using it somehow? > > You also need to revert the first fix for PCIe to work. Oh, OK. I misunderstood. I tried reverting commit 6dce5aa59e0b "PCI: xgene: Use inbound resources for setup" along with a dtb with the dma-ranges change in the diff above, but PCIe still didn't work. Here's the log: http://dannf.org/bugs/m400-6dce5aa5_reverted+tweaked_dtb.log -dann > > > > While we shouldn't break existing DTs, the moonshot DT doesn't use > > > what's documented upstream. There are multiple differences compared to > > > what's documented. Is upstream supposed to support upstream DTs, > > > downstream DTs, and ACPI for XGene which is an abandoned platform with > > > only a handful of users? > > > > That's a fair question, though it's one of a policy, and I feel I'd be > > overstepping by weighing in. I suppose one option I have is to try > > and create and upstream a dts for these systems and modify our > > boot.scr to always load that over the one provided by firmware. 
While > > we do have some of these systems in production, they are being retired > > and replaced with newer kit over time, and it's possible we'll never > > need to upgrade them to a modern kernel. > > > > -dann ^ permalink raw reply [flat|nested] 27+ messages in thread
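To see why the size limit in the first fix matters for the m400 DT, the region selection can be modelled in user space. This is a rough sketch, not the driver source: only the region 0 condition (SZ_1T before the patch, SZ_4G after) is taken from the patch at the top of this thread. The conditions for regions 1 and 2 are recalled from the driver and should be treated as assumptions, and region0_limit is a hypothetical parameter added here so both variants can be compared.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SZ_1K	(1ULL << 10)
#define SZ_1M	(1ULL << 20)
#define SZ_16M	(1ULL << 24)
#define SZ_4G	(1ULL << 32)
#define SZ_1T	(1ULL << 40)

/*
 * Model of xgene_pcie_select_ib_reg(): pick the first free inbound
 * region whose size window fits. Returns the region index or -1.
 * region0_limit is SZ_1T (before the patch) or SZ_4G (after it).
 */
int select_ib_reg(uint8_t *mask, uint64_t size, uint64_t region0_limit)
{
	assert(mask != NULL);

	/* Region 1: small windows (thresholds are an assumption). */
	if (size > 4 && size < SZ_16M && !(*mask & (1u << 1))) {
		*mask |= (1u << 1);
		return 1;
	}

	/* Region 0: the condition changed by the patch in this thread. */
	if (size > SZ_1K && size < region0_limit && !(*mask & (1u << 0))) {
		*mask |= (1u << 0);
		return 0;
	}

	/* Region 2: large windows (thresholds are an assumption). */
	if (size > SZ_1M && size < SZ_1T && !(*mask & (1u << 2))) {
		*mask |= (1u << 2);
		return 2;
	}

	return -1;
}
```

With these assumptions, the 256 GiB m400 window (size cells 0x40 0x00, i.e. 0x4000000000) lands in region 0 when the limit is SZ_1T and in region 2 when it is SZ_4G, while the 8 MiB window (0x800000) takes region 1 in both cases. That would be consistent with the remark that selection needs something besides size to key off of.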
* Re: [PATCH] PCI: xgene: Fix IB window setup 2022-02-11 2:16 ` dann frazier @ 2022-02-21 11:50 ` Thorsten Leemhuis -1 siblings, 0 replies; 27+ messages in thread From: Thorsten Leemhuis @ 2022-02-21 11:50 UTC (permalink / raw) To: dann frazier, Rob Herring Cc: Toan Le, Lorenzo Pieralisi, Krzysztof Wilczyński, Bjorn Helgaas, Andrew Murray, Stéphane Graber, stable, PCI, linux-arm-kernel, linux-kernel, regressions Hi, this is your Linux kernel regression tracker speaking. Top-posting for once, to make this easily accessible to everyone. What's the status of this regression and getting it fixed? It looks like there was quite some progress for a while, but then things seem to have come to a halt ten days ago. Could anyone please provide a status update? Ciao, Thorsten (wearing his 'Linux kernel regression tracker' hat) P.S.: As a Linux kernel regression tracker I'm getting a lot of reports on my table. I can only look briefly into most of them. Unfortunately therefore I sometimes will get things wrong or miss something important. I hope that's not the case here; if you think it is, don't hesitate to tell me about it in a public reply, that's in everyone's interest. #regzbot poke On 11.02.22 03:16, dann frazier wrote: > On Tue, Feb 08, 2022 at 08:34:45AM -0600, Rob Herring wrote: >> On Mon, Feb 7, 2022 at 7:19 PM dann frazier <dann.frazier@canonical.com> wrote: >>> >>> On Mon, Feb 07, 2022 at 10:09:31AM -0600, Rob Herring wrote: >>>> On Sat, Feb 5, 2022 at 3:13 PM dann frazier <dann.frazier@canonical.com> wrote: >>>>> >>>>> On Sat, Feb 5, 2022 at 9:05 AM Rob Herring <robh@kernel.org> wrote: >>>>>> >>>>>> On Fri, Feb 4, 2022 at 5:01 PM dann frazier <dann.frazier@canonical.com> wrote: >>>>>>> >>>>>>> On Mon, Nov 29, 2021 at 11:36:37AM -0600, Rob Herring wrote: >>>>>>>> Commit 6dce5aa59e0b ("PCI: xgene: Use inbound resources for setup") >>>>>>>> broke PCI support on XGene.
The cause is the IB resources are now sorted >>>>>>>> in address order instead of being in DT dma-ranges order. The result is >>>>>>>> which inbound registers are used for each region are swapped. I don't >>>>>>>> know the details about this h/w, but it appears that IB region 0 >>>>>>>> registers can't handle a size greater than 4GB. In any case, limiting >>>>>>>> the size for region 0 is enough to get back to the original assignment >>>>>>>> of dma-ranges to regions. >>>>>>> >>>>>>> hey Rob! >>>>>>> >>>>>>> I've been seeing a panic on HP Moonshoot m400 cartridges (X-Gene1) - >>>>>>> only during network installs - that I also bisected down to commit >>>>>>> 6dce5aa59e0b ("PCI: xgene: Use inbound resources for setup"). I was >>>>>>> hoping that this patch that fixed the issue on Stéphane's X-Gene2 >>>>>>> system would also fix my issue, but no luck. In fact, it seems to just >>>>>>> makes it fail differently. Reverting both patches is required to get a >>>>>>> v5.17-rc kernel to boot. >>>>>>> >>>>>>> I've collected the following logs - let me know if anything else would >>>>>>> be useful. >>>>>>> >>>>>>> 1) v5.17-rc2+ (unmodified): >>>>>>> http://dannf.org/bugs/m400-no-reverts.log >>>>>>> Note that the mlx4 driver fails initialization. >>>>>>> >>>>>>> 2) v5.17-rc2+, w/o the commit that fixed Stéphane's system: >>>>>>> http://dannf.org/bugs/m400-xgene2-fix-reverted.log >>>>>>> Note the mlx4 MSI-X timeout, and later panic. >>>>>>> >>>>>>> 3) v5.17-rc2+, w/ both commits reverted (works) >>>>>>> http://dannf.org/bugs/m400-both-reverted.log >>>>>> >>>>>> The ranges and dma-ranges addresses don't appear to match up with any >>>>>> upstream dts files. Can you send me the DT? >>>>> >>>>> Sure: http://dannf.org/bugs/fdt >>>> >>>> The first fix certainly is a problem. It's going to need something >>>> besides size to key off of (originally it was dependent on order of >>>> dma-ranges entries). 
>>>> >>>> The 2nd issue is the 'dma-ranges' has a second entry that is now ignored: >>>> >>>> dma-ranges = <0x42000000 0x40 0x00 0x40 0x00 0x40 0x00 0x00>, <0x00 >>>> 0x79000000 0x00 0x79000000 0x00 0x800000>; >>>> >>>> Based on the flags (3rd addr cell: 0x0), we have an inbound config >>>> space which the kernel now ignores because inbound config space >>>> accesses make no sense. But clearly some setup is needed. Upstream, in >>>> contrast, sets up a memory range that includes this region, so the >>>> setup does happen: >>>> >>>> <0x42000000 0x00 0x00000000 0x00 0x00000000 0x80 0x00000000> >>>> >>>> Minimally, I suspect it will work if you change dma-ranges 2nd entry to: >>>> >>>> <0x42000000 0x79000000 0x00 0x79000000 0x00 0x800000> >>> >>> Thanks for looking into this Rob. I tried to test that theory, but it >>> didn't seem to work. This is what I tried: >>> >>> --- m400.dts 2022-02-07 20:16:44.840475323 +0000 >>> +++ m400.dts.dmaonly 2022-02-08 00:17:54.097132000 +0000 >>> @@ -446,7 +446,7 @@ >>> reg = <0x00 0x1f2b0000 0x00 0x10000 0xe0 0xd0000000 0x00 0x200000 0x00 0x79e00000 0x00 0x2000000 0x00 0x79000000 0x00 0x800000>; >>> reg-names = "csr\0cfg\0msi_gen\0msi_term"; >>> ranges = <0x1000000 0x00 0x00 0xe0 0x10000000 0x00 0x10000 0x2000000 0x00 0x30000000 0xe1 0x30000000 0x00 0x80000000>; >>> - dma-ranges = <0x42000000 0x40 0x00 0x40 0x00 0x40 0x00 0x00 0x00 0x79000000 0x00 0x79000000 0x00 0x800000>; >>> + dma-ranges = <0x42000000 0x40 0x00 0x40 0x00 0x40 0x00 0x00 0x42000000 0x79000000 0x00 0x79000000 0x00 0x800000>; >>> ib-ranges = <0x42000000 0x40 0x00 0x40 0x00 0x40 0x00 0x00 0x00 0x79000000 0x00 0x79000000 0x00 0x800000>; >>> ib-ranges-ep = <0x2000000 0x00 0x00 0x00 0x00 0x00 0x400000 0x2000000 0x00 0x00 0x00 0x00 0x00 0x400000 0x2000000 0x00 0x79000000 0x00 0x79000000 0x00 0x100000>; >>> interrupts = <0x00 0x10 0x04>; >>> @@ -471,7 +471,7 @@ >>> reg = <0x00 0x1f2c0000 0x00 0x10000 0xd0 0xd0000000 0x00 0x200000 0x00 0x79e00000 0x00 0x2000000 0x00 
0x79000000 0x00 0x800000>; >>> reg-names = "csr\0cfg\0msi_gen\0msi_term"; >>> ranges = <0x1000000 0x00 0x00 0xd0 0x10000000 0x00 0x10000 0x2000000 0x00 0x30000000 0xd1 0x30000000 0x00 0x80000000>; >>> - dma-ranges = <0x42000000 0x40 0x00 0x40 0x00 0x40 0x00 0x00 0x00 0x79000000 0x00 0x79000000 0x00 0x800000>; >>> + dma-ranges = <0x42000000 0x40 0x00 0x40 0x00 0x40 0x00 0x00 0x42000000 0x79000000 0x00 0x79000000 0x00 0x800000>; >>> ib-ranges = <0x42000000 0x40 0x00 0x40 0x00 0x40 0x00 0x00 0x00 0x79000000 0x00 0x79000000 0x00 0x800000>; >>> ib-ranges-ep = <0x2000000 0x00 0x00 0x00 0x00 0x00 0x400000 0x2000000 0x00 0x00 0x00 0x00 0x00 0x400000 0x2000000 0x00 0x79000000 0x00 0x79000000 0x00 0x100000>; >>> interrupts = <0x00 0x10 0x04>; >>> @@ -496,7 +496,7 @@ >>> reg = <0x00 0x1f2d0000 0x00 0x10000 0x90 0xd0000000 0x00 0x200000 0x00 0x79e00000 0x00 0x2000000 0x00 0x79000000 0x00 0x800000>; >>> reg-names = "csr\0cfg\0msi_gen\0msi_term"; >>> ranges = <0x1000000 0x00 0x00 0x90 0x10000000 0x00 0x10000 0x2000000 0x00 0x30000000 0x91 0x30000000 0x00 0x80000000>; >>> - dma-ranges = <0x42000000 0x40 0x00 0x40 0x00 0x40 0x00 0x00 0x00 0x79000000 0x00 0x79000000 0x00 0x800000>; >>> + dma-ranges = <0x42000000 0x40 0x00 0x40 0x00 0x40 0x00 0x00 0x42000000 0x79000000 0x00 0x79000000 0x00 0x800000>; >>> ib-ranges = <0x42000000 0x40 0x00 0x40 0x00 0x40 0x00 0x00 0x00 0x79000000 0x00 0x79000000 0x00 0x800000>; >>> ib-ranges-ep = <0x2000000 0x00 0x00 0x00 0x00 0x00 0x400000 0x2000000 0x00 0x00 0x00 0x00 0x00 0x400000 0x2000000 0x00 0x79000000 0x00 0x79000000 0x00 0x100000>; >>> interrupts = <0x00 0x10 0x04>; >>> @@ -522,7 +522,7 @@ >>> reg = <0x00 0x1f500000 0x00 0x10000 0xa0 0xd0000000 0x00 0x200000 0x00 0x79e00000 0x00 0x2000000 0x00 0x79000000 0x00 0x800000>; >>> reg-names = "csr\0cfg\0msi_gen\0msi_term"; >>> ranges = <0x2000000 0x00 0x30000000 0xa1 0x30000000 0x00 0x80000000>; >>> - dma-ranges = <0x42000000 0x40 0x00 0x40 0x00 0x40 0x00 0x00 0x00 0x79000000 0x00 0x79000000 0x00 
0x800000>; >>> + dma-ranges = <0x42000000 0x40 0x00 0x40 0x00 0x40 0x00 0x00 0x42000000 0x79000000 0x00 0x79000000 0x00 0x800000>; >>> ib-ranges = <0x42000000 0x40 0x00 0x40 0x00 0x40 0x00 0x00 0x00 0x79000000 0x00 0x79000000 0x00 0x800000>; >>> ib-ranges-ep = <0x2000000 0x00 0x00 0x00 0x00 0x00 0x400000 0x2000000 0x00 0x00 0x00 0x00 0x00 0x400000 0x2000000 0x00 0x79000000 0x00 0x79000000 0x00 0x100000>; >>> interrupts = <0x00 0x10 0x04>; >>> @@ -547,7 +547,7 @@ >>> reg = <0x00 0x1f510000 0x00 0x10000 0xc0 0xd0000000 0x00 0x200000 0x00 0x79e00000 0x00 0x2000000 0x00 0x79000000 0x00 0x800000>; >>> reg-names = "csr\0cfg\0msi_gen\0msi_term"; >>> ranges = <0x1000000 0x00 0x00 0xc0 0x10000000 0x00 0x10000 0x2000000 0x00 0x30000000 0xc1 0x30000000 0x00 0x80000000>; >>> - dma-ranges = <0x42000000 0x40 0x00 0x40 0x00 0x40 0x00 0x00 0x00 0x79000000 0x00 0x79000000 0x00 0x800000>; >>> + dma-ranges = <0x42000000 0x40 0x00 0x40 0x00 0x40 0x00 0x00 0x42000000 0x79000000 0x00 0x79000000 0x00 0x800000>; >>> ib-ranges = <0x42000000 0x40 0x00 0x40 0x00 0x40 0x00 0x00 0x00 0x79000000 0x00 0x79000000 0x00 0x800000>; >>> ib-ranges-ep = <0x2000000 0x00 0x00 0x00 0x00 0x00 0x400000 0x2000000 0x00 0x00 0x00 0x00 0x00 0x400000 0x2000000 0x00 0x79000000 0x00 0x79000000 0x00 0x100000>; >>> interrupts = <0x00 0x10 0x04>; >>> >>> And that failed to boot with a 5.17-rc3. Since dma-ranges was >>> previously identical to ib-ranges, I also tried making the same change >>> to ib-ranges, but with no success. >> >> Failed to boot at all or just PCIe still didn't work causing boot to >> eventually fail? > > Sorry, I mean PCIe still didn't work, here's the log: > http://dannf.org/bugs/m400-tweaked_dtb.log > (unmodified kernel source w/ above dtb change) > >> 'ib-ranges' is unknown to the kernel, so the firmware >> is using it somehow? >> >> You also need to revert the first fix for PCIe to work. > > Oh, OK. I misunderstood. 
I tried reverting commit 6dce5aa59e0b "PCI: > xgene: Use inbound resources for setup" along with a dtb with the > dma-ranges change in the diff above, but PCIe still didn't > work. Here's the log: > > http://dannf.org/bugs/m400-6dce5aa5_reverted+tweaked_dtb.log > > -dann > >> >>>> While we shouldn't break existing DTs, the moonshot DT doesn't use >>>> what's documented upstream. There are multiple differences compared to >>>> what's documented. Is upstream supposed to support upstream DTs, >>>> downstream DTs, and ACPI for XGene which is an abandoned platform with >>>> only a handful of users? >>> >>> That's a fair question, though it's one of a policy, and I feel I'd be >>> overstepping by weighing in. I suppose one option I have is to try >>> and create and upstream a dts for these systems and modify our >>> boot.scr to always load that over the one provided by firmware. While >>> we do have some of these systems in production, they are being retired >>> and replaced with newer kit over time, and it's possible we'll never >>> need to upgrade them to a modern kernel. >>> >>> -dann ^ permalink raw reply [flat|nested] 27+ messages in thread
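The flags discussion quoted above (memory space versus config space in a dma-ranges entry) comes down to the leading cell of each entry's PCI address, called phys.hi in the PCI bus binding. As a minimal sketch, assuming the standard phys.hi layout (address-space code in bits 25:24, prefetchable flag in bit 30), the cells seen in this thread decode as follows. The helper names are hypothetical and are not part of the xgene driver.

```c
#include <stdint.h>

/* Address-space codes held in bits 25:24 ("ss") of the phys.hi cell. */
enum pci_space {
	PCI_SPACE_CONFIG = 0,	/* configuration space */
	PCI_SPACE_IO     = 1,	/* I/O space */
	PCI_SPACE_MEM32  = 2,	/* 32-bit memory space */
	PCI_SPACE_MEM64  = 3,	/* 64-bit memory space */
};

/* Hypothetical helper: extract the address-space code from phys.hi. */
static enum pci_space phys_hi_space(uint32_t phys_hi)
{
	return (enum pci_space)((phys_hi >> 24) & 0x3);
}

/* Hypothetical helper: bit 30 ("p") marks the region as prefetchable. */
static int phys_hi_prefetchable(uint32_t phys_hi)
{
	return (int)((phys_hi >> 30) & 0x1);
}

/* Hypothetical helper: printable name for an address-space code. */
static const char *space_name(enum pci_space space)
{
	switch (space) {
	case PCI_SPACE_CONFIG:
		return "config";
	case PCI_SPACE_IO:
		return "io";
	case PCI_SPACE_MEM32:
		return "mem32";
	default:
		return "mem64";
	}
}
```

With these helpers, 0x42000000 decodes as prefetchable 32-bit memory space, while a leading cell of 0x00 decodes as config space. That matches the observation in the thread that the firmware-provided second dma-ranges entry looks like an inbound config-space window, and that rewriting its leading cell to 0x42000000 turns it into a memory window.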
* Re: [PATCH] PCI: xgene: Fix IB window setup
  2022-02-04 23:01   ` dann frazier
@ 2022-02-06  9:52     ` Thorsten Leemhuis
  -1 siblings, 0 replies; 27+ messages in thread
From: Thorsten Leemhuis @ 2022-02-06  9:52 UTC (permalink / raw)
  To: dann frazier, Rob Herring
  Cc: Toan Le, Lorenzo Pieralisi, Krzysztof Wilczyński, Bjorn Helgaas,
	Andrew Murray, Stéphane Graber, stable, linux-pci,
	linux-arm-kernel, linux-kernel, regressions

[TLDR: I'm adding the regression report below to regzbot, the Linux
kernel regression tracking bot; nearly all text you find below is
compiled from a few template paragraphs you likely have encountered
already from mails similar to this one.]

Hi, this is your Linux kernel regression tracker speaking.

CCing the regression mailing list, as it should be in the loop for all
regressions, as explained here:
https://www.kernel.org/doc/html/latest/admin-guide/reporting-issues.html

On 05.02.22 00:01, dann frazier wrote:
> On Mon, Nov 29, 2021 at 11:36:37AM -0600, Rob Herring wrote:
>> Commit 6dce5aa59e0b ("PCI: xgene: Use inbound resources for setup")
>> broke PCI support on XGene. The cause is the IB resources are now sorted
>> in address order instead of being in DT dma-ranges order. The result is
>> which inbound registers are used for each region are swapped. I don't
>> know the details about this h/w, but it appears that IB region 0
>> registers can't handle a size greater than 4GB. In any case, limiting
>> the size for region 0 is enough to get back to the original assignment
>> of dma-ranges to regions.
>
> hey Rob!
>
> I've been seeing a panic on HP Moonshot m400 cartridges (X-Gene1) -
> only during network installs - that I also bisected down to commit
> 6dce5aa59e0b ("PCI: xgene: Use inbound resources for setup"). I was
> hoping that this patch that fixed the issue on Stéphane's X-Gene2
> system would also fix my issue, but no luck. In fact, it seems to just
> make it fail differently. Reverting both patches is required to get a
> v5.17-rc kernel to boot.
>
> I've collected the following logs - let me know if anything else would
> be useful.
>
> 1) v5.17-rc2+ (unmodified):
>    http://dannf.org/bugs/m400-no-reverts.log
>    Note that the mlx4 driver fails initialization.
>
> 2) v5.17-rc2+, w/o the commit that fixed Stéphane's system:
>    http://dannf.org/bugs/m400-xgene2-fix-reverted.log
>    Note the mlx4 MSI-X timeout, and later panic.
>
> 3) v5.17-rc2+, w/ both commits reverted (works)
>    http://dannf.org/bugs/m400-both-reverted.log

Thanks for the report.

To be sure this issue doesn't fall through the cracks unnoticed, I'm
adding it to regzbot, my Linux kernel regression tracking bot:

#regzbot ^introduced c7a75d07827a1f33d
#regzbot title Follow-up error for the commit fixing "PCIe regression on
APM Merlin (aarch64 dev platform) preventing NVME initialization"
#regzbot ignore-activity

Reminder for developers: when fixing the issue, please add a 'Link:'
tag pointing to the report (the mail quoted above) using
lore.kernel.org/r/, as explained in
'Documentation/process/submitting-patches.rst' and
'Documentation/process/5.Posting.rst', as this allows the bot to
associate any fixes posted or committed with the report, to always show
the current status of things, and to automatically close the issue when
the fix hits the right tree.

I'm sending this to everyone that got the initial report, to make them
aware of the tracking. I also hope that messages like this motivate
people to directly get at least the regression mailing list and ideally
even regzbot involved when dealing with regressions, as messages like
this wouldn't be needed then.

Don't worry, I'll send further messages wrt this regression just to
the lists (with a tag in the subject so people can filter them away), if
they are relevant just for regzbot. With a bit of luck no such messages
will be needed anyway.

Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)

P.S.: As the Linux kernel's regression tracker I'm getting a lot of
reports on my table. I can only look briefly into most of them and lack
knowledge about most of the areas they concern. I thus unfortunately
will sometimes get things wrong or miss something important. I hope
that's not the case here; if you think it is, don't hesitate to tell me
in a public reply, it's in everyone's interest to set the public record
straight.

-- 
Additional information about regzbot:

If you want to know more about regzbot, check out its web-interface, the
getting started guide, and the reference documentation:

https://linux-regtracking.leemhuis.info/regzbot/
https://gitlab.com/knurd42/regzbot/-/blob/main/docs/getting_started.md
https://gitlab.com/knurd42/regzbot/-/blob/main/docs/reference.md

The last two documents will explain how you can interact with regzbot
yourself if you want to.

Hint for reporters: when reporting a regression it's in your interest to
CC the regression list and tell regzbot about the issue, as that ensures
the regression makes it onto the radar of the Linux kernel's regression
tracker -- that's in your interest, as it ensures your report won't fall
through the cracks unnoticed.

Hint for developers: you normally don't need to care about regzbot once
it's involved. Fix the issue as you normally would, just remember to
include a 'Link:' tag in the patch descriptions pointing to all reports
about the issue. This has been expected from developers even before
regzbot showed up for reasons explained in
'Documentation/process/submitting-patches.rst' and
'Documentation/process/5.Posting.rst'.

^ permalink raw reply	[flat|nested] 27+ messages in thread
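The patch description quoted above says that limiting the size accepted by IB region 0 restores the original assignment of dma-ranges to regions. As a minimal sketch, assuming only the size test visible in the patch hunk (the ib_reg_mask bookkeeping and the checks for the other regions are deliberately left out), the effect of moving the upper bound from SZ_1T to SZ_4G can be modelled like this. The helper ib_region0_accepts() is hypothetical and the SZ_* values are spelled out locally rather than taken from the kernel's headers.

```c
#include <stdbool.h>
#include <stdint.h>

/* Size constants written out locally for a userspace sketch. */
#define SZ_1K	0x400ULL
#define SZ_4G	0x100000000ULL
#define SZ_1T	0x10000000000ULL

/*
 * Hypothetical helper: models only the size window that the patch hunk
 * applies to IB region 0. upper_limit is SZ_1T before the patch and
 * SZ_4G after it.
 */
static bool ib_region0_accepts(uint64_t size, uint64_t upper_limit)
{
	return size > SZ_1K && size < upper_limit;
}
```

For the m400 device tree discussed in this thread, the first dma-ranges entry has size cells 0x40 0x00, that is 0x4000000000 bytes (256 GiB). ib_region0_accepts(0x4000000000ULL, SZ_1T) is true, while ib_region0_accepts(0x4000000000ULL, SZ_4G) is false, so after the patch that window can no longer land in region 0. This is consistent with the later remark in the thread that the selection needs something besides size to key off of. The 8 MiB entry (0x800000) passes the test under either limit.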
* Re: [PATCH] PCI: xgene: Fix IB window setup #forregzbot
  2022-02-06  9:52     ` Thorsten Leemhuis
  (?)
@ 2022-03-07 14:25     ` Thorsten Leemhuis
  -1 siblings, 0 replies; 27+ messages in thread
From: Thorsten Leemhuis @ 2022-03-07 14:25 UTC (permalink / raw)
  To: regressions

TWIMC: this mail is primarily sent for documentation purposes and for
regzbot, my Linux kernel regression tracking bot. These mails usually
contain '#forregzbot' in the subject, to make them easy to spot and filter.

Forwarding a few details from here:
https://lore.kernel.org/regressions/CAL_JsqLHun+X4jMwTbVMmjjETfbP73j52XCwWBj9MJCkpQ41mA@mail.gmail.com/

#regzbot introduced: 6dce5aa59e0b
#regzbot title: c7a75d07827a fixed 6dce5aa59e0b for XGene2, but that
*further* broke XGene1
#regzbot backburner: needs more debugging and number of people who care
likely small anyway

On 06.02.22 10:52, Thorsten Leemhuis wrote:
> [TLDR: I'm adding the regression report below to regzbot, the Linux
> kernel regression tracking bot; nearly all text you find below is
> compiled from a few template paragraphs you likely have encountered
> already from mails similar to this one.]
>
> Hi, this is your Linux kernel regression tracker speaking.
>
> CCing the regression mailing list, as it should be in the loop for all
> regressions, as explained here:
> https://www.kernel.org/doc/html/latest/admin-guide/reporting-issues.html
>
> On 05.02.22 00:01, dann frazier wrote:
>> On Mon, Nov 29, 2021 at 11:36:37AM -0600, Rob Herring wrote:
>>> Commit 6dce5aa59e0b ("PCI: xgene: Use inbound resources for setup")
>>> broke PCI support on XGene. The cause is the IB resources are now sorted
>>> in address order instead of being in DT dma-ranges order. The result is
>>> which inbound registers are used for each region are swapped. I don't
>>> know the details about this h/w, but it appears that IB region 0
>>> registers can't handle a size greater than 4GB. In any case, limiting
>>> the size for region 0 is enough to get back to the original assignment
>>> of dma-ranges to regions.
>>
>> hey Rob!
>>
>> I've been seeing a panic on HP Moonshot m400 cartridges (X-Gene1) -
>> only during network installs - that I also bisected down to commit
>> 6dce5aa59e0b ("PCI: xgene: Use inbound resources for setup"). I was
>> hoping that this patch that fixed the issue on Stéphane's X-Gene2
>> system would also fix my issue, but no luck. In fact, it seems to just
>> make it fail differently. Reverting both patches is required to get a
>> v5.17-rc kernel to boot.
>>
>> I've collected the following logs - let me know if anything else would
>> be useful.
>>
>> 1) v5.17-rc2+ (unmodified):
>>    http://dannf.org/bugs/m400-no-reverts.log
>>    Note that the mlx4 driver fails initialization.
>>
>> 2) v5.17-rc2+, w/o the commit that fixed Stéphane's system:
>>    http://dannf.org/bugs/m400-xgene2-fix-reverted.log
>>    Note the mlx4 MSI-X timeout, and later panic.
>>
>> 3) v5.17-rc2+, w/ both commits reverted (works)
>>    http://dannf.org/bugs/m400-both-reverted.log
>
> Thanks for the report.
>
> To be sure this issue doesn't fall through the cracks unnoticed, I'm
> adding it to regzbot, my Linux kernel regression tracking bot:
>
> #regzbot ^introduced c7a75d07827a1f33d
> #regzbot title Follow-up error for the commit fixing "PCIe regression on
> APM Merlin (aarch64 dev platform) preventing NVME initialization"
> #regzbot ignore-activity
>
> Reminder for developers: when fixing the issue, please add a 'Link:'
> tag pointing to the report (the mail quoted above) using
> lore.kernel.org/r/, as explained in
> 'Documentation/process/submitting-patches.rst' and
> 'Documentation/process/5.Posting.rst', as this allows the bot to
> associate any fixes posted or committed with the report, to always show
> the current status of things, and to automatically close the issue when
> the fix hits the right tree.
>
> I'm sending this to everyone that got the initial report, to make them
> aware of the tracking. I also hope that messages like this motivate
> people to directly get at least the regression mailing list and ideally
> even regzbot involved when dealing with regressions, as messages like
> this wouldn't be needed then.
>
> Don't worry, I'll send further messages wrt this regression just to
> the lists (with a tag in the subject so people can filter them away), if
> they are relevant just for regzbot. With a bit of luck no such messages
> will be needed anyway.
>
> Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
>
> P.S.: As the Linux kernel's regression tracker I'm getting a lot of
> reports on my table. I can only look briefly into most of them and lack
> knowledge about most of the areas they concern. I thus unfortunately
> will sometimes get things wrong or miss something important. I hope
> that's not the case here; if you think it is, don't hesitate to tell me
> in a public reply, it's in everyone's interest to set the public record
> straight.
>

^ permalink raw reply	[flat|nested] 27+ messages in thread