* iMX6 PCIe MSI issues
@ 2018-11-23 22:17 Robert Hancock
2018-11-26 1:53 ` Richard Zhu
2018-11-26 16:31 ` Fabio Estevam
0 siblings, 2 replies; 11+ messages in thread
From: Robert Hancock @ 2018-11-23 22:17 UTC (permalink / raw)
To: linux-arm-kernel
I am working with a custom FPGA PCI Express endpoint connected to an NXP
iMX6D processor running the 4.19.2 kernel. It seems happy using INTx
interrupts but when trying to enable MSI the device driver is not
receiving any interrupts.
From some register poking I have figured out:
-the MSI address set on the PCIe device is correctly set in the iMX MSI
controller's MSI Controller Address register (0x1ffc820)
-the interrupt vectors are enabled in the MSI controller's Interrupt
Enable register (0x1ffc828)
-the interrupt vectors are not masked in the MSI controller's Interrupt
Mask register (0x1ffc82c)
-The MSI controller's Interrupt Status register (0x1ffc830) shows that
the requested interrupt vectors are pending
-In the ARM GIC, vector 152 (for msi_ctrl_int) is enabled in the IS
enable register (0x00a01110), but not set in the IS pending (0x00a01210)
or IS active (0x00a01310) registers
-Vector 152 is not masked in the GPC interrupt mask (0x00a01310)
-Vector 152 is not active in the GPC interrupt status (0x00a01310)
So it appears the MSI controller is receiving and recognizing the MSI
from the device, but the interrupt is not making it into the GIC for
some reason. If I manually set vector 152 to pending in the GIC, the
dw_handle_msi_irq handler in pci-designware-host.c does get called along
with the interrupt handler(s) for the PCIe device, so it appears the
chain from that point on is working:
# devmem 0x00a01210 32 0x1000000
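The addresses and the 0x1000000 value used above can be cross-checked with a little arithmetic. The sketch below assumes the GIC distributor sits at 0x00a01000 (inferred from the addresses quoted in this message, not taken from a datasheet) and uses the standard GIC distributor bank offsets; the helper name gic_reg is made up for illustration.

```python
# Assumption: GIC distributor base on this SoC, inferred from the
# register addresses quoted in the message above.
GICD_BASE = 0x00A01000

# Standard GIC distributor bank offsets (one bit per interrupt ID,
# 32 interrupt IDs per 32-bit register).
GICD_ISENABLER = 0x100  # set-enable
GICD_ISPENDR = 0x200    # set-pending
GICD_ISACTIVER = 0x300  # set-active


def gic_reg(bank_offset, irq_id):
    """Return (register address, bit mask) covering irq_id in a bank."""
    word = irq_id // 32   # which 32-bit register in the bank
    bit = irq_id % 32     # which bit inside that register
    return GICD_BASE + bank_offset + 4 * word, 1 << bit


if __name__ == "__main__":
    banks = (("enable", GICD_ISENABLER),
             ("pending", GICD_ISPENDR),
             ("active", GICD_ISACTIVER))
    for name, offset in banks:
        addr, mask = gic_reg(offset, 152)
        print(f"IS {name}: address 0x{addr:08x}, mask 0x{mask:x}")
```

For vector 152 this gives register index 4 and bit 24 in each bank, which matches the 0x00a01110, 0x00a01210 and 0x00a01310 addresses and the 0x1000000 mask written with devmem.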
I found someone else reporting this in 2014 with an unknown kernel
version on the NXP forums here, but with no resolution listed there:
https://community.nxp.com/thread/318307
Any ideas on what may be going wrong? My next step may be to try an
older kernel version to see if this got broken at some point.
--
Robert Hancock
Senior Software Developer
SED Systems
Email: hancock at sedsystems.ca
^ permalink raw reply [flat|nested] 11+ messages in thread
* iMX6 PCIe MSI issues
2018-11-23 22:17 iMX6 PCIe MSI issues Robert Hancock
@ 2018-11-26 1:53 ` Richard Zhu
2018-11-26 16:22 ` Robert Hancock
2018-11-26 16:31 ` Fabio Estevam
1 sibling, 1 reply; 11+ messages in thread
From: Richard Zhu @ 2018-11-26 1:53 UTC (permalink / raw)
To: linux-arm-kernel
Hi Robert:
Can you make reference to the following URL?
https://patchwork.ozlabs.org/patch/989802/
It may be helpful.
Best Regards
Richard Zhu
Office: 86-21-28937189
Mobile: 86-13386059786
> -----Original Message-----
> From: Robert Hancock [mailto:hancock at sedsystems.ca]
> Sent: 2018-11-24 6:17
> To: linux-arm-kernel at lists.infradead.org
> Cc: Richard Zhu <hongxing.zhu@nxp.com>; l.stach at pengutronix.de
> Subject: iMX6 PCIe MSI issues
>
> [...]
^ permalink raw reply [flat|nested] 11+ messages in thread
* iMX6 PCIe MSI issues
2018-11-26 1:53 ` Richard Zhu
@ 2018-11-26 16:22 ` Robert Hancock
0 siblings, 0 replies; 11+ messages in thread
From: Robert Hancock @ 2018-11-26 16:22 UTC (permalink / raw)
To: linux-arm-kernel
It doesn't appear that patch has any effect on this issue, because the
dw_handle_msi_irq function in question is never being called at all - I
added some printk messages to verify that.
I then suspected it could be some issue with an interrupt happening
during initialization blocking the processing of further interrupts, but
if I manually ack all of the possible interrupts by writing all ones to
the MSI Interrupt Status register, I can see that the pending interrupts
are cleared, and if new interrupts are raised they show up in that
register again, but vector 152 for PCIe INTD/MSI is still not asserted
in the GIC. Manually asserting it by writing to the GIC pending register
causes the pending interrupts to be handled.
It's like the vector is somehow not hooked up to the GIC or there's some
other register that has to be set to enable the interrupts to actually
be raised that I'm not aware of, but I'm currently at a loss to explain
it.
On 2018-11-25 7:53 p.m., Richard Zhu wrote:
> Hi Robert:
> Can you make reference to the following URL?
>
> https://patchwork.ozlabs.org/patch/989802/
> It may be helpful.
>
> Best Regards
> Richard Zhu
> Office: 86-21-28937189
> Mobile: 86-13386059786
>
> [...]
--
Robert Hancock
Senior Software Developer
SED Systems
Email: hancock at sedsystems.ca
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: iMX6 PCIe MSI issues
2018-11-23 22:17 iMX6 PCIe MSI issues Robert Hancock
@ 2018-11-26 16:31 ` Fabio Estevam
2018-11-26 16:31 ` Fabio Estevam
1 sibling, 0 replies; 11+ messages in thread
From: Fabio Estevam @ 2018-11-26 16:31 UTC (permalink / raw)
To: hancock, Tim Harvey, Trent Piepho
Cc: moderated list:ARM/FREESCALE IMX / MXC ARM ARCHITECTURE, Richard Zhu, Lucas Stach, linux-pci
Adding Trent and Tim (as I think they managed to fix some imx6 MSI
issues)
On Fri, Nov 23, 2018 at 8:17 PM Robert Hancock <hancock@sedsystems.ca> wrote:
> [...]
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: iMX6 PCIe MSI issues
2018-11-26 16:31 ` Fabio Estevam
@ 2018-11-26 17:09 ` Trent Piepho
-1 siblings, 0 replies; 11+ messages in thread
From: Trent Piepho @ 2018-11-26 17:09 UTC (permalink / raw)
To: festevam, hancock, tharvey
Cc: linux-arm-kernel, l.stach, linux-pci, hongxing.zhu
There is a bug that appeared in 4.14 that will result in an MSI getting
dropped if it occurs during or shortly after that/another MSI interrupt
handler is run. Obviously, that means one needs to get at least one
MSI to work in the first place to see the bug!
Robert's description also has the MSI status set in the dwc msi status
register (0x830); that would not be the case for the MSI race.
An interrupt is only passed up to the GIC on a 0->1 transition in the
dwc msi status bit. We see it's a 1 now, but was the GIC interrupt
enabled when the transition happened? It's not said below if that was
checked.
Try clearing the status (write a *1* to the bit to clear it) in the dwc
msi status register, check that it is now zero, and then see if another
MSI causes it to become set, and does that make it to the GIC?
If it does become set, but no irq reaches the GIC, then I have no idea
what is there to stop it. This part of the chip is not documented well.
Also, I think the new irq domain stuff in 4.17 breaks irq accounting for
the GIC chain interrupt (152) to the dwc msi domain. It'll always show
as zero in /proc/interrupts. But I've mostly been working in 4.16 so
I'm not sure about the precise interaction of irq domains and
/proc/interrupts yet.
On Mon, 2018-11-26 at 14:31 -0200, Fabio Estevam wrote:
> Adding Trent and Tim (as I think they managed to fix some imx6 MSI
> issues)
>
> On Fri, Nov 23, 2018 at 8:17 PM Robert Hancock <hancock@sedsystems.ca>
> wrote:
> > [...]
^ permalink raw reply [flat|nested] 11+ messages in thread
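The status-register behaviour Trent describes in the message above (write-1-to-clear bits, and only a 0->1 transition being passed up to the GIC) can be sketched as a toy model. This is only an illustration of that description, not of the real controller: the class, its method names and the gic_irq_enabled flag are all hypothetical.

```python
class MsiStatusModel:
    """Toy model of the described behaviour; NOT the real hardware."""

    def __init__(self, gic_irq_enabled=True):
        self.status = 0                          # stands in for the dwc msi status register (0x830)
        self.gic_irq_enabled = gic_irq_enabled   # hypothetical "GIC side is listening" flag
        self.gic_events = 0                      # times the chained GIC interrupt was signalled

    def msi_arrives(self, vector):
        bit = 1 << vector
        rising_edge = not (self.status & bit)
        self.status |= bit
        # Only a 0->1 transition is forwarded, and only if the GIC
        # interrupt is enabled at the moment the transition happens.
        if rising_edge and self.gic_irq_enabled:
            self.gic_events += 1

    def write_status(self, value):
        # Write-1-to-clear: every 1 bit written clears that status bit.
        self.status &= ~value & 0xFFFFFFFF


if __name__ == "__main__":
    ctrl = MsiStatusModel(gic_irq_enabled=False)
    ctrl.msi_arrives(0)          # transition happens while disabled: edge is lost
    ctrl.gic_irq_enabled = True
    ctrl.msi_arrives(0)          # bit is already 1, so there is no new edge
    print(f"status=0x{ctrl.status:x} gic_events={ctrl.gic_events}")
    ctrl.write_status(0x1)       # ack by writing a 1 to the bit
    ctrl.msi_arrives(0)          # fresh 0->1 transition
    print(f"status=0x{ctrl.status:x} gic_events={ctrl.gic_events}")
```

In this model a status bit can be stuck at 1 with nothing ever signalled to the GIC, and clearing it lets the next MSI through. The clear-and-retry test suggested above checks whether the real hardware behaves the same way.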
* Re: iMX6 PCIe MSI issues
2018-11-26 17:09 ` Trent Piepho
@ 2018-11-26 19:24 ` Robert Hancock
-1 siblings, 0 replies; 11+ messages in thread
From: Robert Hancock @ 2018-11-26 19:24 UTC (permalink / raw)
To: Trent Piepho, festevam, tharvey
Cc: linux-arm-kernel, l.stach, linux-pci, hongxing.zhu
On 2018-11-26 11:09 a.m., Trent Piepho wrote:
> There is a bug that appeared in 4.14 that will result in an MSI getting
> dropped if it occurs during or shortly after that/another MSI interrupt
> handler is run. Obviously, then means one needs to get at least one
> MSI to work in the first place to see the bug!
>
> Robert's description also has MSI status set in dwc msi status register
> (0x830), that would not be the case for the MSI race.
>
> An interrupt is only passed up to the GIC on a 0->1 transition in the
> dwc msi status bit. We see it's a 1 now, but was the GIC interrupt
> enabled when the transition happened? It's not said below if that was
> checked.
>
> Try clearing the status (write a *1* to the bit clear it) in the dwc
> msi status register, check that it is now zero, and then see if another
> MSI causes it to become set, and does that make it to the GIC?
I've tried that (writing ones to the status register, verifying it goes
to zero, raising another interrupt) and it doesn't seem to make it to
the GIC even though the status register has transitioned from zero to
non-zero.
>
> If it does become set, but no irq to the GIC, then I have no idea what
> is there to stop it. This part of the chip is not documented well.
>
> Also, I think the new irq domain stuff in 4.17 breaks irq accounting to
> the GIC chain interrupt (152) to the dwc msi domain. It'll always show
> as zero in /proc/interrupts. But I've mostly been working in 4.16 so
> I'm not sure about the precise interaction of irq domains and
> /proc/interrupts yet.
I'm not actually seeing the MSI interrupt showing up in /proc/interrupts
at all in 4.19. From adding some debug output into the dwc PCIe code, it
appears it's using Linux IRQ 24 as the chaining interrupt, but there's
no entry in /proc/interrupts for either Linux IRQ 24 or GIC vector 152.
Not sure if there is supposed to be or not. It does appear that the
vector isn't masked in the GIC in any case, however, and when I force
the interrupt into the GIC pending register, things seem to happen
properly after that.
>
> On Mon, 2018-11-26 at 14:31 -0200, Fabio Estevam wrote:
>> Adding Trent and Tim (as I think they managed to fix some imx6 MSI
>> issues)
>>
>> [...]
--
Robert Hancock
Senior Software Developer
SED Systems
Phone: (306) 933-1567
Email: hancock@sedsystems.ca
^ permalink raw reply [flat|nested] 11+ messages in thread
* iMX6 PCIe MSI issues
@ 2018-11-26 19:24 ` Robert Hancock
0 siblings, 0 replies; 11+ messages in thread
From: Robert Hancock @ 2018-11-26 19:24 UTC (permalink / raw)
To: linux-arm-kernel

On 2018-11-26 11:09 a.m., Trent Piepho wrote:
> There is a bug that appeared in 4.14 that will result in an MSI getting
> dropped if it occurs during or shortly after that/another MSI interrupt
> handler is run. Obviously, then means one needs to get at least one
> MSI to work in the first place to see the bug!
>
> Robert's description also has MSI status set in dwc msi status register
> (0x830), that would not be the case for the MSI race.
>
> An interrupt is only passed up to the GIC on a 0->1 transition in the
> dwc msi status bit. We see it's a 1 now, but was the GIC interrupt
> enabled when the transition happened? It's not said below if that was
> checked.
>
> Try clearing the status (write a *1* to the bit clear it) in the dwc
> msi status register, check that it is now zero, and then see if another
> MSI causes it to become set, and does that make it to the GIC?

I've tried that (writing ones to the status register, verifying it goes
to zero, raising another interrupt) and it doesn't seem to make it to
the GIC even though the status register has transitioned from zero to
non-zero.

>
> If it does become set, but no irq to the GIC, then I have no idea what
> is there to stop it. This part of the chip is not documented well.
>
> Also, I think the new irq domain stuff in 4.17 breaks irq accounting to
> the GIC chain interrupt (152) to the dwc msi domain. It'll always show
> as zero in /proc/interrupts. But I've mostly been working in 4.16 so
> I'm not sure about the precise interaction of irq domains and
> /proc/interrupts yet.

I'm not actually seeing the MSI interrupt showing up in /proc/interrupts
at all in 4.19. From adding some debug output into the dwc PCIe code, it
appears it's using Linux IRQ 24 as the chaining interrupt, but there's
no entry in /proc/interrupts for either Linux IRQ 24 or GIC vector 152.
Not sure if there is supposed to be or not. It does appear that the
vector isn't masked in the GIC in any case, however, and when I force
the interrupt into the GIC pending register, things seem to happen
properly after that.

>
> On Mon, 2018-11-26 at 14:31 -0200, Fabio Estevam wrote:
>> Adding Trent and Tim (as I think they managed to fix some imx6 MSI
>> issues)
>>
>> On Fri, Nov 23, 2018 at 8:17 PM Robert Hancock <hancock@sedsystems.ca> wrote:
>>>
>>> I am working with a custom FPGA PCI Express endpoint connected to an NXP
>>> iMX6D processor running the 4.19.2 kernel. It seems happy using INTx
>>> interrupts but when trying to enable MSI the device driver is not
>>> receiving any interrupts.
>>>
>>> From some register poking I have figured out:
>>> -the MSI address set on the PCIe device is correctly set in the iMX MSI
>>> controller's MSI Controller Address register (0x1ffc820)
>>> -the interrupt vectors are enabled in the MSI controller's Interrupt
>>> Enable register (0x1ffc828)
>>> -the interrupt vectors are not masked in the MSI controller's Interrupt
>>> Mask register (0x1ffc82c)
>>> -The MSI controller's Interrupt Status register (0x1ffc830) shows that
>>> the requested interrupt vectors are pending
>>> -In the ARM GIC, vector 152 (for msi_ctrl_int) is enabled in the IS
>>> enable register (0x00a01110), but not set in the IS pending (0x00a01210)
>>> or IS active (0x00a01310) registers
>>> -Vector 152 is not masked in the GPC interrupt mask (0x00a01310)
>>> -Vector 152 is not active in the GPC interrupt status (0x00a01310)
>>>
>>> So it appears the MSI controller is receiving and recognizing the MSI
>>> from the device, but the interrupt is not making it into the GIC for
>>> some reason. If I manually set vector 152 to pending in the GIC, the
>>> dw_handle_msi_irq handler in pci-designware-host.c does get called along
>>> with the interrupt handler(s) for the PCIe device, so it appears the
>>> chain from that point on is working:
>>>
>>> # devmem 0x00a01210 32 0x1000000
>>>
>>> I found someone else reporting this in 2014 with an unknown kernel
>>> version on the NXP forums here, but with no resolution listed there:
>>>
>>> https://community.nxp.com/thread/318307
>>>
>>> Any ideas on what may be going wrong? My next step may be to try an
>>> older kernel version to see if this got broken at some point.
>>>
>>> --
>>> Robert Hancock
>>> Senior Software Developer
>>> SED Systems
>>> Email: hancock at sedsystems.ca
>>>
>>> _______________________________________________
>>> linux-arm-kernel mailing list
>>> linux-arm-kernel at lists.infradead.org
>>> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

--
Robert Hancock
Senior Software Developer
SED Systems
Phone: (306) 933-1567
Email: hancock at sedsystems.ca

^ permalink raw reply [flat|nested] 11+ messages in thread
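
For reference, the GIC addresses and the devmem value quoted in this message can be rederived from the interrupt ID. The sketch below is only an illustration: it assumes the GIC distributor sits at 0x00a01000 (inferred from the three addresses quoted in the thread, not stated there) and uses the standard one-bit-per-interrupt set-enable, set-pending and active banks at offsets 0x100, 0x200 and 0x300. The helper name gic_reg_and_mask is made up for this example.

```python
# Assumption: distributor base inferred from the addresses quoted in the
# thread (0x00a01110, 0x00a01210, 0x00a01310); it is not stated there.
GICD_BASE = 0x00A01000

# Distributor bank offsets; each bank holds one bit per interrupt ID.
GICD_ISENABLER = 0x100  # set-enable
GICD_ISPENDR = 0x200    # set-pending
GICD_ISACTIVER = 0x300  # active status


def gic_reg_and_mask(bank_offset, irq_id, base=GICD_BASE):
    """Return (register address, bit mask) for irq_id in a 1-bit-per-ID bank."""
    reg = base + bank_offset + 4 * (irq_id // 32)  # 32 IDs per 32-bit word
    mask = 1 << (irq_id % 32)
    return reg, mask


if __name__ == "__main__":
    banks = (("enable", GICD_ISENABLER),
             ("pending", GICD_ISPENDR),
             ("active", GICD_ISACTIVER))
    for name, offset in banks:
        reg, mask = gic_reg_and_mask(offset, 152)
        print(f"{name:8s} 0x{reg:08x} mask 0x{mask:08x}")
```

For ID 152 this gives word 4, bit 24, so the pending bank entry is 0x00a01210 with mask 0x1000000, which is the address and value used in the devmem command quoted above.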
* Re: iMX6 PCIe MSI issues
2018-11-26 19:24 ` Robert Hancock
@ 2018-11-27 18:53 ` Trent Piepho
-1 siblings, 0 replies; 11+ messages in thread
From: Trent Piepho @ 2018-11-27 18:53 UTC (permalink / raw)
To: festevam, hancock, tharvey
Cc: linux-arm-kernel, l.stach, linux-pci, hongxing.zhu

On Mon, 2018-11-26 at 13:24 -0600, Robert Hancock wrote:
> >
> > Also, I think the new irq domain stuff in 4.17 breaks irq accounting to
> > the GIC chain interrupt (152) to the dwc msi domain. It'll always show
> > as zero in /proc/interrupts. But I've mostly been working in 4.16 so
> > I'm not sure about the precise interaction of irq domains and
> > /proc/interrupts yet.
>
> I'm not actually seeing the MSI interrupt showing up in /proc/interrupts
> at all in 4.19. From adding some debug output into the dwc PCIe code, it
> appears it's using Linux IRQ 24 as the chaining interrupt, but there's
> no entry in /proc/interrupts for either Linux IRQ 24 or GIC vector 152.
> Not sure if there is supposed to be or not. It does appear that the
> vector isn't masked in the GIC in any case, however, and when I force
> the interrupt into the GIC pending register, things seem to happen
> properly after that.

In 4.16, the MSI chaining interrupt does show up in /proc/interrupts and
does increment. It also shows up in trace events.

In 4.17, it no longer appears in /proc/interrupts. Finding the Linux
irq number is non-obvious, as you've seen. It will show up in
/sys/kernel/irq and /sys/kernel/debug/irq/irqs, but the count is always
zero. IMHO, not an improvement.

So if you're using that count in /sys to determine that the GIC irq
never fired, then it's not conclusive. It always reads zero.

But the same problem in 2014 would obviously predate the 4.17 kernel.

^ permalink raw reply [flat|nested] 11+ messages in thread
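
Since finding the Linux irq number is described above as non-obvious, here is a rough sketch of one way to look it up from /sys/kernel/irq. It relies on the per-irq hwirq, chip_name and per_cpu_count files under that directory. The function name find_linux_irqs is invented for this example, and which hardware number to search for is an assumption: if another irq domain is stacked between the PCIe driver and the GIC, the hwirq shown there may not be 152.

```python
import os


def _read(path):
    """Return the stripped contents of path, or "" if it cannot be read."""
    try:
        with open(path) as f:
            return f.read().strip()
    except OSError:
        return ""


def find_linux_irqs(hwirq, sysfs_root="/sys/kernel/irq"):
    """Return sorted (linux_irq, chip_name, total_count) tuples whose hwirq matches."""
    matches = []
    for entry in os.listdir(sysfs_root):
        if not entry.isdigit():
            continue  # only the numeric per-irq directories are of interest
        irq_dir = os.path.join(sysfs_root, entry)
        if _read(os.path.join(irq_dir, "hwirq")) != str(hwirq):
            continue
        # per_cpu_count is a comma-separated list with one counter per CPU.
        raw = _read(os.path.join(irq_dir, "per_cpu_count"))
        counts = [int(c) for c in raw.split(",") if c.strip().isdigit()]
        chip = _read(os.path.join(irq_dir, "chip_name"))
        matches.append((int(entry), chip, sum(counts)))
    return sorted(matches)


if __name__ == "__main__":
    # 152 is the GIC vector discussed in the thread; adjust as needed.
    for irq, chip, total in find_linux_irqs(152):
        print(f"Linux IRQ {irq} on {chip}: count {total}")
```

As noted above, a zero total from this on 4.17 or later does not show that the chained interrupt never fired, so the sketch is only useful for locating the irq, not for counting it.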
end of thread, other threads:[~2018-11-27 18:53 UTC]

Thread overview: 11+ messages
2018-11-23 22:17 iMX6 PCIe MSI issues Robert Hancock
2018-11-26 1:53 ` Richard Zhu
2018-11-26 16:22 ` Robert Hancock
2018-11-26 16:31 ` Fabio Estevam
2018-11-26 16:31 ` Fabio Estevam
2018-11-26 17:09 ` Trent Piepho
2018-11-26 17:09 ` Trent Piepho
2018-11-26 19:24 ` Robert Hancock
2018-11-26 19:24 ` Robert Hancock
2018-11-27 18:53 ` Trent Piepho
2018-11-27 18:53 ` Trent Piepho