* iMX6 PCIe MSI issues
@ 2018-11-23 22:17 Robert Hancock
  2018-11-26  1:53 ` Richard Zhu
  2018-11-26 16:31   ` Fabio Estevam
  0 siblings, 2 replies; 11+ messages in thread
From: Robert Hancock @ 2018-11-23 22:17 UTC (permalink / raw)
  To: linux-arm-kernel

I am working with a custom FPGA PCI Express endpoint connected to an NXP
iMX6D processor running the 4.19.2 kernel. It seems happy using INTx
interrupts but when trying to enable MSI the device driver is not
receiving any interrupts.

From some register poking I have figured out:
-the MSI address set on the PCIe device is correctly set in the iMX MSI
controller's MSI Controller Address register (0x1ffc820)
-the interrupt vectors are enabled in the MSI controller's Interrupt
Enable register (0x1ffc828)
-the interrupt vectors are not masked in the MSI controller's Interrupt
Mask register (0x1ffc82c)
-The MSI controller's Interrupt Status register (0x1ffc830) shows that
the requested interrupt vectors are pending
-In the ARM GIC, vector 152 (for msi_ctrl_int) is enabled in the IS
enable register (0x00a01110), but not set in the IS pending (0x00a01210)
or IS active (0x00a01310) registers
-Vector 152 is not masked in the GPC interrupt mask (0x00a01310)
-Vector 152 is not active in the GPC interrupt status (0x00a01310)
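
The MSI controller addresses in the list above can be derived rather than looked up one by one. A minimal sketch, assuming the DBI base of 0x01ffc000 implied by those addresses (0x1ffc820 - 0x820) and assuming the layout used by the mainline DesignWare host driver, where each group of 32 vectors has its own enable/mask/status triple 12 bytes after the previous one. The helper name msi_vector_regs is hypothetical:

```python
# Assumption: DBI base inferred from the addresses quoted above.
DBI_BASE = 0x01FFC000

# Offsets of the first MSI controller group, matching the list above.
MSI_ADDR_LO = 0x820
MSI_INTR0_ENABLE = 0x828
MSI_INTR0_MASK = 0x82C
MSI_INTR0_STATUS = 0x830

# Assumption: 32 vectors per group, 12 bytes of registers per group.
VECTORS_PER_CTRL = 32
CTRL_BLOCK_SIZE = 12


def msi_vector_regs(vector):
    """Return (enable addr, mask addr, status addr, bit mask) for a vector."""
    ctrl, bit = divmod(vector, VECTORS_PER_CTRL)
    off = ctrl * CTRL_BLOCK_SIZE
    return (
        DBI_BASE + MSI_INTR0_ENABLE + off,
        DBI_BASE + MSI_INTR0_MASK + off,
        DBI_BASE + MSI_INTR0_STATUS + off,
        1 << bit,
    )
```

For vector 0 this gives back the three addresses listed above (0x1ffc828, 0x1ffc82c, 0x1ffc830) with bit mask 0x1.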

So it appears the MSI controller is receiving and recognizing the MSI
from the device, but the interrupt is not making it into the GIC for
some reason. If I manually set vector 152 to pending in the GIC, the
dw_handle_msi_irq handler in pci-designware-host.c does get called along
with the interrupt handler(s) for the PCIe device, so it appears the
chain from that point on is working:

# devmem 0x00a01210 32 0x1000000
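
For reference, the address and value in that command follow from the GIC distributor layout: one bit per interrupt ID, 32 IDs per 32-bit word, with the set-enable, set-pending and set-active banks at offsets 0x100, 0x200 and 0x300. A small sketch of the arithmetic, assuming a distributor base of 0x00a01000 (inferred from the addresses above); the function names are hypothetical:

```python
# Assumption: distributor base inferred from 0x00a01110 - 0x110.
GIC_DIST_BASE = 0x00A01000

# Standard GIC distributor bank offsets.
GICD_ISENABLER = 0x100
GICD_ISPENDR = 0x200
GICD_ISACTIVER = 0x300


def gic_bit(irq_id, bank_offset):
    """Return (register address, bit mask) for an interrupt ID in a bank."""
    word, bit = divmod(irq_id, 32)
    return GIC_DIST_BASE + bank_offset + 4 * word, 1 << bit


def devmem_set_pending(irq_id):
    """Build the devmem command that marks irq_id pending."""
    addr, mask = gic_bit(irq_id, GICD_ISPENDR)
    return "devmem 0x%08x 32 0x%x" % (addr, mask)
```

For ID 152 this lands in word 4, bit 24 of each bank, which matches the registers and the 0x1000000 value used above.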

I found someone else reporting this in 2014 with an unknown kernel
version on the NXP forums here, but with no resolution listed there:

https://community.nxp.com/thread/318307

Any ideas on what may be going wrong? My next step may be to try an
older kernel version to see if this got broken at some point.

-- 
Robert Hancock
Senior Software Developer
SED Systems
Email: hancock at sedsystems.ca

* iMX6 PCIe MSI issues
  2018-11-23 22:17 iMX6 PCIe MSI issues Robert Hancock
@ 2018-11-26  1:53 ` Richard Zhu
  2018-11-26 16:22   ` Robert Hancock
  2018-11-26 16:31   ` Fabio Estevam
  1 sibling, 1 reply; 11+ messages in thread
From: Richard Zhu @ 2018-11-26  1:53 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Robert,

Can you refer to the following patch? It may be helpful:

https://patchwork.ozlabs.org/patch/989802/


Best Regards
Richard Zhu
Office: 86-21-28937189
Mobile: 86-13386059786



* iMX6 PCIe MSI issues
  2018-11-26  1:53 ` Richard Zhu
@ 2018-11-26 16:22   ` Robert Hancock
  0 siblings, 0 replies; 11+ messages in thread
From: Robert Hancock @ 2018-11-26 16:22 UTC (permalink / raw)
  To: linux-arm-kernel

It doesn't appear that patch has any effect on this issue, because the
dw_handle_msi_irq function in question is never being called at all - I
added some printk messages to verify that.

I then suspected an interrupt arriving during initialization might be
blocking the processing of further interrupts. But if I manually ack
all of the possible interrupts by writing all ones to the MSI Interrupt
Status register, I can see that the pending interrupts are cleared, and
if new interrupts are raised they show up in that register again, yet
vector 152 for PCIe INTD/MSI is still not asserted in the GIC. Manually
asserting it by writing to the GIC pending register causes the pending
interrupts to be handled.

It's as if the vector is somehow not hooked up to the GIC, or there's
some other register I'm not aware of that has to be set before the
interrupts are actually raised, but I'm currently at a loss to explain
it.

On 2018-11-25 7:53 p.m., Richard Zhu wrote:
> Hi Robert:
> Can you make reference to the following URL?
> 
> https://patchwork.ozlabs.org/patch/989802/
> It may be helpful.

-- 
Robert Hancock
Senior Software Developer
SED Systems
Email: hancock at sedsystems.ca

* Re: iMX6 PCIe MSI issues
  2018-11-23 22:17 iMX6 PCIe MSI issues Robert Hancock
@ 2018-11-26 16:31   ` Fabio Estevam
  2018-11-26 16:31   ` Fabio Estevam
  1 sibling, 0 replies; 11+ messages in thread
From: Fabio Estevam @ 2018-11-26 16:31 UTC (permalink / raw)
  To: hancock, Tim Harvey, Trent Piepho
  Cc: moderated list:ARM/FREESCALE IMX / MXC ARM ARCHITECTURE,
	Richard Zhu, Lucas Stach, linux-pci

Adding Trent and Tim (as I think they managed to fix some imx6 MSI issues)


* Re: iMX6 PCIe MSI issues
  2018-11-26 16:31   ` Fabio Estevam
@ 2018-11-26 17:09     ` Trent Piepho
  -1 siblings, 0 replies; 11+ messages in thread
From: Trent Piepho @ 2018-11-26 17:09 UTC (permalink / raw)
  To: festevam, hancock, tharvey
  Cc: linux-arm-kernel, l.stach, linux-pci, hongxing.zhu

There is a bug that appeared in 4.14 that results in an MSI getting
dropped if it occurs during or shortly after the handler for that or
another MSI runs.  Obviously, that means one needs to get at least one
MSI to work in the first place to see the bug!

Robert's description also has the MSI status set in the dwc MSI status
register (0x830), which would not be the case for the MSI race.

An interrupt is only passed up to the GIC on a 0->1 transition in the
dwc msi status bit.  We see it's a 1 now, but was the GIC interrupt
enabled when the transition happened?  It's not said below if that was
checked.

Try clearing the status (write a *1* to the bit to clear it) in the dwc
msi status register, check that it is now zero, and then see whether
another MSI causes it to become set and whether that makes it to the GIC.
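
To make the suggested test concrete, here is a toy software model of the behaviour described above: status bits that are write-1-to-clear, and an upstream interrupt generated only on a 0->1 transition of a status bit. It is only a sketch of the behaviour as described in this mail, not of documented hardware, and the class and method names are hypothetical:

```python
class MsiStatusModel:
    """Toy model: W1C status bits, upstream IRQ on 0->1 edges only."""

    def __init__(self):
        self.status = 0      # modelled MSI Interrupt Status register
        self.forwarded = 0   # number of edges passed up to the GIC

    def raise_msi(self, vector):
        bit = 1 << vector
        if not self.status & bit:
            # Bit goes 0 -> 1: this is the only case that is forwarded.
            self.forwarded += 1
        self.status |= bit

    def write_status(self, value):
        # Write-1-to-clear: every 1 written clears the matching bit.
        self.status &= ~value & 0xFFFFFFFF
```

In this model a second MSI on a vector whose status bit is still set is not forwarded, which is why the bit has to be cleared and read back as zero before the next MSI is raised.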

If it does become set, but no irq to the GIC, then I have no idea what
is there to stop it.  This part of the chip is not documented well.

Also, I think the new irq domain stuff in 4.17 breaks irq accounting
for the GIC chained interrupt (152) that feeds the dwc msi domain.  It'll
always show as zero in /proc/interrupts.  But I've mostly been working
in 4.16, so I'm not sure about the precise interaction of irq domains
and /proc/interrupts yet.


* Re: iMX6 PCIe MSI issues
  2018-11-26 17:09     ` Trent Piepho
@ 2018-11-26 19:24       ` Robert Hancock
  -1 siblings, 0 replies; 11+ messages in thread
From: Robert Hancock @ 2018-11-26 19:24 UTC (permalink / raw)
  To: Trent Piepho, festevam, tharvey
  Cc: linux-arm-kernel, l.stach, linux-pci, hongxing.zhu

On 2018-11-26 11:09 a.m., Trent Piepho wrote:
> There is a bug that appeared in 4.14 that results in an MSI getting
> dropped if it occurs during or shortly after the handler for that or
> another MSI runs.  Obviously, that means one needs to get at least one
> MSI to work in the first place to see the bug!
> 
> Robert's description also has the MSI status set in the dwc MSI status
> register (0x830), which would not be the case for the MSI race.
> 
> An interrupt is only passed up to the GIC on a 0->1 transition in the
> dwc msi status bit.  We see it's a 1 now, but was the GIC interrupt
> enabled when the transition happened?  It's not said below if that was
> checked.
> 
> Try clearing the status (write a *1* to the bit to clear it) in the dwc
> msi status register, check that it is now zero, and then see whether
> another MSI causes it to become set and whether that makes it to the GIC.

I've tried that (writing ones to the status register, verifying it goes
to zero, raising another interrupt) and it doesn't seem to make it to
the GIC even though the status register has transitioned from zero to
non-zero.

> 
> If it does become set, but no irq to the GIC, then I have no idea what
> is there to stop it.  This part of the chip is not documented well.
> 
> Also, I think the new irq domain stuff in 4.17 breaks irq accounting
> for the GIC chained interrupt (152) that feeds the dwc msi domain.  It'll
> always show as zero in /proc/interrupts.  But I've mostly been working
> in 4.16, so I'm not sure about the precise interaction of irq domains
> and /proc/interrupts yet.

I'm not actually seeing the MSI interrupt showing up in /proc/interrupts
at all in 4.19. From adding some debug output into the dwc PCIe code, it
appears it's using Linux IRQ 24 as the chaining interrupt, but there's
no entry in /proc/interrupts for either Linux IRQ 24 or GIC vector 152.
Not sure if there is supposed to be or not. It does appear that the
vector isn't masked in the GIC in any case, however, and when I force
the interrupt into the GIC pending register, things seem to happen
properly after that.
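
For anyone wanting to repeat the /proc/interrupts check, a minimal sketch of looking up one Linux IRQ number in that file's text. The function name and the sample text are hypothetical and only illustrate the usual column layout (a CPU header line, then one "number:" row per IRQ); they are not output from this board:

```python
def irq_counts(text, irq):
    """Return the per-CPU counts for a Linux IRQ, or None if not listed."""
    lines = text.splitlines()
    if not lines:
        return None
    ncpus = len(lines[0].split())          # header row: CPU0 CPU1 ...
    for line in lines[1:]:
        label, sep, rest = line.partition(":")
        if sep and label.strip() == str(irq):
            return [int(field) for field in rest.split()[:ncpus]]
    return None


# Hypothetical sample, for illustration only.
SAMPLE = """\
           CPU0       CPU1
 16:       1234          0       GPC  55 Level     example-timer
 24:          0          0       GPC 120 Level     example-msi
"""
```

On a live system the text would come from open("/proc/interrupts").read(); a None result corresponds to the "no entry" case described above.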


-- 
Robert Hancock
Senior Software Developer
SED Systems
Phone: (306) 933-1567
Email: hancock@sedsystems.ca

^ permalink raw reply	[flat|nested] 11+ messages in thread

* iMX6 PCIe MSI issues
@ 2018-11-26 19:24       ` Robert Hancock
  0 siblings, 0 replies; 11+ messages in thread
From: Robert Hancock @ 2018-11-26 19:24 UTC (permalink / raw)
  To: linux-arm-kernel

On 2018-11-26 11:09 a.m., Trent Piepho wrote:
> There is a bug that appeared in 4.14 that will result in an MSI getting
> dropped if it occurs during or shortly after that/another MSI interrupt
> handler is run.  Obviously, then means one needs to get at least one
> MSI to work in the first place to see the bug!
> 
> Robert's description also has MSI status set in dwc msi status register
> (0x830), that would not be the case for the MSI race.
> 
> An interrupt is only passed up to the GIC on a 0->1 transition in the
> dwc msi status bit.  We see it's a 1 now, but was the GIC interrupt
> enabled when the transition happened?  It's not said below if that was
> checked.
> 
> Try clearing the status (write a *1* to the bit clear it) in the dwc
> msi status register, check that it is now zero, and then see if another
> MSI causes it to become set, and does that make it to the GIC?

I've tried that (writing ones to the status register, verifying it goes
to zero, raising another interrupt) and it doesn't seem to make it to
the GIC even though the status register has transitioned from zero to
non-zero.

> 
> If it does become set, but no irq to the GIC, then I have no idea what
> is there to stop it.  This part of the chip is not documented well.
> 
> Also, I think the new irq domain stuff in 4.17 breaks irq accounting to
> the GIC chain interrupt (152) to the dwc msi domain.  It'll always show
> as zero in /proc/interrupts.  But I've mostly been working in 4.16 so
> I'm not sure about the precise interaction of irq domains and
> /proc/interrupts yet.

I'm not actually seeing the MSI interrupt showing up in /proc/interrupts
at all in 4.19. From adding some debug output into the dwc PCIe code, it
appears it's using Linux IRQ 24 as the chaining interrupt, but there's
no entry in /proc/interrupts for either Linux IRQ 24 or GIC vector 152.
Not sure if there is supposed to be or not. It does appear that the
vector isn't masked in the GIC in any case, however, and when I force
the interrupt into the GIC pending register, things seem to happen
properly after that.

> 
> On Mon, 2018-11-26 at 14:31 -0200, Fabio Estevam wrote:
>> Adding Trent and Tim (as I think they managed to fix some imx6 MSI
>> issues)
>>
>> On Fri, Nov 23, 2018 at 8:17 PM Robert Hancock <hancock@sedsystems.ca>
>> wrote:
>>>
>>> I am working with a custom FPGA PCI Express endpoint connected to
>>> an NXP
>>> iMX6D processor running the 4.19.2 kernel. It seems happy using
>>> INTx
>>> interrupts but when trying to enable MSI the device driver is not
>>> receiving any interrupts.
>>>
>>> From some register poking I have figured out:
>>> -the MSI address set on the PCIe device is correctly set in the iMX
>>> MSI
>>> controller's MSI Controller Address register (0x1ffc820)
>>> -the interrupt vectors are enabled in the MSI controller's
>>> Interrupt
>>> Enable register (0x1ffc828)
>>> -the interrupt vectors are not masked in the MSI controller's
>>> Interrupt
>>> Mask register (0x1ffc82c)
>>> -The MSI controller's Interrupt Status register (0x1ffc830) shows
>>> that
>>> the requested interrupt vectors are pending
>>> -In the ARM GIC, vector 152 (for msi_ctrl_int) is enabled in the IS
>>> enable register (0x00a01110), but not set in the IS pending
>>> (0x00a01210)
>>> or IS active (0x00a01310) registers
>>> -Vector 152 is not masked in the GPC interrupt mask (0x00a01310)
>>> -Vector 152 is not active in the GPC interrupt status (0x00a01310)
>>>
>>> So it appears the MSI controller is receiving and recognizing the
>>> MSI
>>> from the device, but the interrupt is not making it into the GIC
>>> for
>>> some reason. If I manually set vector 152 to pending in the GIC,
>>> the
>>> dw_handle_msi_irq handler in pci-designware-host.c does get called
>>> along
>>> with the interrupt handler(s) for the PCIe device, so it appears
>>> the
>>> chain from that point on is working:
>>>
>>> # devmem 0x00a01210 32 0x1000000
>>>
>>> I found someone else reporting this in 2014 with an unknown kernel
>>> version on the NXP forums here, but with no resolution listed
>>> there:
>>>
>>> https://community.nxp.com/thread/318307
>>>
>>> Any ideas on what may be going wrong? My next step may be to try an
>>> older kernel version to see if this got broken at some point.
>>>
>>> --
>>> Robert Hancock
>>> Senior Software Developer
>>> SED Systems
>>> Email: hancock at sedsystems.ca
>>>
>>> _______________________________________________
>>> linux-arm-kernel mailing list
>>> linux-arm-kernel at lists.infradead.org
>>> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

-- 
Robert Hancock
Senior Software Developer
SED Systems
Phone: (306) 933-1567
Email: hancock at sedsystems.ca


* Re: iMX6 PCIe MSI issues
  2018-11-26 19:24       ` Robert Hancock
@ 2018-11-27 18:53         ` Trent Piepho
  -1 siblings, 0 replies; 11+ messages in thread
From: Trent Piepho @ 2018-11-27 18:53 UTC (permalink / raw)
  To: festevam, hancock, tharvey
  Cc: linux-arm-kernel, l.stach, linux-pci, hongxing.zhu

On Mon, 2018-11-26 at 13:24 -0600, Robert Hancock wrote:
> 
> > Also, I think the new irq domain stuff in 4.17 breaks irq accounting to
> > the GIC chain interrupt (152) to the dwc msi domain.  It'll always show
> > as zero in /proc/interrupts.  But I've mostly been working in 4.16 so
> > I'm not sure about the precise interaction of irq domains and
> > /proc/interrupts yet.
> 
> I'm not actually seeing the MSI interrupt showing up in /proc/interrupts
> at all in 4.19. From adding some debug output into the dwc PCIe code, it
> appears it's using Linux IRQ 24 as the chaining interrupt, but there's
> no entry in /proc/interrupts for either Linux IRQ 24 or GIC vector 152.
> Not sure if there is supposed to be or not. It does appear that the
> vector isn't masked in the GIC in any case, however, and when I force
> the interrupt into the GIC pending register, things seem to happen
> properly after that.

In 4.16, the MSI chaining interrupt does show up in /proc/interrupts
and does increment.  It also shows up in trace events.

In 4.17, it no longer appears in /proc/interrupts.  Finding the Linux
irq number is non-obvious, as you've seen.  It will show up in
/sys/kernel/irq and /sys/kernel/debug/irq/irqs, but the count is always
zero.  IMHO, not an improvement.

So if you're using that count in /sys to determine that the GIC irq
never fired, then it's not conclusive.  It always reads zero.
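
A minimal sketch for locating the Linux irq number from the hardware irq
number via sysfs, in case it is useful. It assumes the per-irq
directories under /sys/kernel/irq expose "hwirq" and "per_cpu_count"
files (a comma-separated list of counters, one per CPU); the function
name and the root parameter are made up for illustration. More than one
irq domain can use the same hwirq number, so check chip_name on any
match. As noted above, a zero count here proves nothing for the chained
interrupt.

```python
import os


def find_linux_irqs(hwirq, root="/sys/kernel/irq"):
    """Return sorted (linux_irq, per_cpu_counts) pairs whose hwirq matches."""
    matches = []
    for name in os.listdir(root):
        if not name.isdigit():
            continue
        irq_dir = os.path.join(root, name)
        try:
            with open(os.path.join(irq_dir, "hwirq")) as f:
                if f.read().strip() != str(hwirq):
                    continue
            with open(os.path.join(irq_dir, "per_cpu_count")) as f:
                counts = [int(c) for c in f.read().strip().split(",")]
        except (OSError, ValueError):
            # Skip entries that are unreadable or not in the expected format
            continue
        matches.append((int(name), counts))
    return sorted(matches)


if os.path.isdir("/sys/kernel/irq"):
    for irq, counts in find_linux_irqs(152):
        print(irq, counts)
```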

But the same problem in 2014 would obviously predate the 4.17 kernel.

end of thread, other threads:[~2018-11-27 18:53 UTC | newest]

Thread overview: 11+ messages
2018-11-23 22:17 iMX6 PCIe MSI issues Robert Hancock
2018-11-26  1:53 ` Richard Zhu
2018-11-26 16:22   ` Robert Hancock
2018-11-26 16:31 ` Fabio Estevam
2018-11-26 16:31   ` Fabio Estevam
2018-11-26 17:09   ` Trent Piepho
2018-11-26 17:09     ` Trent Piepho
2018-11-26 19:24     ` Robert Hancock
2018-11-26 19:24       ` Robert Hancock
2018-11-27 18:53       ` Trent Piepho
2018-11-27 18:53         ` Trent Piepho
