linux-pci.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Muni Sekhar <munisekharrms@gmail.com>
To: Bjorn Helgaas <helgaas@kernel.org>
Cc: linux-pci@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: pcie: xilinx: kernel hang - ISR readl()
Date: Sat, 18 Jan 2020 07:16:14 +0530	[thread overview]
Message-ID: <CAHhAz+jddRTi++jTXgPMMm8H0LQC=nz7kRLOodQGD0SRki7g=A@mail.gmail.com> (raw)
In-Reply-To: <20200109043505.GA223446@google.com>

On Thu, Jan 9, 2020 at 10:05 AM Bjorn Helgaas <helgaas@kernel.org> wrote:
>
> On Thu, Jan 09, 2020 at 08:47:51AM +0530, Muni Sekhar wrote:
> > On Thu, Jan 9, 2020 at 1:45 AM Bjorn Helgaas <helgaas@kernel.org> wrote:
> > > On Tue, Jan 07, 2020 at 09:45:13PM +0530, Muni Sekhar wrote:
> > > > Hi,
> > > >
> > > > I have module with Xilinx FPGA. It implements UART(s), SPI(s),
> > > > parallel I/O and interfaces them to the Host CPU via PCI Express bus.
> > > > I see that my system freezes without capturing the crash dump for
> > > > certain tests. I debugged this issue and it was tracked down to the
> > > > below mentioned interrupt handler code.
> > > >
> > > >
> > > > In ISR, first reads the Interrupt Status register using ‘readl()’ as
> > > > given below.
> > > >     status = readl(ctrl->reg + INT_STATUS);
> > > >
> > > >
> > > > And then clears the pending interrupts using ‘writel()’ as given blow.
> > > >         writel(status, ctrl->reg + INT_STATUS);
> > > >
> > > >
> > > > I've noticed a kernel hang if INT_STATUS register read again after
> > > > clearing the pending interrupts.
> > > >
> > > > Can someone clarify me why the kernel hangs without crash dump incase
> > > > if I read the INT_STATUS register using readl() after clearing the
> > > > pending bits?
> > > >
> > > > Can readl() block?
> > >
> > > readl() should not block in software.  Obviously at the hardware CPU
> > > instruction level, the read instruction has to wait for the result of
> > > the read.  Since that data is provided by the device, i.e., your FPGA,
> > > it's possible there's a problem there.
> >
> > Thank you very much for your reply.
> > Where can I find the details about what is protocol for reading the
> > ‘memory mapped IO’? Can you point me to any useful links..
> > I tried locate the exact point of the kernel code where CPU waits for
> > read instruction as given below.
> > readl() -> __raw_readl() -> return *(const volatile u32 __force *)add
> > Do I need to check for the assembly instructions, here?
>
> The C pointer dereference, e.g., "*address", will be some sort of a
> "load" instruction in assembly.  The CPU wait isn't explicit; it's
> just that when you load a value, the CPU waits for the value.
>
> > > Can you tell whether the FPGA has received the Memory Read for
> > > INT_STATUS and sent the completion?
I have not seen any ‘missing’ completions on the logic analyser. Is
there any other ways to debug this one?

> >
> > Is there a way to know this with the help of software debugging(either
> > enabling dynamic debugging or adding new debug prints)? Can you please
> > point some tools\hw needed to find this?
>
> You could learn this either via a PCIe analyzer (expensive piece of
> hardware) or possibly some logic in the FPGA that would log PCIe
> transactions in a buffer and make them accessible via some other
> interface (you mentioned it had parallel and other interfaces).
>
> > > On the architectures I'm familiar with, if a device doesn't respond,
> > > something would eventually time out so the CPU doesn't wait forever.
> >
> > What is timeout here? I mean how long CPU waits for completion? Since
> > this code runs from interrupt context, does it causes the system to
> > freeze if timeout is more?
>
> The Root Port should have a Completion Timeout.  This is required by
> the PCIe spec.  The *reporting* of the timeout is somewhat
> implementation-specific since the reporting is outside the PCIe
> domain.  I don't know the duration of the timeout, but it certainly
> shouldn't be long enough to look like a "system freeze".
>
> > lspci output:
> > $ lspci
> > 00:00.0 Host bridge: Intel Corporation Atom Processor Z36xxx/Z37xxx
> > Series SoC Transaction Register (rev 11)
> > 00:02.0 VGA compatible controller: Intel Corporation Atom Processor
> > Z36xxx/Z37xxx Series Graphics & Display (rev 11)
> > 00:13.0 SATA controller: Intel Corporation Atom Processor E3800 Series
> > SATA AHCI Controller (rev 11)
> > 00:14.0 USB controller: Intel Corporation Atom Processor
> > Z36xxx/Z37xxx, Celeron N2000 Series USB xHCI (rev 11)
> > 00:1a.0 Encryption controller: Intel Corporation Atom Processor
> > Z36xxx/Z37xxx Series Trusted Execution Engine (rev 11)
> > 00:1b.0 Audio device: Intel Corporation Atom Processor Z36xxx/Z37xxx
> > Series High Definition Audio Controller (rev 11)
> > 00:1c.0 PCI bridge: Intel Corporation Atom Processor E3800 Series PCI
> > Express Root Port 1 (rev 11)
> > 00:1c.2 PCI bridge: Intel Corporation Atom Processor E3800 Series PCI
> > Express Root Port 3 (rev 11)
> > 00:1c.3 PCI bridge: Intel Corporation Atom Processor E3800 Series PCI
> > Express Root Port 4 (rev 11)
> > 00:1d.0 USB controller: Intel Corporation Atom Processor Z36xxx/Z37xxx
> > Series USB EHCI (rev 11)
> > 00:1f.0 ISA bridge: Intel Corporation Atom Processor Z36xxx/Z37xxx
> > Series Power Control Unit (rev 11)
> > 00:1f.3 SMBus: Intel Corporation Atom Processor E3800 Series SMBus
> > Controller (rev 11)
> > 01:00.0 RAM memory: PLDA Device 5555
>
> Is this 01:00.0 device the FPGA?
>
> > 03:00.0 Ethernet controller: Intel Corporation I210 Gigabit Network
> > Connection (rev 03)



-- 
Thanks,
Sekhar

  parent reply	other threads:[~2020-01-18  1:46 UTC|newest]

Thread overview: 16+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-01-07 16:15 pcie: xilinx: kernel hang - ISR readl() Muni Sekhar
2020-01-08 14:33 ` Muni Sekhar
2020-01-08 20:15 ` Bjorn Helgaas
2020-01-09  3:17   ` Muni Sekhar
2020-01-09  4:35     ` Bjorn Helgaas
2020-01-09  4:50       ` Muni Sekhar
2020-01-18  1:46       ` Muni Sekhar [this message]
2020-01-28 17:40         ` Bjorn Helgaas
2020-01-30 16:07       ` Muni Sekhar
2020-01-30 19:00         ` Bjorn Helgaas
2020-01-31 11:33           ` David Laight
2020-01-31 16:34           ` Muni Sekhar
2020-01-31 20:46             ` Bjorn Helgaas
2020-02-01  3:14               ` Muni Sekhar
2020-02-01 18:29                 ` Bjorn Helgaas
2020-02-04 15:33                   ` Muni Sekhar

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAHhAz+jddRTi++jTXgPMMm8H0LQC=nz7kRLOodQGD0SRki7g=A@mail.gmail.com' \
    --to=munisekharrms@gmail.com \
    --cc=helgaas@kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-pci@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).