From: Bjorn Helgaas <helgaas@kernel.org>
To: Kelvin.Cao@microchip.com
Cc: kurt.schwemmer@microsemi.com, bhelgaas@google.com,
linux-pci@vger.kernel.org, kelvincao@outlook.com,
logang@deltatee.com, linux-kernel@vger.kernel.org
Subject: Re: [PATCH 1/5] PCI/switchtec: Error out MRPC execution when no GAS access
Date: Tue, 5 Oct 2021 21:33:10 -0500
Message-ID: <20211006023310.GA1137022@bhelgaas>
In-Reply-To: <e7654b51f2a28e033200c6de9c0a2d9c53c646d3.camel@microchip.com>
On Wed, Oct 06, 2021 at 12:37:02AM +0000, Kelvin.Cao@microchip.com wrote:
> On Tue, 2021-10-05 at 15:11 -0500, Bjorn Helgaas wrote:
> > On Mon, Oct 04, 2021 at 08:51:06PM +0000, Kelvin.Cao@microchip.com
> > wrote:
> > > On Sat, 2021-10-02 at 10:11 -0500, Bjorn Helgaas wrote:
> > > > I *thought* the problem was that the PCIe Memory Read failed
> > > > and the Root Complex fabricated ~0 data to complete the CPU
> > > > read. But now I'm not sure, because it sounds like it might
> > > > be that the PCIe transaction succeeds, but it reads data that
> > > > hasn't been updated by the firmware, i.e., it reads 'in
> > > > progress' because firmware hasn't updated it to 'done'.
> > >
> > > The original message was sort of misleading. After a firmware
> > > reset, CPU getting ~0 for the PCIe Memory Read doesn't explain
> > > the hang. In a MRPC execution (DMA MRPC mode), the MRPC status
> > > which is located in the host memory, gets initialized by the CPU
> > > and updated/finalized by the firmware. In the situation of a
> > > firmware reset, any MRPC initiated afterwards will not get the
> > > status updated by the firmware per the reason you pointed out
> > > above (or similar, to my understanding, firmware can no longer
> > > DMA data to host memory in such cases), therefore the MRPC
> > > execution will never end.
> >
> > I'm glad this makes sense to you, because it still doesn't to me.
> >
> > check_access() does an MMIO read to something in BAR0. If that
> > read returns ~0, it means either the PCIe Memory Read was
> > successful and the Switchtec device supplied ~0 data (maybe
> > because firmware has not initialized that part of the BAR) or the
> > PCIe Memory Read failed and the root complex fabricated the ~0
> > data.
> >
> > I'd like to know which one is happening so we can clarify the
> > commit log text about "MRPC command executions hang indefinitely"
> > and "host will fail all GAS reads." It's not clear whether these
> > are PCIe protocol issues or driver/firmware interaction issues.
>
> I think it's the latter case, the ~0 data was fabricated by the root
> complex, as the MMIO read in check_access() always returns ~0 until
> a reboot or a rescan happens.
If the root complex fabricates ~0, that means a PCIe transaction
failed, i.e., the device didn't respond. Rescan only does config
reads and writes. Why should that cause the PCIe transactions to
magically start working?
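
To make the distinction above concrete, here is a minimal user-space
sketch, not the switchtec driver's code. Every name in it (the
sketch_* functions, the status values, the function-pointer stand-in
for the BAR0 MMIO read) is hypothetical and exists only for
illustration. It assumes a wait loop on a status word in host memory
that also samples an MMIO read and gives up when that read returns ~0.
Note that ~0 alone cannot tell "the device returned ~0" apart from
"the root complex fabricated ~0", so the check is only a heuristic.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical status values, for illustration only. */
enum sketch_mrpc_status {
	SKETCH_MRPC_INPROGRESS = 1,
	SKETCH_MRPC_DONE = 2,
};

/* Stand-in for an MMIO read of some register in BAR0. */
typedef uint32_t (*sketch_mmio_read_fn)(void);

/* Simulated reads: one that looks like a failed read, one that does not. */
static uint32_t sketch_read_ones(void) { return UINT32_MAX; }
static uint32_t sketch_read_ok(void)   { return 0x12345678u; }

/*
 * Heuristic only: all-ones is what a root complex commonly supplies
 * when a read gets no completion, but a device may also legitimately
 * return all-ones data.
 */
static bool sketch_read_looks_failed(uint32_t val)
{
	return val == UINT32_MAX;
}

/*
 * Poll a status word that the device is expected to update by DMA.
 * Returns 0 when the status reads "done", -1 when the MMIO read looks
 * failed (so waiting longer is pointless), -2 when max_polls runs out.
 */
static int sketch_wait_for_mrpc(const volatile uint32_t *status,
				sketch_mmio_read_fn read_bar0,
				unsigned int max_polls)
{
	for (unsigned int i = 0; i < max_polls; i++) {
		if (*status == SKETCH_MRPC_DONE)
			return 0;
		if (sketch_read_looks_failed(read_bar0()))
			return -1;
	}
	return -2;
}
```

With the status stuck at "in progress", passing sketch_read_ones makes
the wait end with -1 on the first poll, while sketch_read_ok lets it
run to the poll limit and return -2. Neither outcome says why the
device stopped responding, which is the question I'm asking.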