linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Bjorn Helgaas <helgaas@kernel.org>
To: Kelvin.Cao@microchip.com
Cc: kurt.schwemmer@microsemi.com, bhelgaas@google.com,
	kelvincao@outlook.com, logang@deltatee.com,
	linux-kernel@vger.kernel.org, linux-pci@vger.kernel.org
Subject: Re: [PATCH 1/5] PCI/switchtec: Error out MRPC execution when no GAS access
Date: Thu, 7 Oct 2021 16:23:17 -0500	[thread overview]
Message-ID: <20211007212317.GA1268429@bhelgaas> (raw)
In-Reply-To: <54263d552d05f2fae706e83aa4c2b31b1983b0e2.camel@microchip.com>

On Wed, Oct 06, 2021 at 09:27:49PM +0000, Kelvin.Cao@microchip.com wrote:
> On Wed, 2021-10-06 at 15:20 -0500, Bjorn Helgaas wrote:
> > On Wed, Oct 06, 2021 at 07:00:55PM +0000, Kelvin.Cao@microchip.com
> > wrote:

> > So wait, you mean you just intentionally ask the firmware to
> > reset, knowing that the device will be unusable until the user
> > reboots or does a manual rescan?  And the way to improve this is
> > for the driver to report an error to the user instead of hanging?
> > I *guess* that might be some sort of improvement, but seems like a
> > pretty small one.
> 
> Yes, however, I believe it's something our users really like to
> have...  With this, they can do their user space
> programming/scripting more easily in a synchronous fashion.
> 
> > >   - The firwmare crashes and doesn't respond, which normally is
> > >   the reason for users to issue a firmware reset command to try
> > >   to recover it via either the driver or a sideband interface.
> > >   The firmware may not be able to recover by a reset in some
> > >   extream situations like hardware errors, so that an error
> > >   return is probably all the users can get before another level
> > >   of recovery happens.
> > > 
> > > So I'd think this patch is still making the driver better in
> > > some way.

OK.  I still think the fact that all these different mechanisms can
reset the device behind your back and make the switch and anything on
the other side of it just stop working is ..., well, let's just say
it's quite surprising to me.

Well, at least this isn't quite so much a mystery anymore and maybe we
can improve the commit log.  E.g., maybe something like this:

  A firmware hard reset may be initiated by various mechanisms
  including a UART interface, TWI sideband interface from BMC, MRPC
  command from userspace, etc.  The switchtec management driver is
  unaware of these resets.

  The reset clears PCI state including the BARs and Memory Space
  Enable bits, so the device no longer responds to the MMIO accesses
  the driver uses to operate it.

  MMIO reads to the device will fail with a PCIe error.  When the root
  complex handles that error, it typically fabricates ~0 data to
  complete the CPU read.

  Check for this sort of error by reading the device ID from MMIO
  space.  This ID can never be ~0, so if we see that value, it
  probably means the PCIe Memory Read failed and we should return an
  error indication to the application using the switchtec driver.

  reply	other threads:[~2021-10-07 21:23 UTC|newest]

Thread overview: 32+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-09-24 11:08 [PATCH 0/5] Switchtec Fixes and Improvements kelvin.cao
2021-09-24 11:08 ` [PATCH 1/5] PCI/switchtec: Error out MRPC execution when no GAS access kelvin.cao
2021-10-01 20:18   ` Bjorn Helgaas
2021-10-01 20:29     ` Logan Gunthorpe
2021-10-01 23:49       ` Kelvin.Cao
2021-10-02 15:11         ` Bjorn Helgaas
2021-10-04 20:51           ` Kelvin.Cao
2021-10-05 20:11             ` Bjorn Helgaas
2021-10-06  0:37               ` Kelvin.Cao
2021-10-06  2:33                 ` Bjorn Helgaas
2021-10-06  5:49                   ` Kelvin.Cao
2021-10-06 14:19                     ` Bjorn Helgaas
2021-10-06 19:00                       ` Kelvin.Cao
2021-10-06 20:20                         ` Bjorn Helgaas
2021-10-06 21:27                           ` Kelvin.Cao
2021-10-07 21:23                             ` Bjorn Helgaas [this message]
2021-10-08  0:06                               ` Kelvin.Cao
2021-10-08 11:03                                 ` Bjorn Helgaas
2021-10-01 22:58     ` Kelvin.Cao
2021-10-01 23:52       ` Logan Gunthorpe
2021-10-02  0:05         ` Kelvin.Cao
2021-09-24 11:08 ` [PATCH 2/5] PCI/switchtec: Fix a MRPC error status handling issue kelvin.cao
2021-09-24 11:08 ` [PATCH 3/5] PCI/switchtec: Update the way of getting management VEP instance ID kelvin.cao
2021-09-24 11:08 ` [PATCH 4/5] PCI/switchtec: Replace ENOTSUPP with EOPNOTSUPP kelvin.cao
2021-09-24 11:08 ` [PATCH 5/5] PCI/switchtec: Add check of event support kelvin.cao
2021-09-24 15:53 ` [PATCH 0/5] Switchtec Fixes and Improvements Logan Gunthorpe
2021-09-25  5:27   ` Kelvin.Cao
2021-09-27 16:39 ` Bjorn Helgaas
2021-09-27 18:25   ` Kelvin.Cao
2021-10-08 17:05 ` Bjorn Helgaas
2021-10-08 17:23   ` Logan Gunthorpe
2021-10-08 18:25     ` Bjorn Helgaas

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20211007212317.GA1268429@bhelgaas \
    --to=helgaas@kernel.org \
    --cc=Kelvin.Cao@microchip.com \
    --cc=bhelgaas@google.com \
    --cc=kelvincao@outlook.com \
    --cc=kurt.schwemmer@microsemi.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-pci@vger.kernel.org \
    --cc=logang@deltatee.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).