linux-pci.vger.kernel.org archive mirror
From: Bjorn Helgaas <helgaas@kernel.org>
To: Kuppuswamy Sathyanarayanan  <sathyanarayanan.kuppuswamy@linux.intel.com>
Cc: linux-pci@vger.kernel.org, linux-kernel@vger.kernel.org,
	ashok.raj@intel.com, Olof Johansson <olof@lixom.net>
Subject: Re: [PATCH v17 11/12] PCI/DPC: Add Error Disconnect Recover (EDR) support
Date: Fri, 6 Mar 2020 17:23:08 -0600
Message-ID: <20200306232308.GA254242@google.com>
In-Reply-To: <90e97009-29ae-f807-406d-59cefe7e6d3f@linux.intel.com>

On Fri, Mar 06, 2020 at 02:42:14PM -0800, Kuppuswamy Sathyanarayanan wrote:
> On 3/6/20 1:00 PM, Bjorn Helgaas wrote:
> > On Thu, Mar 05, 2020 at 10:32:33PM -0800, Kuppuswamy, Sathyanarayanan wrote:
> > > On 3/5/2020 7:47 PM, Bjorn Helgaas wrote:
> > > > On Tue, Mar 03, 2020 at 06:36:34PM -0800, sathyanarayanan.kuppuswamy@linux.intel.com wrote:
> > > > > From: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
> > > > > +void pci_acpi_add_edr_notifier(struct pci_dev *pdev)
> > > > > +{
> > > > > +	struct acpi_device *adev = ACPI_COMPANION(&pdev->dev);
> > > > > +	acpi_status astatus;
> > > > > +
> > > > > +	if (!adev) {
> > > > > +		pci_dbg(pdev, "No valid ACPI node, so skip EDR init\n");
> > > > > +		return;
> > > > > +	}
> > > > > +
> > > > > +	/*
> > > > > +	 * Per the Downstream Port Containment Related Enhancements ECN to
> > > > > +	 * the PCI Firmware Spec, r3.2, sec 4.5.1, table 4-6, EDR support
> > > > > +	 * can only be enabled if DPC is controlled by firmware.
> > > > > +	 *
> > > > > +	 * TODO: Remove dependency on ACPI FIRMWARE_FIRST bit to
> > > > > +	 * determine ownership of DPC between firmware or OS.
> > > > > +	 * Per the Downstream Port Containment Related Enhancements
> > > > > +	 * ECN to the PCI Firmware Spec, r3.2, sec 4.5.1, table 4-5,
> > > > > +	 * OS can use bit 7 of _OSC control field to negotiate control
> > > > > +	 * over DPC Capability.
> > > > > +	 */
> > > > > +	if (!pcie_aer_get_firmware_first(pdev) || pcie_ports_dpc_native) {
> > > > > +		pci_dbg(pdev, "OS handles AER/DPC, so skip EDR init\n");
> > > > > +		return;
> > > > > +	}
> > > > > +
> > > > > +	astatus = acpi_install_notify_handler(adev->handle, ACPI_SYSTEM_NOTIFY,
> > > > > +					      edr_handle_event, pdev);
> > > > 
> > > > It does not say anything about "EDR notification only being
> > > > used if firmware owns DPC."
> > > > 
> > > > We should install an EDR notify handler because we told the
> > > > firmware that we support EDR notifications.  I don't think we
> > > > should make it any more complicated than that.
>
> I agree with your above statement. Since we told firmware *we
> support* EDR notification, we should make that true by installing
> the notification handler unconditionally.
> 
> But, as far as I can infer from the PCI FW 3.2 ECN-DPC spec, the
> current use case for EDR notification is only to handle error
> recovery when DPC is owned by firmware and firmware sends the EDR
> event. If you agree with the above comment, is it alright if we add
> the following check in the EDR notification handler?
> 
> Although the spec does not restrict it, the only tested use case of
> EDR is handling notifications in the firmware-owned DPC case.
> 
> 	if (!pcie_aer_get_firmware_first(pdev) || pcie_ports_dpc_native ||
> 	    host->native_dpc)
> 		return;

No, I do not think we should add a check like this.  There's no basis
in the spec for doing this.  pcie_aer_get_firmware_first() looks at
HEST, which isn't mentioned at all in relation to EDR.  Checks like
this make it really hard to understand the code, and I don't believe
in making things fail simply because we haven't tested the scenario.
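
As a rough illustration of the behavior argued for here, the following
is a minimal userspace model, not kernel code.  It assumes the ACPI
companion check from the patch stays and that nothing else gates the
install.  struct model_dev and model_add_edr_notifier() are hypothetical
names that only mirror the shape of pci_acpi_add_edr_notifier().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct pci_dev; these are not kernel types. */
struct model_dev {
	bool has_acpi_companion;	/* models ACPI_COMPANION(&pdev->dev) != NULL */
	bool aer_firmware_first;	/* models pcie_aer_get_firmware_first(pdev) */
	bool dpc_native;		/* models booting with pcie_ports=dpc-native */
	bool edr_handler_installed;	/* set once the notify handler is "installed" */
};

/*
 * Install the EDR notify handler whenever an ACPI companion exists.
 * aer_firmware_first and dpc_native are deliberately not consulted:
 * the OS advertised EDR support, so it should be ready for the event.
 */
void model_add_edr_notifier(struct model_dev *dev)
{
	assert(dev != NULL);

	if (!dev->has_acpi_companion)
		return;		/* no ACPI node, nothing to hook */

	dev->edr_handler_installed = true;
}
```

In this model a device with an ACPI companion gets the handler even when
dpc_native is set and aer_firmware_first is clear, which is the case the
proposed check would have rejected.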

> > > Also check the following reference from section 2 of EDR ECN. It also
> > > clarifies EDR feature is only used when firmware owns DPC.
> > > 
> > >      PCIe Base Specification suggests that Downstream Port Containment
> > >      may be controlled either by the Firmware or the Operating System. It
> > >      also suggests that the Firmware retain ownership of Downstream Port
> > >      Containment if it also owns AER. When the Firmware owns Downstream
> > >      Port Containment, *it is expected to use the new “Error Disconnect
> > >      Recover” notification to alert OSPM of a Downstream Port Containment
> > >      event*.
> > The text in section 2 will not become part of the spec, so we can't
> > rely on it to tell us how to implement things.  Even if it did, this
> > section does not say "OS should only install an EDR notify handler if
> > firmware owns DPC."  It just means that if firmware owns DPC, the OS
> > will not learn about DPC events directly via DPC interrupts, so
> > firmware has to use another mechanism, e.g., EDR, to tell the OS about
> > them.
> > 
> > If an OS requests DPC control, it must support both DPC and EDR
> > (sec 4.5.2.4).  However, I think an OS may support EDR but not DPC
> > (although your patches don't support this configuration).
>
> Are there any use cases for the above configuration? The current PCI
> FW 3.2 ECN-DPC spec does not mention any use cases where EDR can be
> used outside the scope of DPC.
> 
> If required, I can add this support. It should be easy to add. In
> the non-DPC case, the EDR notification handler would mostly be empty.
> Please let me know if you want me to add this as part of the next
> patch set.

I don't think there's a need to add support for this.  I just
mentioned it as part of the point that we shouldn't tie EDR to DPC
unnecessarily.
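
The relationship described earlier in the thread (an OS that requests
DPC control must also support EDR, while EDR support without DPC control
is probably allowed) could be sketched as a check on the _OSC fields.
This is a standalone model, not kernel code.  Bit 7 of the control field
comes from the patch comment; the position of the EDR bit in the support
field is an assumption here, and all MODEL_* and model_* names are
hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed: EDR supported is bit 7 of the _OSC support field. */
#define MODEL_OSC_EDR_SUPPORT	(UINT32_C(1) << 7)
/* From the patch comment: DPC control is bit 7 of the _OSC control field. */
#define MODEL_OSC_DPC_CONTROL	(UINT32_C(1) << 7)

static_assert(MODEL_OSC_DPC_CONTROL == 0x80u, "DPC control is bit 7");

/*
 * Return true if the requested combination is consistent:
 * requesting DPC control requires advertising EDR support,
 * but advertising EDR support does not require DPC control.
 */
bool model_osc_request_valid(uint32_t support, uint32_t control)
{
	if ((control & MODEL_OSC_DPC_CONTROL) &&
	    !(support & MODEL_OSC_EDR_SUPPORT))
		return false;

	return true;
}
```

The dependency only runs one way in this sketch, so EDR is not tied to
DPC any more than the rule above requires.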

> > > Although installing them when OS owns DPC should not affect
> > > anything, it also opens up an additional way for firmware to mess
> > > things up. For example, consider a case where firmware gives the OS
> > > control of DPC, but still sends EDR notifications to the OS. Although
> > > it's unrealistic, I am just giving an example.
>
> > Can you outline the problem that occurs in this scenario?  It
> > seems like the EDR notify handler could still work.  The OS can
> > access DPC at any time (not just during the EDR window).
>
> When the OS owns DPC and firmware sends an EDR event, it could create
> a race between the DPC interrupt handler and the EDR event handler,
> although from a hardware perspective it should not make a difference,
> since both code paths do the same thing.

Yes, that's true.  I think we should wait until there is a problem
here before doing anything.

> > > > I don't think we should even test pcie_ports_dpc_native here.  If we
> > > > told the platform we can handle EDR notifications, we should be
> > > > prepared to get them, regardless of whether the user booted with
> > > > "pcie_ports=dpc-native".
> > > As per the command line parameter documentation, setting
> > > pcie_ports=dpc-native means we will be using the native PCIe
> > > service for DPC.  So if DPC is handled by the OS, then per my
> > > argument above (EDR is only useful if DPC is handled by firmware),
> > > there is no use in installing the EDR notification handler.
> > > 
> > > https://github.com/torvalds/linux/blob/master/Documentation/admin-guide/kernel-parameters.txt#L3642
> > > 
> > > dpc-native - Use native PCIe service for DPC only.
> > It doesn't hurt anything to install a notify handler that never
> > receives a notification.  It might be an issue if we tell firmware
> > we're prepared for notifications but we don't install a handler.
> Agreed. Shall I send another version with this and the "static inline" fix?

No need.  Just take a look at my review/edr branch.  I intend to tweak
some commit logs and (maybe) make the "clear status" functions void
since there are only one or two minor uses of the return values.  But
it's pretty much what I hope to merge.

Bjorn
