All of lore.kernel.org
 help / color / mirror / Atom feed
From: Bjorn Helgaas <helgaas@kernel.org>
To: Lukas Wunner <lukas@wunner.de>
Cc: Kuppuswamy Sathyanarayanan 
	<sathyanarayanan.kuppuswamy@linux.intel.com>,
	Sinan Kaya <okaya@kernel.org>, Ashok Raj <ashok.raj@intel.com>,
	Keith Busch <kbusch@kernel.org>,
	Yicong Yang <yangyicong@hisilicon.com>,
	linux-pci@vger.kernel.org, Russell Currey <ruscur@russell.cc>,
	Oliver OHalloran <oohall@gmail.com>,
	Stuart Hayes <stuart.w.hayes@gmail.com>,
	Mika Westerberg <mika.westerberg@linux.intel.com>
Subject: Re: [PATCH 1/4] PCI: pciehp: Ignore Link Down/Up caused by error-induced Hot Reset
Date: Mon, 11 Oct 2021 12:25:15 -0500	[thread overview]
Message-ID: <20211011172515.GA1663896@bhelgaas> (raw)
In-Reply-To: <20211010091407.GA13471@wunner.de>

On Sun, Oct 10, 2021 at 11:14:07AM +0200, Lukas Wunner wrote:
> On Thu, Oct 07, 2021 at 06:00:20PM -0500, Bjorn Helgaas wrote:
> > On Sat, Jul 31, 2021 at 02:39:01PM +0200, Lukas Wunner wrote:
> > > The issue isn't DPC-specific, it also occurs when an error is handled by
> > > AER through aer_root_reset().  So while the issue was noticed only now,
> > > it's been around since 2006 when AER support was first introduced.
> > > 
> > > Fixes: 6c2b374d7485 ("PCI-Express AER implemetation: AER core and aerdriver")
> [...]
> > > Cc: stable@vger.kernel.org # v2.6.19+: ba952824e6c1: PCI/portdrv: Report reset for frozen channel
> > 
> > Have you tried backporting this to v2.6.19?  I bet it's tough.  Not
> > sure we should suggest that stable pick this up unless there's a
> > reasonable path to do that.
> 
> The oldest kernel.org stable release that still receives updates is v4.4.
> There may be older distribution kernels out there which continue to be
> supported.  We don't know for sure, so I think it's customary to tag
> the release that introduced an issue and leave it to the stable and
> distribution maintainers to choose what they backport.
> 
> It's true that patches often do not apply cleanly to older releases.
> There are some good folks who regularly take up the thankless task of
> backporting (Sudip Mukherjee to name but one), but I've also frequently
> backported patches myself where necessary.
> 
> 
> > > +config PCI_ERROR_RECOVERY
> > > +	def_bool PCIEAER || EEH
> > 
> > I'm having a hard time connecting this to the code.
> [...]
> > But this still seems like it's kind of dangling.  It's not obvious to
> > me why pciehp_slot_reset() should be inside that #ifdef.  Do we need
> > the #ifdef for a functional reason, or is it there because we know it
> > will never be called?  If the latter, I don't think the savings are
> > worth it.
> 
> The motivation for the #ifdef was merely to reduce code size if neither
> of PCIEAER or EEH is enabled.  Happy to remove it.

That'd be great.

> I have a different question though.  We've often discussed deprecating
> portdrv and moving port services to the core instead.  I've stumbled
> across commit a39bd851dccf ("PCI/PM: Clear PCIe PME Status bit in core,
> not PCIe port driver"), wherein you moved pcie_pme_root_status_cleanup()
> from portdrv to the core, which allowed you to drop the ->resume_noirq
> callback from portdrv.
> 
> I've been thinking what moving port services to the core would look like
> and what your vision for that might be.  If that commit is any indication,
> it seems you'd probably prefer that pciehp_slot_reset() (as introduced in
> the present patch) is called directly from report_slot_reset() in
> drivers/pci/pcie/err.c.
> 
> That would eliminate much of the plumbing contained in the present patch
> to allow each port service driver to define a ->slot_reset callback and
> iterate over those callbacks.  pciehp is probably the only port service
> which requires special handling upon ->slot_reset and the same goes for
> a lot of the error handling and power management callbacks defined by
> port services.  So all that plumbing is probably just a very roundabout
> way of doing things.
> 
> When calling into port service code from core code, one needs to find
> the port service's driver data.  For now we can resort to
> pcie_port_find_device(), but I imagine once we move all port services
> to the core, we'll be able to access their data directly from
> struct pci_dev, e.g. via pdev->hotplug or pdev->pcie->hotplug pointers.

We managed to get rid of pcie_port_find_service() recently, and I'd
love to get rid of pcie_port_find_device() as well.  I'd like to get
rid of struct pcie_device, too, but that's visible via sysfs and would
be harder.  But maybe we could at least stop using it internally.

I don't know exactly what the ->slot_reset() path would look like, and
I'm not suggesting you need to change that for this patch.

Bjorn

  reply	other threads:[~2021-10-11 17:25 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-07-31 12:39 [PATCH 0/4] pciehp error recovery fix + cleanups Lukas Wunner
2021-07-31 12:39 ` [PATCH 1/4] PCI: pciehp: Ignore Link Down/Up caused by error-induced Hot Reset Lukas Wunner
2021-09-29 20:40   ` stuart hayes
2021-10-07 23:00   ` Bjorn Helgaas
2021-10-10  9:14     ` Lukas Wunner
2021-10-11 17:25       ` Bjorn Helgaas [this message]
2021-07-31 12:39 ` [PATCH 2/4] PCI/portdrv: Remove unused resume err_handler Lukas Wunner
2021-07-31 12:39 ` [PATCH 3/4] PCI/portdrv: Remove unused pcie_port_bus_{,un}register() declarations Lukas Wunner
2021-07-31 12:39 ` [PATCH 4/4] PCI/ERR: Reduce compile time for CONFIG_PCIEAER=n Lukas Wunner
2021-10-15 19:29 ` [PATCH 0/4] pciehp error recovery fix + cleanups Bjorn Helgaas
2021-10-16  9:06   ` Lukas Wunner
2021-10-16 14:23     ` Bjorn Helgaas

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20211011172515.GA1663896@bhelgaas \
    --to=helgaas@kernel.org \
    --cc=ashok.raj@intel.com \
    --cc=kbusch@kernel.org \
    --cc=linux-pci@vger.kernel.org \
    --cc=lukas@wunner.de \
    --cc=mika.westerberg@linux.intel.com \
    --cc=okaya@kernel.org \
    --cc=oohall@gmail.com \
    --cc=ruscur@russell.cc \
    --cc=sathyanarayanan.kuppuswamy@linux.intel.com \
    --cc=stuart.w.hayes@gmail.com \
    --cc=yangyicong@hisilicon.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.