linux-pci.vger.kernel.org archive mirror
From: Lukas Wunner <lukas@wunner.de>
To: Dan Williams <dan.j.williams@intel.com>
Cc: Kuppuswamy Sathyanarayanan 
	<sathyanarayanan.kuppuswamy@linux.intel.com>,
	Bjorn Helgaas <bhelgaas@google.com>,
	Linux PCI <linux-pci@vger.kernel.org>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	"Raj, Ashok" <ashok.raj@intel.com>,
	Keith Busch <kbusch@kernel.org>,
	knsathya@kernel.org, Sinan Kaya <okaya@kernel.org>
Subject: Re: [PATCH v2 1/1] PCI: pciehp: Skip DLLSC handling if DPC is triggered
Date: Wed, 17 Mar 2021 06:31:14 +0100
Message-ID: <20210317053114.GA32370@wunner.de>
In-Reply-To: <CAPcyv4jxTcUEgcfPRckHqrUPy8gR7ZJsxDaeU__pSq6PqJERAQ@mail.gmail.com>

On Tue, Mar 16, 2021 at 10:08:31PM -0700, Dan Williams wrote:
> On Tue, Mar 16, 2021 at 9:14 PM Lukas Wunner <lukas@wunner.de> wrote:
> >
> > On Fri, Mar 12, 2021 at 07:32:08PM -0800, sathyanarayanan.kuppuswamy@linux.intel.com wrote:
> > > +     if ((events == PCI_EXP_SLTSTA_DLLSC) && is_dpc_reset_active(pdev)) {
> > > +             ctrl_info(ctrl, "Slot(%s): DLLSC event(DPC), skipped\n",
> > > +                       slot_name(ctrl));
> > > +             ret = IRQ_HANDLED;
> > > +             goto out;
> > > +     }
> >
> > Two problems here:
> >
> > (1) If recovery fails, the link will *remain* down, so there'll be
> >     no Link Up event.  You've filtered the Link Down event, thus the
> >     slot will remain in ON_STATE even though the device in the slot is
> >     no longer accessible.  That's not good, the slot should be brought
> >     down in this case.
> 
> Can you elaborate on why that is "not good" from the end user
> perspective? From a driver perspective the device driver context is
> lost and the card needs servicing. The service event starts a new
> cycle of slot-attention being triggered and that syncs the slot-down
> state at that time.

All of pciehp's code assumes that if the link is down, the slot must be
off.  A slot which is in ON_STATE for a prolonged period of time even
though the link is down is an oddity the code doesn't account for.

If the link goes down, the slot should be brought into OFF_STATE.
(It's okay though to delay bringdown until DPC recovery has completed
unsuccessfully, which is what the patch I'm proposing does.)
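
The rule above (defer the decision until DPC recovery has finished, then
bring the slot down unless recovery succeeded) can be sketched as a small
predicate.  This is only an illustration under stated assumptions, not the
proposed patch: enum dpc_outcome and may_ignore_link_down() are
hypothetical names that do not exist in pciehp.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical outcome of DPC handling; not a kernel type. */
enum dpc_outcome { DPC_NOT_TRIGGERED, DPC_RECOVERED, DPC_FAILED };

/*
 * Hypothetical helper, meant to be called only once DPC recovery (if any)
 * has finished.  Returns true if the Link Down may be ignored and the slot
 * left in ON_STATE, false if the slot must be brought into OFF_STATE.
 */
static bool may_ignore_link_down(enum dpc_outcome dpc, bool link_active)
{
	assert(dpc == DPC_NOT_TRIGGERED || dpc == DPC_RECOVERED ||
	       dpc == DPC_FAILED);

	/* Only a successful recovery with the link back up keeps the slot on. */
	return dpc == DPC_RECOVERED && link_active;
}
```

With this model, a failed recovery always ends in bringdown, so the slot
never lingers in ON_STATE while the link is down.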

I don't understand what you mean by "service event".  Someone unplugging
and replugging the NVMe drive?


> > (2) If recovery succeeds, there's a race where pciehp may call
> >     is_dpc_reset_active() *after* dpc_reset_link() has finished.
> >     So both the DPC Trigger Status bit as well as pdev->dpc_reset_active
> >     will be cleared.  Thus, the Link Up event is not filtered by pciehp
> >     and the slot is brought down and back up even though DPC recovery
> >     was successful, which seems undesirable.
> 
> The hotplug driver never saw the Link Down, so what does it do when
> the slot transitions from Link Up to Link Up? Do you mean the Link
> Down might fire after the dpc recovery has completed if the hotplug
> notification was delayed?

If the Link Down is filtered and the Link Up is not, pciehp will
bring down the slot and then bring it back up.  That's because pciehp
can't really tell whether a DLLSC event is Link Up or Link Down.

It just knows that the link was previously up, is now up again,
but must have been down intermittently, so transactions to the
device in the slot may have been lost and the slot is therefore
brought down for safety.  Because the link is up, it is then
brought back up.
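
As a rough illustration of the behavior described above, here is a toy
model of a handler that sees only "the link changed" plus the current link
status.  It is a simplified sketch, not the pciehp code: struct slot, its
counters and handle_dllsc() are hypothetical names made up for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum slot_state { OFF_STATE, ON_STATE };

/* Hypothetical toy slot; the counters exist only to observe the model. */
struct slot {
	enum slot_state state;
	int bringdowns;		/* times the slot was brought down */
	int bringups;		/* times the slot was brought up */
};

/*
 * Toy DLLSC handler.  It cannot tell Link Up from Link Down; it only
 * knows the previous slot state and whether the link is active now.
 */
static void handle_dllsc(struct slot *s, bool link_active)
{
	assert(s != NULL);

	if (s->state == ON_STATE) {
		/* Link was up before, so it must have been down in between. */
		s->state = OFF_STATE;
		s->bringdowns++;
	}
	if (link_active) {
		/* Link is up now: bring the slot (back) up. */
		s->state = ON_STATE;
		s->bringups++;
	}
}
```

Feeding this model an unfiltered event while the slot is in ON_STATE and
the link is active yields one bringdown followed by one bringup, which is
the bounce that occurs when only the Link Down was filtered.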

Thanks,

Lukas

