All of lore.kernel.org
 help / color / mirror / Atom feed
From: Bharat Bhushan <bharat.bhushan@nxp.com>
To: Alex Williamson <alex.williamson@redhat.com>,
	Bjorn Helgaas <helgaas@kernel.org>
Cc: "linux-pci@vger.kernel.org" <linux-pci@vger.kernel.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"bharatb.yadav@gmail.com" <bharatb.yadav@gmail.com>,
	David Daney <david.daney@cavium.com>,
	Jan Glauber <jglauber@cavium.com>,
	Maik Broemme <mbroemme@libmpq.org>,
	Chris Blake <chrisrblake93@gmail.com>
Subject: RE: [PATCH] PCI: Mark NXP LS1088 to avoid bus reset bus
Date: Wed, 28 Nov 2018 04:31:56 +0000	[thread overview]
Message-ID: <VI1PR04MB48457E23AA3354316C7782D69AD10@VI1PR04MB4845.eurprd04.prod.outlook.com> (raw)
In-Reply-To: <20181127090830.084fedf1@x1.home>

Hi,

> -----Original Message-----
> From: Alex Williamson <alex.williamson@redhat.com>
> Sent: Tuesday, November 27, 2018 9:39 PM
> To: Bjorn Helgaas <helgaas@kernel.org>
> Cc: Bharat Bhushan <bharat.bhushan@nxp.com>; linux-pci@vger.kernel.org;
> linux-kernel@vger.kernel.org; bharatb.yadav@gmail.com; David Daney
> <david.daney@cavium.com>; Jan Glauber <jglauber@cavium.com>; Maik
> Broemme <mbroemme@libmpq.org>; Chris Blake
> <chrisrblake93@gmail.com>
> Subject: Re: [PATCH] PCI: Mark NXP LS1088 to avoid bus reset bus
> 
> On Tue, 27 Nov 2018 09:33:56 -0600
> Bjorn Helgaas <helgaas@kernel.org> wrote:
> 
> > [+cc David, Jan, Alex, Maik, Chris]
> >
> > On Tue, Nov 27, 2018 at 08:46:33AM +0000, Bharat Bhushan wrote:
> > > NXP (Freescale Vendor ID) LS1088 chips do not behave correctly after
> > > bus reset with e1000e. Link state of device does not comes UP and so
> > > config space never accessible again.
> >
> > Previous similar commits:
> >
> >   822155100e58 ("PCI: Mark Cavium CN8xxx to avoid bus reset")
> >   8e2e03179923 ("PCI: Mark Atheros AR9580 to avoid bus reset")
> >   9ac0108c2bac ("PCI: Mark Atheros AR9485 and QCA9882 to avoid bus
> reset")
> >   c3e59ee4e766 ("PCI: Mark Atheros AR93xx to avoid bus reset")
> >
> > 1) Please make your subject match (remove the spurious "bus" at the
> > end)

Will correct, added by mistake 

> >
> > 2) This should probably be marked for stable (v3.14 and later, since
> > the quirk itself appeared in v3.19 and marked for v3.14 and later
> > stable kernels).  Maybe even mark it as "Fixes: c3e59ee4e766..." to
> > connect it.

Ok,

> >
> > 3) The 1957:80c0 PCI ID doesn't appear in
> >
> https://emea01.safelinks.protection.outlook.com/?url=https%3A%2F%2Fpci
> -
> ids.ucw.cz%2F&amp;data=02%7C01%7Cbharat.bhushan%40nxp.com%7C296
> 02a2efa584249221808d65482945b%7C686ea1d3bc2b4c6fa92cd99c5c301635%7
> C0%7C0%7C636789317139032063&amp;sdata=3jkRMa1NljSCp%2BvZP0kgz7D
> PWPJZH8d7JXhCE5vCCMk%3D&amp;reserved=0; can you add it?
> >

Yes, I will add

> > 4) Is there a hardware erratum for this?  If so, please include the
> > URL here.

No h/w errata as of now.

> >
> > 5) Can you reproduce the problem using the same endpoint (e1000e) on a
> > different system with a different bridge?

I have multiple LS1088 boards and I can observe problem with all LS1088 boards.
While when I  uses same PCI device on other NXP board (LS2088) then it works fine.

> >
> > 6) Have you looked at this with a PCIe analyzer?  It would be very
> > interesting to compare the boot-time or system reboot path with the
> > individual bus reset path you're fixing.

I have not used PCIe analyzer, 

> >
> > Since there are several similar reports and they sometimes involve the
> > same devices (both your patch and 822155100e58 mention e1000e), I'm a
> > little suspicious that we're doing something wrong in the bus reset
> > path.
> 
> I agree, entirely excluding bus resets is not something to be taken lightly.  It's
> less than ideal for an endpoint and a fairly major functional gap for a
> downstream port.  It should really be considered a last resort.
> 
> > I think bus reset uses Secondary Bus Reset in the Bridge Control
> > register.  That's a generic mechanism that I would expect to be pretty
> > well-tested.  I suspect the BIOS probably uses it in the reboot path,
> > and the device probably works after that.
> >
> > So I wonder if the Linux delay isn't quite long enough, or our first
> > access to the device isn't quite right, e.g., maybe there's some issue
> > with the bus/device number capture (PCIe r4.0, sec 2.2.6.2).
> 
> Tweaking the delay would be a reasonable solution, though we are seeing
> some issues where users with lots of assigned devices that require bus
> resets experience long delays as vfio file descriptors are closed sequentially
> on exit.

In pci_reset_secondary_bus() I have tried to increase the delay after reset but not helped. 
Do I need to add delay at some other place as well? 

Thanks
-Bharat

>  So perhaps we could flag downstream ports requiring an extra delay,
> if that becomes a solution.  Your mention of the bus/device number also
> reminds me of the issue we saw on Threadripper where there were patches
> proposed to re-write the secondary and subordinate bus numbers after
> reset.  AMD was able to resolve that in a firmware update, but there could
> be something similar occurring here. Thanks,
> 
> Alex
> 
> > > Signed-off-by: Bharat Bhushan <Bharat.Bhushan@nxp.com>
> > > ---
> > >  drivers/pci/quirks.c | 7 +++++++
> > >  1 file changed, 7 insertions(+)
> > >
> > > diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c index
> > > 4700d24e5d55..b9ae4e9f101a 100644
> > > --- a/drivers/pci/quirks.c
> > > +++ b/drivers/pci/quirks.c
> > > @@ -3391,6 +3391,13 @@
> DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_ATHEROS, 0x0033,
> quirk_no_bus_reset);
> > >   */
> > >  DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_CAVIUM, 0xa100,
> > > quirk_no_bus_reset);
> > >
> > > +/*
> > > + * NXP (Freescale Vendor ID) LS1088 chips do not behave correctly
> > > +after
> > > + * bus reset. Link state of device does not comes UP and so config
> > > +space
> > > + * never accessible again.
> > > + */
> > > +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_FREESCALE, 0x80c0,
> > > +quirk_no_bus_reset);
> > > +
> > >  static void quirk_no_pm_reset(struct pci_dev *dev)  {
> > >  	/*
> > > --
> > > 2.19.1
> > >


  reply	other threads:[~2018-11-28  4:32 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-11-27  8:46 [PATCH] PCI: Mark NXP LS1088 to avoid bus reset bus Bharat Bhushan
2018-11-27 15:33 ` Bjorn Helgaas
2018-11-27 16:08   ` Alex Williamson
2018-11-28  4:31     ` Bharat Bhushan [this message]
2018-11-28 20:15       ` Bjorn Helgaas
2018-11-30  5:29         ` Bharat Bhushan
2018-11-30  5:56           ` Alex Williamson
2018-11-30  6:24             ` Bharat Bhushan
2018-11-30 16:19               ` Alex Williamson
2018-12-03 11:01                 ` Bharat Bhushan

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=VI1PR04MB48457E23AA3354316C7782D69AD10@VI1PR04MB4845.eurprd04.prod.outlook.com \
    --to=bharat.bhushan@nxp.com \
    --cc=alex.williamson@redhat.com \
    --cc=bharatb.yadav@gmail.com \
    --cc=chrisrblake93@gmail.com \
    --cc=david.daney@cavium.com \
    --cc=helgaas@kernel.org \
    --cc=jglauber@cavium.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-pci@vger.kernel.org \
    --cc=mbroemme@libmpq.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.