From: Bjorn Helgaas <bhelgaas@google.com>
To: Xiangliang Yu <yuxiangl@marvell.com>
Cc: yxlraid <yxlraid@gmail.com>,
	"linux-pci@vger.kernel.org" <linux-pci@vger.kernel.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH 2/2] PCI: fix system hang issue of Marvell SATA host controller
Date: Fri, 8 Mar 2013 10:01:03 -0700
Message-ID: <CAErSpo6ascpN=bBUUzkk5W3-Wee_1ZztAyTwnKd-tq=L_9Q3mw@mail.gmail.com>
In-Reply-To: <F766E4F80769BD478052FB6533FA745D25F436861C@SC-VEXCH4.marvell.com>

On Thu, Mar 7, 2013 at 11:51 PM, Xiangliang Yu <yuxiangl@marvell.com> wrote:
> Hi, Bjorn
>
>> >> > Fix system hang issue: if the resource files of BAR0 ~ BAR4 are
>> >> > accessed first, the system will hang after executing the lspci command
>> >>
>> >> This needs more explanation.  We've already read the BARs by the time
>> >> header quirks are run, so apparently it's not just the mere act of
>> >> accessing a BAR that causes a hang.
>> >>
>> >> We need to know exactly what's going on here.  For example, do BARs
>> >> 0-4 exist?  Does the device decode accesses to the regions described
>> >> by the BARs?  The PCI core has to know what resources the device uses,
>> >> so if the device decodes accesses, we can't just throw away the
>> >> start/end information.
>> > BARs 0-4 exist and the PCI device has I/O space enabled, but if a user
>> > accesses the region files with the udevadm command and its info
>> > parameter, the system will hang.
>> > Like this: udevadm info --attribute-walk
>> > --path=/sys/device/pci-device/000:*.
>> > Because the device is just an AHCI host controller, it doesn't need the
>> > BAR0 ~ 4 region files.
>> > Is my explanation ok for the patch?
>>
>> No, I still don't know what causes the hang; I only know that udevadm
>> can trigger it.  I don't want to just paper over the problem until we
>> know what the root cause is.
>>
>> Does "lspci -H1 -vv" also cause a hang?  What about "setpci -s<dev>
>> BASE_ADDRESS_0"?  "setpci -H1 -s<dev> BASE_ADDRESS_0"?
> The commands are ok because they can't find the device after the I/O
> ports have been accessed.
> The root cause is that accessing the I/O ports makes the chip go bad. So
> the point of the patch is to not export the I/O access capability.

Ah, so the problem is not with accessing the BAR in config space.  The
problem is with accessing the I/O port space mapped by the BAR.  Is
that right?

Does "udevadm info --attribute-walk" really access the device address
space mapped by the BARs?  That seems surprising to me, and I don't
see any indication of it when I try it on an AHCI device on my system:

# udevadm info --attribute-walk --path=/sys/devices/pci0000:00/0000:00:1f.2

Udevadm info starts with the device specified by the devpath and then
walks up the chain of parent devices. It prints for every device
found, all possible attributes in the udev rules key format.
A rule to match, can be composed by the attributes of the device
and the attributes from one single parent device.

  looking at device '/devices/pci0000:00/0000:00:1f.2':
    KERNEL=="0000:00:1f.2"
    SUBSYSTEM=="pci"
    DRIVER=="ahci"
    ATTR{irq}=="40"
    ATTR{subsystem_vendor}=="0x17aa"
    ATTR{broken_parity_status}=="0"
    ATTR{class}=="0x010601"
    ATTR{consistent_dma_mask_bits}=="64"
    ATTR{dma_mask_bits}=="64"
    ATTR{local_cpus}=="00000000,00000000,00000000,00000000,00000000,00000000,00000000,0000000f"
    ATTR{device}=="0x3b2f"
    ATTR{enable}=="1"
    ATTR{msi_bus}==""
    ATTR{local_cpulist}=="0-3"
    ATTR{vendor}=="0x8086"
    ATTR{subsystem_device}=="0x2168"
    ATTR{numa_node}=="-1"

  looking at parent device '/devices/pci0000:00':
    KERNELS=="pci0000:00"
    SUBSYSTEMS==""
    DRIVERS==""

>> >> > ---
>> >> >  drivers/pci/quirks.c |   15 +++++++++++++++
>> >> >  1 files changed, 15 insertions(+), 0 deletions(-)
>> >> >
>> >> > diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
>> >> > index 0369fb6..d49f8dc 100644
>> >> > --- a/drivers/pci/quirks.c
>> >> > +++ b/drivers/pci/quirks.c
>> >> > @@ -44,6 +44,21 @@ static void quirk_mmio_always_on(struct pci_dev *dev)
>> >> >  DECLARE_PCI_FIXUP_CLASS_EARLY(PCI_ANY_ID, PCI_ANY_ID,
>> >> >                                 PCI_CLASS_BRIDGE_HOST, 8, quirk_mmio_always_on);
>> >> >
>> >> > +/* The BAR0 ~ BAR4 of Marvell 9125 device can't be accessed
>> >> > +*  by IO resource file, and need to skip the files
>> >> > +*/
>> >> > +static void quirk_marvell_mask_bar(struct pci_dev *dev)
>> >> > +{
>> >> > +       int i;
>> >> > +
>> >> > +       for (i = 0; i < 5; i++)
>> >> > +               if (dev->resource[i].start)
>> >> > +                       dev->resource[i].start =
>> >> > +                               dev->resource[i].end = 0;
>> >> > +}
>> >> > +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_MARVELL_EXT, 0x9125,
>> >> > +                               quirk_marvell_mask_bar);
>> >> > +
>> >> >  /* The Mellanox Tavor device gives false positive parity errors
>> >> >   * Mark this device with a broken_parity_status, to allow
>> >> >   * PCI scanning code to "skip" this now blacklisted device.
>> >> > --
>> >> > 1.7.5.4
>> >> >
