linux-pci.vger.kernel.org archive mirror
From: Bjorn Helgaas <helgaas@kernel.org>
To: Jianmin Lv <lvjianmin@loongson.cn>
Cc: "Huacai Chen" <chenhuacai@loongson.cn>,
	"Bjorn Helgaas" <bhelgaas@google.com>,
	"Lorenzo Pieralisi" <lorenzo.pieralisi@arm.com>,
	"Rob Herring" <robh@kernel.org>,
	"Krzysztof Wilczyński" <kw@linux.com>,
	linux-pci@vger.kernel.org, "Xuefeng Li" <lixuefeng@loongson.cn>,
	"Huacai Chen" <chenhuacai@gmail.com>,
	"Jiaxun Yang" <jiaxun.yang@flygoat.com>
Subject: Re: [PATCH V14 4/7] PCI: loongson: Don't access non-existant devices
Date: Tue, 28 Jun 2022 11:04:02 -0500
Message-ID: <20220628160402.GA1842175@bhelgaas>
In-Reply-To: <4dbddb05-a0b4-047e-8784-c89279221f20@loongson.cn>

On Tue, Jun 28, 2022 at 09:03:02PM +0800, Jianmin Lv wrote:
> On 2022/6/28 5:38 AM, Bjorn Helgaas wrote:
> > On Fri, Jun 17, 2022 at 03:43:27PM +0800, Huacai Chen wrote:
> > > On LS2K/LS7A, some non-existant devices don't return 0xffffffff when
> > > scanning. This is a hardware flaw but we can only avoid it by software
> > > now.
> > 
> > We should say what *does* happen if we do a config read to a device
> > that doesn't exist.  Machine check, hang, etc?
> 
> The device is a hidden device (for debug only) that should not be
> scanned. If it is scanned in an abnormal way, the machine hangs (one
> case in the LTP PCI test can trigger the issue, as explained
> below).

Reading the Vendor ID is the *normal* way to scan for a device.  It
seems that this hardware just hangs in some cases when the device
doesn't exist.

> > Generally speaking we only probe for functions > 0 if .0 is marked as
> > multi-function, so I guess this means 00:09.0 is marked as a
> > multi-function device, but config reads to 00:09.1 would fail?
> 
> Yes, definitely. Actually, 00:09.0 is a single-function device, so
> fun1 (09.1) will not be scanned (e.g. fun1 is not scanned during PCI
> enumeration at kernel boot).
> 
> But there is one situation: when running the LTP PCI test case on LS7A,
> 00:08.2 is a SATA controller (a valid device), and bus number 0 and
> devfn 0x42 are passed to the kernel API pci_scan_slot(), which has a
> clear note: devfn must have zero function. The function in the devfn
> passed in is not zero but 2, and then in pci_scan_slot():
> 
>         for (fn = next_fn(bus, dev, 0); fn > 0; fn = next_fn(bus, dev, fn)) {
>                 dev = pci_scan_single_device(bus, devfn + fn);
>                 ...
>         }
> 
> 08.2, 08.3, ... and 09.1 will be scanned one by one, so 09.1 (fun1) is
> scanned.

Does the "((bus == 0) && (device >= 9 && device <= 20) && (function > 0))"
test catch *all* devfns where the hang occurs?  I wouldn't want to
only avoid the ones that LTP happens to use.  If we did that, a future
LTP change could easily break things again.  But I assume you know
exactly what devices are present on the root bus.

> > > -	if (priv->data->flags & FLAG_DEV_FIX &&
> > > -			!pci_is_root_bus(bus) && PCI_SLOT(devfn) > 0)
> > > +	if ((priv->data->flags & FLAG_DEV_FIX) && bus->self) {
> > > +		if (!pci_is_root_bus(bus) && (device > 0))
> > > +			return NULL;
> > > +	}
> > > +
> > > +	/* Don't access non-existant devices */
> > > +	if (!pdev_is_existant(busnum, device, function))
> > >   		return NULL;
> > 
> > Is this a "forever" hardware bug that will never be fixed, or should
> > there be a flag like FLAG_DEV_FIX so we only do this on the broken
> > devices?
> 
> No, the next version of LS7A will fix it, so maybe we can use a
> FLAG_DEV_FIX-like flag to address it.

You should add the flag now instead of waiting for the new hardware.
Otherwise you may not remember or notice the need to make this
conditional on the hardware version, and you'll wonder why the fixed
hardware doesn't enumerate devices correctly.

Bjorn
