linux-kernel-mentees.lists.linuxfoundation.org archive mirror
 help / color / mirror / Atom feed
From: "Oliver O'Halloran" <oohall@gmail.com>
To: Arnd Bergmann <arnd@arndb.de>
Cc: linux-pci <linux-pci@vger.kernel.org>,
	Paul Mackerras <paulus@samba.org>,
	sparclinux <sparclinux@vger.kernel.org>,
	Toan Le <toan@os.amperecomputing.com>,
	Christoph Hellwig <hch@lst.de>,
	Marek Vasut <marek.vasut+renesas@gmail.com>,
	Rob Herring <robh@kernel.org>,
	Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>,
	Sagi Grimberg <sagi@grimberg.me>,
	Kevin Hilman <khilman@baylibre.com>,
	Russell King <linux@armlinux.org.uk>,
	Ley Foon Tan <ley.foon.tan@intel.com>,
	Greg Ungerer <gerg@linux-m68k.org>,
	Geert Uytterhoeven <geert@linux-m68k.org>,
	Jakub Kicinski <kuba@kernel.org>,
	Matt Turner <mattst88@gmail.com>,
	linux-kernel-mentees@lists.linuxfoundation.org,
	Guenter Roeck <linux@roeck-us.net>,
	Bjorn Helgaas <helgaas@kernel.org>, Ray Jui <rjui@broadcom.com>,
	linuxppc-dev <linuxppc-dev@lists.ozlabs.org>,
	Jens Axboe <axboe@fb.com>,
	Ivan Kokshaysky <ink@jurassic.park.msu.ru>,
	Keith Busch <kbusch@kernel.org>,
	Boris Ostrovsky <boris.ostrovsky@oracle.com>,
	Richard Henderson <rth@twiddle.net>,
	Juergen Gross <jgross@suse.com>,
	Thomas Bogendoerfer <tsbogend@alpha.franken.de>,
	Scott Branden <sbranden@broadcom.com>,
	Jingoo Han <jingoohan1@gmail.com>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	Philipp Zabel <p.zabel@pengutronix.de>,
	"Saheed O. Bolarinwa" <refactormyself@gmail.com>,
	Gustavo Pimentel <gustavo.pimentel@synopsys.com>,
	Bjorn Helgaas <bhelgaas@google.com>,
	"David S. Miller" <davem@davemloft.net>,
	Heiner Kallweit <hkallweit1@gmail.com>
Subject: Re: [Linux-kernel-mentees] [RFC PATCH 00/35] Move all PCIBIOS* definitions into arch/x86
Date: Wed, 15 Jul 2020 14:18:47 +1000	[thread overview]
Message-ID: <CAOSf1CEviMYySQhyQGks8hHjST-85wGpxEBasuxwSX_homBJ2A@mail.gmail.com> (raw)
In-Reply-To: <CAK8P3a3EZX8=649R9cYF6_=ivh1Xyrgsc5mUtS=d5yvQ3doZaQ@mail.gmail.com>

On Wed, Jul 15, 2020 at 8:03 AM Arnd Bergmann <arnd@arndb.de> wrote:
>
> - Most error checking is static: PCIBIOS_BAD_REGISTER_NUMBER
>   only happens if you pass an invalid register number, but most
>   callers pass a compile-time constant register number that is
>   known to be correct, or the driver would never work. Similarly,
>   PCIBIOS_DEVICE_NOT_FOUND wouldn't normally happen
>   since you pass a valid pci_device pointer that was already
>   probed.

Having some feedback about obvious programming errors is still useful
when doing driver development. Reporting those via printk() would
probably be more useful to those who care though.

> - config space accesses are very rare compared to memory
>   space access and on the hardware side the error handling
>   would be similar, but readl/writel don't return errors, they just
>   access wrong registers or return 0xffffffff.
>   arch/powerpc/kernel/eeh.c has a ton extra code written to
>   deal with it, but no other architectures do.

TBH the EEH MMIO hooks were probably a mistake to begin with. Errors
detected via MMIO are almost always asynchronous to the error itself
so you usually just wind up with a misleading stack trace rather than
any kind of useful synchronous error reporting. It seems like most
drivers don't bother checking for 0xFFs either and rely on the
asynchronous reporting via .error_detected() instead, so I have to
wonder what the point is. I've been thinking of removing the MMIO
hooks and using a background poller to check for errors on each PHB
periodically (assuming we don't have an EEH interrupt) instead. That
would remove the requirement for eeh_dev_check_failure() to be
interrupt safe too, so it might even let us fix all the godawful races
in EEH.

> - If we add code to detect errors in pci_read_config_*
>   and do some of the stuff from powerpc's
>   eeh_dev_check_failure(), we are more likely to catch
>   intermittent failures when drivers don't check, or bugs
>   with invalid arguments in device drivers than relying on
>   drivers to get their error handling right when those code
>   paths don't ever get covered in normal testing.

Adding some kind of error detection to the generic config accessors
wouldn't hurt, but detection is only half the problem. The main job of
eeh_dev_check_failure() is waking up the EEH recovery thread which
actually handles notifying drivers, device resets, etc and you'd want
something in the PCI core. Right now there's two implementations of
that reporting logic: one for EEH in arch/powerpc/eeh_driver.c and one
for AER/DPC in drivers/pci/pcie/err.c. I think the latter could be
moved into the PCI core easily enough since there's not much about it
that's really specific to PCIe. Ideally we could drop the EEH specific
one too, but I'm not sure how to implement that without it devolving
into callback spaghetti.

Oliver
_______________________________________________
Linux-kernel-mentees mailing list
Linux-kernel-mentees@lists.linuxfoundation.org
https://lists.linuxfoundation.org/mailman/listinfo/linux-kernel-mentees

  parent reply	other threads:[~2020-07-15  4:19 UTC|newest]

Thread overview: 72+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-07-13 12:22 [Linux-kernel-mentees] [RFC PATCH 00/35] Move all PCIBIOS* definitions into arch/x86 Saheed O. Bolarinwa
2020-07-13 12:22 ` [Linux-kernel-mentees] [RFC PATCH 01/35] xen-pciback: Change PCIBIOS_SUCCESSFUL to 0 Saheed O. Bolarinwa
2020-07-13 12:22 ` [Linux-kernel-mentees] [RFC PATCH 02/35] ssb: " Saheed O. Bolarinwa
2020-07-13 17:16   ` Larry Finger
2020-07-13 19:13     ` Saheed Bolarinwa
2020-07-13 18:29       ` Arnd Bergmann
2020-07-13 18:35       ` Larry Finger
2020-07-13 12:22 ` [Linux-kernel-mentees] [RFC PATCH 03/35] scsi: ipr: " Saheed O. Bolarinwa
2020-07-13 12:22 ` [Linux-kernel-mentees] [RFC PATCH 04/35] scsi: ipr: Tidy Success/Failure checks Saheed O. Bolarinwa
2020-07-13 12:22 ` [Linux-kernel-mentees] [RFC PATCH 05/35] PCI: Change PCIBIOS_SUCCESSFUL to 0 Saheed O. Bolarinwa
2020-07-13 12:22 ` [Linux-kernel-mentees] [RFC PATCH 06/35] PCI: Tidy Success/Failure checks Saheed O. Bolarinwa
2020-07-13 13:59   ` Gustavo Pimentel
2020-07-13 12:22 ` [Linux-kernel-mentees] [RFC PATCH 07/35] PCI: Change PCIBIOS_SUCCESSFUL to 0 Saheed O. Bolarinwa
2020-07-13 12:22 ` [Linux-kernel-mentees] [RFC PATCH 08/35] PCI: Tidy Success/Failure checks Saheed O. Bolarinwa
2020-07-13 12:22 ` [Linux-kernel-mentees] [RFC PATCH 09/35] nvme-pci: Change PCIBIOS_SUCCESSFUL to 0 Saheed O. Bolarinwa
2020-07-13 16:42   ` Rajashekar, Revanth
2020-07-13 18:24     ` Saheed Bolarinwa
2020-07-13 12:22 ` [Linux-kernel-mentees] [RFC PATCH 10/35] nvme-pci: Tidy Success/Failure checks Saheed O. Bolarinwa
2020-07-13 12:22 ` [Linux-kernel-mentees] [RFC PATCH 11/35] r8169: Change PCIBIOS_SUCCESSFUL to 0 Saheed O. Bolarinwa
2020-07-13 12:22 ` [Linux-kernel-mentees] [RFC PATCH 12/35] r8169: Tidy Success/Failure checks Saheed O. Bolarinwa
2020-07-13 13:45   ` Heiner Kallweit
2020-07-13 13:09     ` Saheed Bolarinwa
2020-07-13 12:22 ` [Linux-kernel-mentees] [RFC PATCH 13/35] cxl: Change PCIBIOS_SUCCESSFUL to 0 Saheed O. Bolarinwa
2020-07-13 12:22 ` [Linux-kernel-mentees] [RFC PATCH 14/35] i2c/busses: " Saheed O. Bolarinwa
2020-07-17 14:58   ` Jean Delvare
2020-07-18 19:05     ` Saheed Bolarinwa
2020-07-22 11:06       ` Wolfram Sang
2020-07-13 12:22 ` [Linux-kernel-mentees] [RFC PATCH 15/35] i2c/busses: Tidy Success/Failure checks Saheed O. Bolarinwa
2020-07-17 15:11   ` Jean Delvare
2020-07-13 12:22 ` [Linux-kernel-mentees] [RFC PATCH 16/35] hwmon: (sis5595) Change PCIBIOS_SUCCESSFUL to 0 Saheed O. Bolarinwa
2020-07-14  5:02   ` Guenter Roeck
2020-07-13 12:22 ` [Linux-kernel-mentees] [RFC PATCH 17/35] hwmon: (sis5595) Tidy Success/Failure checks Saheed O. Bolarinwa
2020-07-14  5:04   ` Guenter Roeck
2020-07-13 12:22 ` [Linux-kernel-mentees] [RFC PATCH 18/35] bcma: Change PCIBIOS_SUCCESSFUL to 0 Saheed O. Bolarinwa
2020-07-13 12:22 ` [Linux-kernel-mentees] [RFC PATCH 19/35] atm: " Saheed O. Bolarinwa
2020-07-13 12:22 ` [Linux-kernel-mentees] [RFC PATCH 20/35] atm: Tidy Success/Failure checks Saheed O. Bolarinwa
2020-07-13 12:22 ` [Linux-kernel-mentees] [RFC PATCH 21/35] atm: Fix Style ERROR- assignment in if condition Saheed O. Bolarinwa
2020-07-13 12:22 ` [Linux-kernel-mentees] [RFC PATCH 22/35] unicore32: Change PCIBIOS_SUCCESSFUL to 0 Saheed O. Bolarinwa
2020-07-13 12:22 ` [Linux-kernel-mentees] [RFC PATCH 23/35] sparc/PCI: " Saheed O. Bolarinwa
2020-07-13 12:22 ` [Linux-kernel-mentees] [RFC PATCH 24/35] sh: " Saheed O. Bolarinwa
2020-07-20 21:41   ` Rich Felker
2020-07-13 12:22 ` [Linux-kernel-mentees] [RFC PATCH 25/35] sh: Tidy Success/Failure checks Saheed O. Bolarinwa
2020-07-13 12:22 ` [Linux-kernel-mentees] [RFC PATCH 26/35] powerpc: Change PCIBIOS_SUCCESSFUL to 0 Saheed O. Bolarinwa
2020-07-13 12:22 ` [Linux-kernel-mentees] [RFC PATCH 27/35] powerpc: Tidy Success/Failure checks Saheed O. Bolarinwa
2020-07-13 12:22 ` [Linux-kernel-mentees] [RFC PATCH 28/35] mips: Change PCIBIOS_SUCCESSFUL to 0 Saheed O. Bolarinwa
2020-07-13 12:22 ` [Linux-kernel-mentees] [RFC PATCH 29/35] mips: Tidy Success/Failure checks Saheed O. Bolarinwa
2020-07-13 12:22 ` [Linux-kernel-mentees] [RFC PATCH 30/35] microblaze: Change PCIBIOS_SUCCESSFUL to 0 Saheed O. Bolarinwa
2020-07-13 12:22 ` [Linux-kernel-mentees] [RFC PATCH 31/35] m68k: " Saheed O. Bolarinwa
2020-07-13 12:22 ` [Linux-kernel-mentees] [RFC PATCH 32/35] arm/PCI: " Saheed O. Bolarinwa
2020-07-13 12:22 ` [Linux-kernel-mentees] [RFC PATCH 33/35] arm/PCI: Tidy Success/Failure checks Saheed O. Bolarinwa
2020-07-13 12:22 ` [Linux-kernel-mentees] [RFC PATCH 34/35] PCI: Change PCIBIOS_SUCCESSFUL to 0 Saheed O. Bolarinwa
2020-07-13 12:22 ` [Linux-kernel-mentees] [RFC PATCH 35/35] alpha: Tidy Success/Failure checks Saheed O. Bolarinwa
2020-07-13 15:08 ` [Linux-kernel-mentees] [RFC PATCH 00/35] Move all PCIBIOS* definitions into arch/x86 Arnd Bergmann
2020-07-14 18:45   ` Bjorn Helgaas
2020-07-14 21:02     ` Kjetil Oftedal
2020-07-15  2:14       ` Benjamin Herrenschmidt
2020-07-14 22:01     ` Arnd Bergmann
2020-07-14 23:46       ` Bjorn Helgaas
2020-07-15  2:19         ` Benjamin Herrenschmidt
2020-07-15  6:47         ` Arnd Bergmann
2020-07-15 14:24           ` David Laight
2020-07-15 22:01             ` Bjorn Helgaas
2020-07-16  8:18               ` David Laight
2020-07-15 22:26           ` Benjamin Herrenschmidt
2020-07-15  4:18       ` Oliver O'Halloran [this message]
2020-07-15 14:38         ` David Laight
2020-07-15 22:12           ` Bjorn Helgaas
2020-07-15 22:49             ` Benjamin Herrenschmidt
2020-07-16  8:07               ` David Laight
2020-07-14 23:14     ` Rob Herring
2020-07-15  2:12     ` Benjamin Herrenschmidt
2020-07-13 22:01 ` Bjorn Helgaas

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CAOSf1CEviMYySQhyQGks8hHjST-85wGpxEBasuxwSX_homBJ2A@mail.gmail.com \
    --to=oohall@gmail.com \
    --cc=arnd@arndb.de \
    --cc=axboe@fb.com \
    --cc=bhelgaas@google.com \
    --cc=boris.ostrovsky@oracle.com \
    --cc=davem@davemloft.net \
    --cc=geert@linux-m68k.org \
    --cc=gerg@linux-m68k.org \
    --cc=gustavo.pimentel@synopsys.com \
    --cc=hch@lst.de \
    --cc=helgaas@kernel.org \
    --cc=hkallweit1@gmail.com \
    --cc=ink@jurassic.park.msu.ru \
    --cc=jgross@suse.com \
    --cc=jingoohan1@gmail.com \
    --cc=kbusch@kernel.org \
    --cc=khilman@baylibre.com \
    --cc=kuba@kernel.org \
    --cc=ley.foon.tan@intel.com \
    --cc=linux-kernel-mentees@lists.linuxfoundation.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-pci@vger.kernel.org \
    --cc=linux@armlinux.org.uk \
    --cc=linux@roeck-us.net \
    --cc=linuxppc-dev@lists.ozlabs.org \
    --cc=lorenzo.pieralisi@arm.com \
    --cc=marek.vasut+renesas@gmail.com \
    --cc=mattst88@gmail.com \
    --cc=p.zabel@pengutronix.de \
    --cc=paulus@samba.org \
    --cc=refactormyself@gmail.com \
    --cc=rjui@broadcom.com \
    --cc=robh@kernel.org \
    --cc=rth@twiddle.net \
    --cc=sagi@grimberg.me \
    --cc=sbranden@broadcom.com \
    --cc=sparclinux@vger.kernel.org \
    --cc=toan@os.amperecomputing.com \
    --cc=tsbogend@alpha.franken.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).