linux-kernel.vger.kernel.org archive mirror
* [PATCH 00/15] net: taint when the device driver firmware crashes
@ 2020-05-09  4:35 Luis Chamberlain
  2020-05-09  4:35 ` [PATCH 01/15] taint: add module firmware crash taint support Luis Chamberlain
                   ` (16 more replies)
  0 siblings, 17 replies; 34+ messages in thread
From: Luis Chamberlain @ 2020-05-09  4:35 UTC (permalink / raw)
  To: jeyu
  Cc: akpm, arnd, rostedt, mingo, aquini, cai, dyoung, bhe, peterz,
	tglx, gpiccoli, pmladek, tiwai, schlad, andriy.shevchenko,
	keescook, daniel.vetter, will, mchehab+samsung, kvalo, davem,
	netdev, linux-kernel, Luis Chamberlain

Device driver firmware can crash, and sometimes this can leave your
system in a state which makes the device or subsystem completely
useless. Detecting this by inspecting /proc/sys/kernel/tainted is much
easier than scraping driver-specific magic words from the kernel log.
So this series provides a helper which lets drivers annotate firmware
crashes, and shows how to use it in networking drivers.
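
To illustrate the userspace side of this, here is a minimal sketch of
checking one taint bit. It only assumes that /proc/sys/kernel/tainted
holds a decimal bitmask. The helper names are made up for this example
and are not part of the series, and the bit index of the firmware-crash
taint is not stated in this cover letter, so the caller has to supply it.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>

/*
 * Parse the decimal bitmask text found in /proc/sys/kernel/tainted.
 * Returns 0 on success, -1 if the text does not start with a number.
 * (Hypothetical helper name, for illustration only.)
 */
static int parse_taint_mask(const char *text, unsigned long *mask)
{
	char *end;

	errno = 0;
	*mask = strtoul(text, &end, 10);
	if (errno || end == text)
		return -1;
	return 0;
}

/* Test a single taint bit; out-of-range bit indexes are never set. */
static bool taint_bit_is_set(unsigned long mask, unsigned int bit)
{
	if (bit >= sizeof(mask) * 8)
		return false;
	return (mask >> bit) & 1UL;
}

/* Read and parse the taint mask from a file such as /proc/sys/kernel/tainted. */
static int read_taint_mask(const char *path, unsigned long *mask)
{
	char buf[32];
	FILE *f = fopen(path, "r");

	if (!f)
		return -1;
	if (!fgets(buf, sizeof(buf), f)) {
		fclose(f);
		return -1;
	}
	fclose(f);
	return parse_taint_mask(buf, mask);
}
```

A monitoring tool would call read_taint_mask("/proc/sys/kernel/tainted", &mask)
and then taint_bit_is_set(mask, bit), taking the bit index from the
headers of a kernel that carries this series.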

My methodology for finding where firmware crashes is to git grep for
"crash" and then study the code to see if this is indeed a place where
the firmware crashes. In some places this is quite obvious.

I'm starting off with networking. If this gets merged, I can focus on
the other drivers later; I already have some work done on other
subsystems.

Review, flames, etc. are greatly appreciated.

This work, only on networking drivers, can be found on my git tree as well:

https://git.kernel.org/pub/scm/linux/kernel/git/mcgrof/linux-next.git/log/?h=20200509-taint-firmware-net

Luis Chamberlain (15):
  taint: add module firmware crash taint support
  ethernet/839: use new module_firmware_crashed()
  bnx2x: use new module_firmware_crashed()
  bnxt: use new module_firmware_crashed()
  bna: use new module_firmware_crashed()
  liquidio: use new module_firmware_crashed()
  cxgb4: use new module_firmware_crashed()
  ehea: use new module_firmware_crashed()
  qed: use new module_firmware_crashed()
  soc: qcom: ipa: use new module_firmware_crashed()
  wimax/i2400m: use new module_firmware_crashed()
  ath10k: use new module_firmware_crashed()
  ath6kl: use new module_firmware_crashed()
  brcm80211: use new module_firmware_crashed()
  mwl8k: use new module_firmware_crashed()

 drivers/net/ethernet/8390/axnet_cs.c                |  4 +++-
 drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c    |  1 +
 drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c   |  1 +
 drivers/net/ethernet/brocade/bna/bfa_ioc.c          |  1 +
 drivers/net/ethernet/cavium/liquidio/lio_main.c     |  1 +
 drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c     |  1 +
 drivers/net/ethernet/ibm/ehea/ehea_main.c           |  2 ++
 drivers/net/ethernet/qlogic/qed/qed_debug.c         |  3 +++
 drivers/net/ipa/ipa_modem.c                         |  1 +
 drivers/net/wimax/i2400m/rx.c                       |  1 +
 drivers/net/wireless/ath/ath10k/pci.c               |  2 ++
 drivers/net/wireless/ath/ath10k/sdio.c              |  2 ++
 drivers/net/wireless/ath/ath10k/snoc.c              |  1 +
 drivers/net/wireless/ath/ath6kl/hif.c               |  1 +
 .../net/wireless/broadcom/brcm80211/brcmfmac/core.c |  1 +
 drivers/net/wireless/marvell/mwl8k.c                |  1 +
 include/linux/kernel.h                              |  3 ++-
 include/linux/module.h                              | 13 +++++++++++++
 include/trace/events/module.h                       |  3 ++-
 kernel/module.c                                     |  5 +++--
 kernel/panic.c                                      |  1 +
 21 files changed, 44 insertions(+), 5 deletions(-)

-- 
2.25.1


* RE: [PATCH 06/15] liquidio: use new module_firmware_crashed()
@ 2020-05-11 16:33 Derek Chickles
  0 siblings, 0 replies; 34+ messages in thread
From: Derek Chickles @ 2020-05-11 16:33 UTC (permalink / raw)
  To: Luis Chamberlain, jeyu
  Cc: akpm, arnd, rostedt, mingo, aquini, cai, dyoung, bhe, peterz,
	tglx, gpiccoli, pmladek, tiwai, schlad, andriy.shevchenko,
	keescook, daniel.vetter, will, mchehab+samsung, kvalo, davem,
	netdev, linux-kernel, Satananda Burla, Felix Manlunas

> From: Luis Chamberlain <mcgrof@kernel.org>
> Sent: Friday, May 8, 2020 9:36 PM
> To: jeyu@kernel.org
> Cc: akpm@linux-foundation.org; arnd@arndb.de; rostedt@goodmis.org;
> mingo@redhat.com; aquini@redhat.com; cai@lca.pw; dyoung@redhat.com;
> bhe@redhat.com; peterz@infradead.org; tglx@linutronix.de;
> gpiccoli@canonical.com; pmladek@suse.com; tiwai@suse.de;
> schlad@suse.de; andriy.shevchenko@linux.intel.com;
> keescook@chromium.org; daniel.vetter@ffwll.ch; will@kernel.org;
> mchehab+samsung@kernel.org; kvalo@codeaurora.org;
> davem@davemloft.net; netdev@vger.kernel.org; linux-
> kernel@vger.kernel.org; Luis Chamberlain <mcgrof@kernel.org>; Derek
> Chickles <dchickles@marvell.com>; Satananda Burla <sburla@marvell.com>;
> Felix Manlunas <fmanlunas@marvell.com>
> Subject: [PATCH 06/15] liquidio: use new module_firmware_crashed()
> 
> ----------------------------------------------------------------------
> This makes use of the new module_firmware_crashed() to help annotate
> when firmware for device drivers crashes. When firmware crashes, devices
> can sometimes become unresponsive, and recovery sometimes requires a
> driver unload / reload and in the worst cases a reboot.
> 
> Using a taint flag allows us to annotate when this happens clearly.
> 
> Cc: Derek Chickles <dchickles@marvell.com>
> Cc: Satanand Burla <sburla@marvell.com>
> Cc: Felix Manlunas <fmanlunas@marvell.com>
> Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
> ---
>  drivers/net/ethernet/cavium/liquidio/lio_main.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/drivers/net/ethernet/cavium/liquidio/lio_main.c
> b/drivers/net/ethernet/cavium/liquidio/lio_main.c
> index 66d31c018c7e..f18085262982 100644
> --- a/drivers/net/ethernet/cavium/liquidio/lio_main.c
> +++ b/drivers/net/ethernet/cavium/liquidio/lio_main.c
> @@ -801,6 +801,7 @@ static int liquidio_watchdog(void *param)
>  			continue;
> 
>  		WRITE_ONCE(oct->cores_crashed, true);
> +		module_firmware_crashed();
>  		other_oct = get_other_octeon_device(oct);
>  		if (other_oct)
>  			WRITE_ONCE(other_oct->cores_crashed, true);
> --
> 2.25.1

Thanks!

Reviewed-by: Derek Chickles <dchickles@marvell.com>



Thread overview: 34+ messages
2020-05-09  4:35 [PATCH 00/15] net: taint when the device driver firmware crashes Luis Chamberlain
2020-05-09  4:35 ` [PATCH 01/15] taint: add module firmware crash taint support Luis Chamberlain
2020-05-09 15:18   ` Rafael Aquini
2020-05-09 16:46     ` Luis Chamberlain
2020-05-10  2:19       ` Randy Dunlap
2020-05-09  4:35 ` [PATCH 02/15] ethernet/839: use new module_firmware_crashed() Luis Chamberlain
2020-05-09  4:35 ` [PATCH 03/15] bnx2x: " Luis Chamberlain
2020-05-09  4:35 ` [PATCH 04/15] bnxt: " Luis Chamberlain
2020-05-09  4:35 ` [PATCH 05/15] bna: " Luis Chamberlain
2020-05-09  4:35 ` [PATCH 06/15] liquidio: " Luis Chamberlain
2020-05-09  4:35 ` [PATCH 07/15] cxgb4: " Luis Chamberlain
2020-05-09  4:35 ` [PATCH 08/15] ehea: " Luis Chamberlain
2020-05-09  4:35 ` [PATCH 09/15] qed: " Luis Chamberlain
2020-05-09  6:32   ` [EXT] " Igor Russkikh
2020-05-09 16:42     ` Luis Chamberlain
2020-05-12 16:23       ` Igor Russkikh
2020-05-12 17:34         ` Luis Chamberlain
2020-05-14 14:53           ` Igor Russkikh
2020-05-15 20:32             ` Luis Chamberlain
2020-05-15 20:37               ` Igor Russkikh
2020-05-09  4:35 ` [PATCH 10/15] soc: qcom: ipa: " Luis Chamberlain
2020-05-09  4:35 ` [PATCH 11/15] wimax/i2400m: " Luis Chamberlain
2020-05-09  4:35 ` [PATCH 12/15] ath10k: " Luis Chamberlain
2020-05-09  4:35 ` [PATCH 13/15] ath6kl: " Luis Chamberlain
2020-05-09  4:35 ` [PATCH 14/15] brcm80211: " Luis Chamberlain
2020-05-09  4:35 ` [PATCH 15/15] mwl8k: " Luis Chamberlain
2020-05-09 18:35 ` [PATCH 00/15] net: taint when the device driver firmware crashes Jakub Kicinski
2020-05-11 14:11   ` Luis Chamberlain
2020-05-10  1:01 ` Shannon Nelson
2020-05-10  1:58   ` Andrew Lunn
2020-05-10  2:15     ` Shannon Nelson
2020-05-11 14:13       ` Luis Chamberlain
2020-05-11 19:21   ` Steven Rostedt
2020-05-11 16:33 [PATCH 06/15] liquidio: use new module_firmware_crashed() Derek Chickles
