linux-wireless.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Brian Norris <briannorris@chromium.org>
To: Luis Chamberlain <mcgrof@kernel.org>
Cc: jeyu@kernel.org, "David S. Miller" <davem@davemloft.net>,
	kuba@kernel.org, linux-wireless <linux-wireless@vger.kernel.org>,
	aquini@redhat.com, linux-doc@vger.kernel.org,
	peterz@infradead.org, Daniel Vetter <daniel.vetter@ffwll.ch>,
	linux@dominikbrodowski.net,
	Linux Kernel <linux-kernel@vger.kernel.org>,
	Masahiro Yamada <yamada.masahiro@socionext.com>,
	glider@google.com, GR-everest-linux-l2@marvell.com,
	mchehab+samsung@kernel.org, will@kernel.org,
	michael.chan@broadcom.com, Rob Herring <robh@kernel.org>,
	paulmck@kernel.org, bhe@redhat.com, corbet@lwn.net,
	mchehab+huawei@kernel.org, ath10k <ath10k@lists.infradead.org>,
	derosier@gmail.com, Takashi Iwai <tiwai@suse.de>,
	mingo@redhat.com, Dmitry Vyukov <dvyukov@google.com>,
	Sami Tolvanen <samitolvanen@google.com>,
	yzaikin@google.com, dyoung@redhat.com, pmladek@suse.com,
	elver@google.com, sburla@marvell.com, aelior@marvell.com,
	Kees Cook <keescook@chromium.org>, Arnd Bergmann <arnd@arndb.de>,
	sfr@canb.auug.org.au, gpiccoli@canonical.com,
	Steven Rostedt <rostedt@goodmis.org>,
	fmanlunas@marvell.com, cai@lca.pw, tglx@linutronix.de,
	Andy Shevchenko <andriy.shevchenko@linux.intel.com>,
	Johannes Berg <johannes@sipsolutions.net>,
	Kalle Valo <kvalo@codeaurora.org>,
	"<netdev@vger.kernel.org>" <netdev@vger.kernel.org>,
	rdunlap@infradead.org, schlad@suse.de,
	Doug Anderson <dianders@chromium.org>,
	vkoul@kernel.org, mhiramat@kernel.org,
	Andrew Morton <akpm@linux-foundation.org>,
	dchickles@marvell.com, bauerman@linux.ibm.com
Subject: Re: [PATCH v3 5/8] ath10k: use new taint_firmware_crashed()
Date: Tue, 2 Jun 2020 14:01:12 -0700	[thread overview]
Message-ID: <CA+ASDXMR-Aa9322QjUTxiD2zwXDUig1eyG7GAAJJDvuUg1AXdA@mail.gmail.com> (raw)
In-Reply-To: <20200526145815.6415-6-mcgrof@kernel.org>

On Tue, May 26, 2020 at 7:58 AM Luis Chamberlain <mcgrof@kernel.org> wrote:
>
> This makes use of the new taint_firmware_crashed() to help
> annotate when firmware for device drivers crash. When firmware
> crashes devices can sometimes become unresponsive, and recovery
> sometimes requires a driver unload / reload and in the worst cases
> a reboot.

Just for the record, the underlying problem you seem to be complaining
about does not appear to be a firmware crash at all. It does happen to
result in a firmware crash report much later on (because when the PCIe
endpoint is this hosed, sooner or later the driver thinks the firmware
is dead), but it's not likely the root cause. More below.

> Using a taint flag allows us to annotate when this happens clearly.
>
> I have run into this situation with this driver with the latest
> firmware as of today, May 21, 2020 using v5.6.0, leaving me at
> a state at which my only option is to reboot. Driver removal and
> addition does not fix the situation. This is reported on kernel.org
> bugzilla korg#207851 [0].

I took a look, and replied there:
https://bugzilla.kernel.org/show_bug.cgi?id=207851#c2

Per the above, it seems more likely you have a PCI or power management
problem, not an ath10k or ath10k-firmware problem.

> But this isn't the first firmware crash reported,
> others have been filed before and none of these bugs have yet been
> addressed [1] [2] [3].  Including my own I see these firmware crash
> reports:

Yes, firmware does crash. Sometimes repeatedly. It also happens to be
closed source, so it's nearly impossible for the average Linux dev to
debug. But FWIW, those 3 all appear to be recoverable -- and then they
crash again a few minutes later. So just as claimed on prior
iterations of this patchset, ath10k is doing fine at recovery [*] --
it's "only" the firmware that's a problem. (And, if a WiFi firmware
doesn't like something in the RF environment...it's totally
understandable that the crash will happen more than once. Of course
that sucks, but it's not unexpected.) Crucially, rebooting won't
really do anything to help these people, AIUI.

Maybe what you really want is to taint the kernel every time a
non-free firmware is loaded ;)

I'd also note that those 3 reports are 3 years old. There have been
many ath10k-firmware updates since then, so it's not necessarily fair
to dig those back up. Also, bugzilla.kernel.org is totally ignored by
many linux-wireless@ folks. But I digress...

All in all, I have no interest in this proposal, for many of the
reasons already mentioned on previous iterations. It's way too coarse
and won't be useful in understanding what's going on in a system, IMO,
at least for ath10k. But it's also easy enough to ignore, so if it
makes somebody happy to claim a taint, then so be it.

Regards,
Brian

[*] Although, at least one of those doesn't appear to be as "clean" of
a recovery attempt as typical. Maybe there are some lurking driver
bugs in there too.


>   * korg#207851 [0]
>   * korg#197013 [1]
>   * korg#201237 [2]
>   * korg#195987 [3]
>
> [0] https://bugzilla.kernel.org/show_bug.cgi?id=207851
> [1] https://bugzilla.kernel.org/show_bug.cgi?id=197013
> [2] https://bugzilla.kernel.org/show_bug.cgi?id=201237
> [3] https://bugzilla.kernel.org/show_bug.cgi?id=195987

  reply	other threads:[~2020-06-02 21:09 UTC|newest]

Thread overview: 19+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-05-26 14:58 [PATCH v3 0/8] kernel: taint when the driver firmware crashes Luis Chamberlain
2020-05-26 14:58 ` [PATCH v3 1/8] kernel.h: move taint and system state flags to uapi Luis Chamberlain
2020-05-26 14:58 ` [PATCH v3 2/8] panic: add uevent support Luis Chamberlain
2020-05-26 14:58 ` [PATCH v3 3/8] taint: add firmware crash taint support Luis Chamberlain
2020-05-26 14:58 ` [PATCH v3 4/8] panic: make taint data type clearer Luis Chamberlain
2020-05-26 14:58 ` [PATCH v3 5/8] ath10k: use new taint_firmware_crashed() Luis Chamberlain
2020-06-02 21:01   ` Brian Norris [this message]
2020-05-26 14:58 ` [PATCH v3 6/8] bnxt_en: " Luis Chamberlain
2020-05-26 18:09   ` Michael Chan
2020-05-26 14:58 ` [PATCH v3 7/8] liquidio: " Luis Chamberlain
2020-05-26 14:58 ` [PATCH v3 8/8] qed: " Luis Chamberlain
2020-05-26 22:46 ` [PATCH v3 0/8] kernel: taint when the driver firmware crashes Jakub Kicinski
2020-05-26 23:07   ` Luis Chamberlain
2020-05-26 23:30     ` Jakub Kicinski
2020-05-27  3:19       ` Luis Chamberlain
2020-05-27 21:36         ` Jakub Kicinski
2020-05-28 14:27           ` Luis Chamberlain
2020-05-28 15:04             ` Ben Greear
2020-05-28 16:33               ` Luis Chamberlain

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CA+ASDXMR-Aa9322QjUTxiD2zwXDUig1eyG7GAAJJDvuUg1AXdA@mail.gmail.com \
    --to=briannorris@chromium.org \
    --cc=GR-everest-linux-l2@marvell.com \
    --cc=aelior@marvell.com \
    --cc=akpm@linux-foundation.org \
    --cc=andriy.shevchenko@linux.intel.com \
    --cc=aquini@redhat.com \
    --cc=arnd@arndb.de \
    --cc=ath10k@lists.infradead.org \
    --cc=bauerman@linux.ibm.com \
    --cc=bhe@redhat.com \
    --cc=cai@lca.pw \
    --cc=corbet@lwn.net \
    --cc=daniel.vetter@ffwll.ch \
    --cc=davem@davemloft.net \
    --cc=dchickles@marvell.com \
    --cc=derosier@gmail.com \
    --cc=dianders@chromium.org \
    --cc=dvyukov@google.com \
    --cc=dyoung@redhat.com \
    --cc=elver@google.com \
    --cc=fmanlunas@marvell.com \
    --cc=glider@google.com \
    --cc=gpiccoli@canonical.com \
    --cc=jeyu@kernel.org \
    --cc=johannes@sipsolutions.net \
    --cc=keescook@chromium.org \
    --cc=kuba@kernel.org \
    --cc=kvalo@codeaurora.org \
    --cc=linux-doc@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-wireless@vger.kernel.org \
    --cc=linux@dominikbrodowski.net \
    --cc=mcgrof@kernel.org \
    --cc=mchehab+huawei@kernel.org \
    --cc=mchehab+samsung@kernel.org \
    --cc=mhiramat@kernel.org \
    --cc=michael.chan@broadcom.com \
    --cc=mingo@redhat.com \
    --cc=netdev@vger.kernel.org \
    --cc=paulmck@kernel.org \
    --cc=peterz@infradead.org \
    --cc=pmladek@suse.com \
    --cc=rdunlap@infradead.org \
    --cc=robh@kernel.org \
    --cc=rostedt@goodmis.org \
    --cc=samitolvanen@google.com \
    --cc=sburla@marvell.com \
    --cc=schlad@suse.de \
    --cc=sfr@canb.auug.org.au \
    --cc=tglx@linutronix.de \
    --cc=tiwai@suse.de \
    --cc=vkoul@kernel.org \
    --cc=will@kernel.org \
    --cc=yamada.masahiro@socionext.com \
    --cc=yzaikin@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).