From: Johannes Berg
To: Luis Chamberlain
Cc: Steve deRosier, Ben Greear, jeyu@kernel.org, akpm@linux-foundation.org,
 arnd@arndb.de, rostedt@goodmis.org, mingo@redhat.com, aquini@redhat.com,
 cai@lca.pw, dyoung@redhat.com, bhe@redhat.com, peterz@infradead.org,
 tglx@linutronix.de, gpiccoli@canonical.com, pmladek@suse.com, Takashi Iwai,
 schlad@suse.de, andriy.shevchenko@linux.intel.com, keescook@chromium.org,
 daniel.vetter@ffwll.ch, will@kernel.org, mchehab+samsung@kernel.org,
 Kalle Valo, "David S. Miller", Network Development, LKML, linux-wireless,
 ath10k@lists.infradead.org
Subject: Re: [PATCH v2 12/15] ath10k: use new module_firmware_crashed()
Date: Mon, 18 May 2020 22:07:49 +0200
In-Reply-To: <20200518195950.GP11244@42.do-not-panic.com>
References: <20200515212846.1347-13-mcgrof@kernel.org>
 <2b74a35c726e451b2fab2b5d0d301e80d1f4cdc7.camel@sipsolutions.net>
 <20200518165154.GH11244@42.do-not-panic.com>
 <4ad0668d-2de9-11d7-c3a1-ad2aedd0c02d@candelatech.com>
 <20200518170934.GJ11244@42.do-not-panic.com>
 <20200518171801.GL11244@42.do-not-panic.com>
 <20200518190930.GO11244@42.do-not-panic.com>
 <20200518195950.GP11244@42.do-not-panic.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Mon, 2020-05-18 at 19:59 +0000, Luis Chamberlain wrote:

> > Err, no. Those two are most definitely related. Have you looked at (most
> > or some or whatever) staging drivers recently? Those contain all kinds
> > of garbage that might do whatever with your kernel.
>
> No, I stay away :)

:)

> > That's all fine, I just don't think it's appropriate to pretend that
> > your kernel is now 'tainted' (think about the meaning of that word) when
> > the firmware of some random device crashed.
>
> If the firmware crash *does* require driver remove / addition again,
> or a reboot, would you think that this is a situation that merits a taint?

Not really. In my experience, that's more likely a hardware issue (card
not properly seated, for example) that a bus reset happens to "fix".

> > It's pretty clear, but even then, first of all I doubt this is the case
> > for many of the places that you've sprinkled the annotation on,
>
> We can remove it, for this driver I can vouch for its location as it did
> reach a state where I required a reboot. And it's not the first time this
> has happened. This got me thinking about the bigger picture of the lack
> of a proper way to address these cases in the kernel, and how the user is
> left dumbfounded.

Fair, so the driver is still broken wrt. recovery here.

I still don't think that's a situation where e.g. the system should say
"hey you have a taint here, if your graphics go bad now you should not
report that bug" (which is effectively what the single taint bit does).

> > and secondly it actually hides useful information.
>
> What is it hiding?

Most importantly, which device crashed. Secondarily I'd say how many
times (*).

The information "firmware crashed" is really only useful in relation to
the device. If your graphics firmware crashed, yeah, well, you probably
won't even see this. If your USB wifi firmware crashed? Not really
interesting, you'll just unplug it anyway. In fact it's very hard for a USB
driver (short of arbitrary memory corruption) to significantly mess up
the system.

johannes

(*) though if it crashed only once, was that because it was wedged
enough to be unusable afterwards, or because everything was fine?