From: Luis Chamberlain <mcgrof@kernel.org>
To: Jakub Kicinski <kuba@kernel.org>
Cc: johannes@sipsolutions.net, derosier@gmail.com,
greearb@candelatech.com, jeyu@kernel.org,
akpm@linux-foundation.org, arnd@arndb.de, rostedt@goodmis.org,
mingo@redhat.com, aquini@redhat.com, cai@lca.pw,
dyoung@redhat.com, bhe@redhat.com, peterz@infradead.org,
tglx@linutronix.de, gpiccoli@canonical.com, pmladek@suse.com,
tiwai@suse.de, schlad@suse.de, andriy.shevchenko@linux.intel.com,
keescook@chromium.org, daniel.vetter@ffwll.ch, will@kernel.org,
mchehab+samsung@kernel.org, kvalo@codeaurora.org,
davem@davemloft.net, netdev@vger.kernel.org,
linux-kernel@vger.kernel.org, linux-wireless@vger.kernel.org,
ath10k@lists.infradead.org, jiri@resnulli.us,
briannorris@chromium.org
Subject: Re: [RFC 1/2] devlink: add simple fw crash helpers
Date: Fri, 22 May 2020 05:20:46 +0000
Message-ID: <20200522052046.GY11244@42.do-not-panic.com>
In-Reply-To: <20200519211531.3702593-1-kuba@kernel.org>
On Tue, May 19, 2020 at 02:15:30PM -0700, Jakub Kicinski wrote:
> Add infra for creating devlink instances for a device to report
Thanks for doing this series as a PoC, as a counterproposal to the
module_firmware_crashed() helper I proposed, which taints both the kernel
and the module with a firmware crash flag.
For those not familiar with devlink:
https://lwn.net/Articles/677967/
https://www.kernel.org/doc/html/latest/networking/devlink/index.html
The GitHub page is also now a 404, as Jiri merged that code into iproute2:
git://git.kernel.org/pub/scm/network/iproute2/iproute2.git
> fw crashes. This patch expects the devlink instance to be registered
> at probe time. I believe this to be the cleanest. We can also add a devm
> version of the helpers, so that we don't have to do the clean up.
> Or we can go even further and register the devlink instance only
> once error has happened (for the first time, then we can just
> find out if already registered by traversing the list like we
> do here).
>
> With the patch applied and a sample driver converted we get:
>
> $ devlink dev
> pci/0000:07:00.0
>
> Then monitor for errors:
>
> $ devlink mon health
> [health,status] pci/0000:07:00.0:
> reporter fw
> state error error 1 recover 0
> [health,status] pci/0000:07:00.0:
> reporter fw
> state error error 2 recover 0
>
> These are the events I triggered on purpose. One can also inspect
> the health of all devices capable of reporting fw errors:
>
> $ devlink health
> pci/0000:07:00.0:
> reporter fw
> state error error 7 recover 0
>
> Obviously drivers may upgrade to the full devlink health API
> which includes state dump, state dump auto-collect and automatic
> error recovery control.
>
> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
> ---
> include/linux/devlink.h | 11 +++
> net/core/Makefile | 2 +-
> net/core/devlink_simple_fw_reporter.c | 101 ++++++++++++++++++++++++++
> 3 files changed, 113 insertions(+), 1 deletion(-)
> create mode 100644 include/linux/devlink.h
> create mode 100644 net/core/devlink_simple_fw_reporter.c
>
> diff --git a/include/linux/devlink.h b/include/linux/devlink.h
> new file mode 100644
> index 000000000000..2b73987eefca
> --- /dev/null
> +++ b/include/linux/devlink.h
> @@ -0,0 +1,11 @@
> +/* SPDX-License-Identifier: GPL-2.0-or-later */
> +#ifndef _LINUX_DEVLINK_H_
> +#define _LINUX_DEVLINK_H_
> +
> +struct device;
> +
> +void devlink_simple_fw_reporter_prepare(struct device *dev);
> +void devlink_simple_fw_reporter_cleanup(struct device *dev);
> +void devlink_simple_fw_reporter_report_crash(struct device *dev);
> +
> +#endif
> diff --git a/net/core/Makefile b/net/core/Makefile
> index 3e2c378e5f31..6f1513781c17 100644
> --- a/net/core/Makefile
> +++ b/net/core/Makefile
> @@ -31,7 +31,7 @@ obj-$(CONFIG_LWTUNNEL_BPF) += lwt_bpf.o
> obj-$(CONFIG_BPF_STREAM_PARSER) += sock_map.o
> obj-$(CONFIG_DST_CACHE) += dst_cache.o
> obj-$(CONFIG_HWBM) += hwbm.o
> -obj-$(CONFIG_NET_DEVLINK) += devlink.o
> +obj-$(CONFIG_NET_DEVLINK) += devlink.o devlink_simple_fw_reporter.o
This was looking super sexy up to here, but this is networking specific.
We want something generic for *anything* that requests firmware.
I'm afraid this won't work as a generic solution. I don't think it's
throw-away work though; the idea of providing a generic interface to
dump firmware through netlink might be nice for networking, or for
other things.
But I have a feeling we'll still want something more generic than this.
Networking may want to be aware that a firmware crash happened as
part of this network device health thing, but firmware crashing is a
generic thing.
I have now extended my patch set to include uevents, and I am more
convinced than ever that we need the taint.
Luis
> obj-$(CONFIG_GRO_CELLS) += gro_cells.o
> obj-$(CONFIG_FAILOVER) += failover.o
> obj-$(CONFIG_BPF_SYSCALL) += bpf_sk_storage.o
> diff --git a/net/core/devlink_simple_fw_reporter.c b/net/core/devlink_simple_fw_reporter.c
> new file mode 100644
> index 000000000000..48dde9123c3c
> --- /dev/null
> +++ b/net/core/devlink_simple_fw_reporter.c
> @@ -0,0 +1,101 @@
> +#include <linux/devlink.h>
> +#include <linux/list.h>
> +#include <linux/mutex.h>
> +#include <net/devlink.h>
> +
> +struct devlink_simple_fw_reporter {
> + struct list_head list;
> + struct devlink_health_reporter *reporter;
> +};
> +
> +
> +static LIST_HEAD(devlink_simple_fw_reporters);
> +static DEFINE_MUTEX(devlink_simple_fw_reporters_mutex);
> +
> +static const struct devlink_health_reporter_ops simple_devlink_health = {
> + .name = "fw",
> +};
> +
> +static const struct devlink_ops simple_devlink_ops = {
> +};
> +
> +static struct devlink_simple_fw_reporter *
> +devlink_simple_fw_reporter_find_for_dev(struct device *dev)
> +{
> + struct devlink_simple_fw_reporter *simple_devlink, *ret = NULL;
> + struct devlink *devlink;
> +
> + mutex_lock(&devlink_simple_fw_reporters_mutex);
> + list_for_each_entry(simple_devlink, &devlink_simple_fw_reporters,
> + list) {
> + devlink = priv_to_devlink(simple_devlink);
> + if (devlink->dev == dev) {
> + ret = simple_devlink;
> + break;
> + }
> + }
> + mutex_unlock(&devlink_simple_fw_reporters_mutex);
> +
> + return ret;
> +}
> +
> +void devlink_simple_fw_reporter_report_crash(struct device *dev)
> +{
> + struct devlink_simple_fw_reporter *simple_devlink;
> +
> + simple_devlink = devlink_simple_fw_reporter_find_for_dev(dev);
> + if (!simple_devlink)
> + return;
> +
> + devlink_health_report(simple_devlink->reporter, "firmware crash", NULL);
> +}
> +EXPORT_SYMBOL_GPL(devlink_simple_fw_reporter_report_crash);
> +
> +void devlink_simple_fw_reporter_prepare(struct device *dev)
> +{
> + struct devlink_simple_fw_reporter *simple_devlink;
> + struct devlink *devlink;
> +
> + devlink = devlink_alloc(&simple_devlink_ops,
> + sizeof(struct devlink_simple_fw_reporter));
> + if (!devlink)
> + return;
> +
> + if (devlink_register(devlink, dev))
> + goto err_free;
> +
> + simple_devlink = devlink_priv(devlink);
> + simple_devlink->reporter =
> + devlink_health_reporter_create(devlink, &simple_devlink_health,
> + 0, NULL);
> + if (IS_ERR(simple_devlink->reporter))
> + goto err_unregister;
> +
> + mutex_lock(&devlink_simple_fw_reporters_mutex);
> + list_add_tail(&simple_devlink->list, &devlink_simple_fw_reporters);
> + mutex_unlock(&devlink_simple_fw_reporters_mutex);
> +
> + return;
> +
> +err_unregister:
> + devlink_unregister(devlink);
> +err_free:
> + devlink_free(devlink);
> +}
> +EXPORT_SYMBOL_GPL(devlink_simple_fw_reporter_prepare);
> +
> +void devlink_simple_fw_reporter_cleanup(struct device *dev)
> +{
> + struct devlink_simple_fw_reporter *simple_devlink;
> + struct devlink *devlink;
> +
> + simple_devlink = devlink_simple_fw_reporter_find_for_dev(dev);
> + if (!simple_devlink)
> + return;
> +
> + devlink = priv_to_devlink(simple_devlink);
> + devlink_health_reporter_destroy(simple_devlink->reporter);
> + devlink_unregister(devlink);
> + devlink_free(devlink);
> +}
> +EXPORT_SYMBOL_GPL(devlink_simple_fw_reporter_cleanup);
> --
> 2.25.4
>
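To make the shape of the quoted helpers easier to follow, here is a minimal
userspace sketch of the same pattern: a global list of per-device reporters,
looked up by device pointer at report and cleanup time. This is a model only,
not kernel code. The fw_reporter_* names, the stand-in struct device and the
plain error counter are all hypothetical, and the mutex the patch takes around
the list is left out because the sketch is single-threaded.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for the kernel's struct device. */
struct device {
	const char *name;
};

/* One reporter per device, kept on a singly linked global list. */
struct fw_reporter {
	struct fw_reporter *next;
	const struct device *dev;
	int errors;		/* number of crashes reported so far */
};

static struct fw_reporter *fw_reporters;

/* Walk the list and return the reporter bound to dev, or NULL. */
static struct fw_reporter *fw_reporter_find(const struct device *dev)
{
	struct fw_reporter *r;

	assert(dev != NULL);
	for (r = fw_reporters; r; r = r->next)
		if (r->dev == dev)
			return r;
	return NULL;
}

/* Probe-time step: allocate a reporter and link it into the list. */
static int fw_reporter_prepare(const struct device *dev)
{
	struct fw_reporter *r;

	if (fw_reporter_find(dev))
		return 0;	/* already registered, nothing to do */

	r = calloc(1, sizeof(*r));
	if (!r)
		return -1;

	r->dev = dev;
	r->next = fw_reporters;
	fw_reporters = r;
	return 0;
}

/* Crash path: bump the error count; silently ignore unknown devices. */
static void fw_reporter_report_crash(const struct device *dev)
{
	struct fw_reporter *r = fw_reporter_find(dev);

	if (r)
		r->errors++;
}

/* Remove-time step: unlink the reporter first, then free it. */
static void fw_reporter_cleanup(const struct device *dev)
{
	struct fw_reporter **pp, *r;

	for (pp = &fw_reporters; (r = *pp) != NULL; pp = &r->next) {
		if (r->dev == dev) {
			*pp = r->next;
			free(r);
			return;
		}
	}
}

/* Error count for dev, or -1 if no reporter is registered for it. */
static int fw_reporter_errors(const struct device *dev)
{
	struct fw_reporter *r = fw_reporter_find(dev);

	return r ? r->errors : -1;
}
```

A driver-like caller would call fw_reporter_prepare() once at probe,
fw_reporter_report_crash() on each firmware crash, and fw_reporter_cleanup()
on remove. One difference from the quoted patch is deliberate: the sketch
unlinks the entry before freeing it, whereas the quoted cleanup helper, as
posted, appears to free the devlink instance without removing its entry from
devlink_simple_fw_reporters.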