All of lore.kernel.org
 help / color / mirror / Atom feed
* [RFC v2] device coredump: add new device coredump class
@ 2014-09-05  8:50 Johannes Berg
  2014-09-05  9:40 ` Jonas Gorski
                   ` (3 more replies)
  0 siblings, 4 replies; 16+ messages in thread
From: Johannes Berg @ 2014-09-05  8:50 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, Daniel Vetter, Emmanuel Grumbach,
	Luciano Coelho, Kalle Valo, dri-devel, linux-wireless

From: Johannes Berg <johannes.berg@intel.com>

Many devices run firmware and/or complex hardware, and most of that
can have bugs. When it misbehaves, however, it is often much harder
to debug than software running on the host.

Introduce a "device coredump" mechanism to allow dumping internal
device/firmware state through a generalized mechanism. As devices
are different and information needed can vary accordingly, this
doesn't prescribe a file format - it just provides mechanism to
get data to be able to capture it in a generalized way (e.g. in
distributions.)

Note that generalized capturing of such data may result in privacy
issues, so users generally need to be involved. In order to allow
certain users/system integrators/... to disable the feature at all,
introduce a Kconfig option to override the drivers that would like
to have the feature.

For now, this provides two ways of dumping data:
 1) with a vmalloc'ed area, that is then given to the subsystem
    and freed after retrieval or timeout
 2) with a generalized reader/free function method

We could/should add more options, e.g. a list of pages, since the
vmalloc area is very limited on some architectures.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
---
 MAINTAINERS                 |   7 ++
 drivers/base/Kconfig        |  21 ++++
 drivers/base/Makefile       |   1 +
 drivers/base/devcoredump.c  | 256 ++++++++++++++++++++++++++++++++++++++++++++
 include/linux/devcoredump.h |  35 ++++++
 5 files changed, 320 insertions(+)
 create mode 100644 drivers/base/devcoredump.c
 create mode 100644 include/linux/devcoredump.h

diff --git a/MAINTAINERS b/MAINTAINERS
index 2f85f55c8fb8..8e31d053fb14 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -2847,6 +2847,13 @@ T:	git git://git.kernel.org/pub/scm/linux/kernel/git/balbi/usb.git
 S:	Maintained
 F:	drivers/usb/dwc3/
 
+DEVICE COREDUMP (DEV_COREDUMP)
+M:	Johannes Berg <johannes@sipsolutions.net>
+L:	linux-kernel@vger.kernel.org
+S:	Maintained
+F:	drivers/base/devcoredump.c
+F:	include/linux/devcoredump.h
+
 DEVICE FREQUENCY (DEVFREQ)
 M:	MyungJoo Ham <myungjoo.ham@samsung.com>
 M:	Kyungmin Park <kyungmin.park@samsung.com>
diff --git a/drivers/base/Kconfig b/drivers/base/Kconfig
index 4e7f0ff83ae7..134f763d90fd 100644
--- a/drivers/base/Kconfig
+++ b/drivers/base/Kconfig
@@ -165,6 +165,27 @@ config FW_LOADER_USER_HELPER_FALLBACK
 
 	  If you are unsure about this, say N here.
 
+config WANT_DEV_COREDUMP
+	bool
+	help
+	  Drivers should "select" this option if they desire to use the
+	  device coredump mechanism.
+
+config DISABLE_DEV_COREDUMP
+	bool "Disable device coredump" if EXPERT
+	help
+	  Disable the device coredump mechanism despite drivers wanting to
+	  use it; this allows for more sensitive systems or systems that
+	  don't want to ever access the information to not have the code,
+	  nor keep any data.
+
+	  If unsure, say N.
+
+config DEV_COREDUMP
+	bool
+	default y if WANT_DEV_COREDUMP
+	depends on !DISABLE_DEV_COREDUMP
+
 config DEBUG_DRIVER
 	bool "Driver Core verbose debug messages"
 	depends on DEBUG_KERNEL
diff --git a/drivers/base/Makefile b/drivers/base/Makefile
index 4aab26ec0292..6922cd6850a2 100644
--- a/drivers/base/Makefile
+++ b/drivers/base/Makefile
@@ -21,6 +21,7 @@ obj-$(CONFIG_SYS_HYPERVISOR) += hypervisor.o
 obj-$(CONFIG_REGMAP)	+= regmap/
 obj-$(CONFIG_SOC_BUS) += soc.o
 obj-$(CONFIG_PINCTRL) += pinctrl.o
+obj-$(CONFIG_DEV_COREDUMP) += devcoredump.o
 
 ccflags-$(CONFIG_DEBUG_DRIVER) := -DDEBUG
 
diff --git a/drivers/base/devcoredump.c b/drivers/base/devcoredump.c
new file mode 100644
index 000000000000..d6ecf13a5903
--- /dev/null
+++ b/drivers/base/devcoredump.c
@@ -0,0 +1,256 @@
+/*
+ * This file is provided under the GPLv2 license.
+ *
+ * GPL LICENSE SUMMARY
+ *
+ * Copyright(c) 2014 Intel Mobile Communications GmbH
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of version 2 of the GNU General Public License as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * General Public License for more details.
+ *
+ * The full GNU General Public License is included in this distribution
+ * in the file called COPYING.
+ *
+ * Contact Information:
+ *  Intel Linux Wireless <ilw@linux.intel.com>
+ * Intel Corporation, 5200 N.E. Elam Young Parkway, Hillsboro, OR 97124-6497
+ */
+#include <linux/module.h>
+#include <linux/device.h>
+#include <linux/devcoredump.h>
+#include <linux/list.h>
+#include <linux/slab.h>
+#include <linux/fs.h>
+#include <linux/workqueue.h>
+
+MODULE_AUTHOR("Johannes Berg <johannes@sipsolutions.net>");
+MODULE_DESCRIPTION("Device Coredump support");
+MODULE_LICENSE("GPL");
+
+/* if data isn't read by userspace after 5 minutes then delete it */
+#define DEVCD_TIMEOUT	(HZ * 60 * 5)
+
+struct devcd_entry {
+	struct device devcd_dev;
+	const void *data;
+	size_t datalen;
+	struct module *owner;
+	ssize_t (*read)(char *buffer, loff_t offset, size_t count,
+			const void *data, size_t datalen);
+	void (*free)(const void *data);
+	struct delayed_work del_wk;
+	struct device *failing_dev;
+};
+
+static struct devcd_entry *dev_to_devcd(struct device *dev)
+{
+	return container_of(dev, struct devcd_entry, devcd_dev);
+}
+
+static void devcd_dev_release(struct device *dev)
+{
+	struct devcd_entry *devcd = dev_to_devcd(dev);
+
+	devcd->free(devcd->data);
+	module_put(devcd->owner);
+
+	/*
+	 * this seems racy, but I don't see a notifier or such on
+	 * a struct device to know when it goes away?
+	 */
+	if (devcd->failing_dev->kobj.sd)
+		sysfs_delete_link(&devcd->failing_dev->kobj, &dev->kobj,
+				  "dev_coredump");
+
+	put_device(devcd->failing_dev);
+	kfree(devcd);
+}
+
+static void devcd_del(struct work_struct *wk)
+{
+	struct devcd_entry *devcd;
+
+	devcd = container_of(wk, struct devcd_entry, del_wk.work);
+
+	device_del(&devcd->devcd_dev);
+	put_device(&devcd->devcd_dev);
+}
+
+static ssize_t devcd_data_read(struct file *filp, struct kobject *kobj,
+			       struct bin_attribute *bin_attr,
+			       char *buffer, loff_t offset, size_t count)
+{
+	struct device *dev = kobj_to_dev(kobj);
+	struct devcd_entry *devcd = dev_to_devcd(dev);
+
+	return devcd->read(buffer, offset, count, devcd->data, devcd->datalen);
+}
+
+static ssize_t devcd_data_write(struct file *filp, struct kobject *kobj,
+				struct bin_attribute *bin_attr,
+				char *buffer, loff_t offset, size_t count)
+{
+	struct device *dev = kobj_to_dev(kobj);
+	struct devcd_entry *devcd = dev_to_devcd(dev);
+
+	mod_delayed_work(system_wq, &devcd->del_wk, 0);
+
+	return count;
+}
+
+static struct bin_attribute devcd_attr_data = {
+	.attr = { .name = "data", .mode = S_IRUSR | S_IWUSR, },
+	.size = 0,
+	.read = devcd_data_read,
+	.write = devcd_data_write,
+};
+
+static struct bin_attribute *devcd_dev_bin_attrs[] = {
+	&devcd_attr_data, NULL,
+};
+
+static const struct attribute_group devcd_dev_group = {
+	.bin_attrs = devcd_dev_bin_attrs,
+};
+
+static const struct attribute_group *devcd_dev_groups[] = {
+	&devcd_dev_group, NULL,
+};
+
+static struct class devcd_class = {
+	.name		= "devcoredump",
+	.owner		= THIS_MODULE,
+	.dev_release	= devcd_dev_release,
+	.dev_groups	= devcd_dev_groups,
+};
+
+static ssize_t devcd_readv(char *buffer, loff_t offset, size_t count,
+			   const void *data, size_t datalen)
+{
+	if (offset > datalen)
+		return -EINVAL;
+
+	if (offset + count > datalen)
+		count = datalen - offset;
+
+	if (count)
+		memcpy(buffer, ((u8 *)data) + offset, count);
+
+	return count;
+}
+
+/**
+ * dev_coredumpv - create firmware coredump with vmalloc data
+ * @dev: the struct device for the crashed device
+ * @data: vmalloc data containing the firmware coredump
+ * @datalen: length of the data
+ * @gfp: allocation flags
+ */
+void dev_coredumpv(struct device *dev, const void *data, size_t datalen,
+		   gfp_t gfp)
+{
+	dev_coredumpm(dev, NULL, data, datalen, gfp, devcd_readv, vfree);
+}
+EXPORT_SYMBOL(dev_coredumpv);
+
+static int devcd_match_failing(struct device *dev, const void *failing)
+{
+	struct devcd_entry *devcd = dev_to_devcd(dev);
+
+	return devcd->failing_dev == failing;
+}
+
+/**
+ * dev_coredumpm - create firmware coredump with read/free methods
+ * @dev: the struct device for the crashed device
+ * @data: data cookie for the @read/@free functions
+ * @datalen: length of the data
+ * @gfp: allocation flags
+ * @read: function to read from the given buffer
+ * @free: function to free the given buffer
+ */
+void dev_coredumpm(struct device *dev, struct module *owner,
+		   const void *data, size_t datalen, gfp_t gfp,
+		   ssize_t (*read)(char *buffer, loff_t offset, size_t count,
+				   const void *data, size_t datalen),
+		   void (*free)(const void *data))
+{
+	static atomic_t devcd_count = ATOMIC_INIT(0);
+	struct devcd_entry *devcd;
+	struct device *existing;
+
+	existing = class_find_device(&devcd_class, NULL, dev,
+				     devcd_match_failing);
+	if (existing) {
+		put_device(existing);
+		return;
+	}
+
+	if (!try_module_get(owner))
+		return;
+
+	devcd = kzalloc(sizeof(*devcd), gfp);
+	if (!devcd)
+		goto put_module;
+
+	devcd->owner = owner;
+	devcd->data = data;
+	devcd->datalen = datalen;
+	devcd->read = read;
+	devcd->free = free;
+	devcd->failing_dev = get_device(dev);
+
+	device_initialize(&devcd->devcd_dev);
+
+	dev_set_name(&devcd->devcd_dev, "devcd%d",
+		     atomic_inc_return(&devcd_count));
+	devcd->devcd_dev.class = &devcd_class;
+
+	if (device_add(&devcd->devcd_dev))
+		goto put_device;
+
+	if (sysfs_create_link(&devcd->devcd_dev.kobj, &dev->kobj,
+			      "failing_device"))
+		/* nothing - symlink will be missing */;
+
+	if (sysfs_create_link(&dev->kobj, &devcd->devcd_dev.kobj,
+			      "dev_coredump"))
+		/* nothing - symlink will be missing */;
+
+	INIT_DELAYED_WORK(&devcd->del_wk, devcd_del);
+	schedule_delayed_work(&devcd->del_wk, DEVCD_TIMEOUT);
+
+	return;
+ put_device:
+	put_device(&devcd->devcd_dev);
+ put_module:
+	module_put(owner);
+}
+EXPORT_SYMBOL(dev_coredumpm);
+
+static int __init devcoredump_init(void)
+{
+	return class_register(&devcd_class);
+}
+module_init(devcoredump_init);
+
+static int devcd_free(struct device *dev, void *data)
+{
+	struct devcd_entry *devcd = dev_to_devcd(dev);
+
+	flush_delayed_work(&devcd->del_wk);
+	return 0;
+}
+
+static void __exit devcoredump_exit(void)
+{
+	class_for_each_device(&devcd_class, NULL, NULL, devcd_free);
+	class_unregister(&devcd_class);
+}
+module_exit(devcoredump_exit);
diff --git a/include/linux/devcoredump.h b/include/linux/devcoredump.h
new file mode 100644
index 000000000000..c0a360e99f64
--- /dev/null
+++ b/include/linux/devcoredump.h
@@ -0,0 +1,35 @@
+#ifndef __DEVCOREDUMP_H
+#define __DEVCOREDUMP_H
+
+#include <linux/device.h>
+#include <linux/module.h>
+#include <linux/vmalloc.h>
+
+#ifdef CONFIG_DEV_COREDUMP
+void dev_coredumpv(struct device *dev, const void *data, size_t datalen,
+		   gfp_t gfp);
+
+void dev_coredumpm(struct device *dev, struct module *owner,
+		   const void *data, size_t datalen, gfp_t gfp,
+		   ssize_t (*read)(char *buffer, loff_t offset, size_t count,
+				   const void *data, size_t datalen),
+		   void (*free)(const void *data));
+#else
+static inline void dev_coredumpv(struct device *dev, const void *data,
+				 size_t datalen, gfp_t gfp)
+{
+	vfree(data);
+}
+
+static inline void
+dev_coredumpm(struct device *dev, struct module *owner,
+	      const void *data, size_t datalen, gfp_t gfp,
+	      ssize_t (*read)(char *buffer, loff_t offset, size_t count,
+			      const void *data, size_t datalen),
+	      void (*free)(const void *data))
+{
+	free(data);
+}
+#endif /* CONFIG_DEV_COREDUMP */
+
+#endif /* __DEVCOREDUMP_H */
-- 
2.1.0


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [RFC v2] device coredump: add new device coredump class
  2014-09-05  8:50 [RFC v2] device coredump: add new device coredump class Johannes Berg
@ 2014-09-05  9:40 ` Jonas Gorski
  2014-09-05  9:56     ` Johannes Berg
  2014-09-05 22:13   ` Greg Kroah-Hartman
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 16+ messages in thread
From: Jonas Gorski @ 2014-09-05  9:40 UTC (permalink / raw)
  To: Johannes Berg
  Cc: linux-kernel, Greg Kroah-Hartman, Daniel Vetter,
	Emmanuel Grumbach, Luciano Coelho, Kalle Valo, dri-devel,
	linux-wireless

On Fri, Sep 5, 2014 at 10:50 AM, Johannes Berg
<johannes@sipsolutions.net> wrote:
> From: Johannes Berg <johannes.berg@intel.com>

Can't you just send from the correct address? ;p

(snip)

> diff --git a/drivers/base/Kconfig b/drivers/base/Kconfig
> index 4e7f0ff83ae7..134f763d90fd 100644
> --- a/drivers/base/Kconfig
> +++ b/drivers/base/Kconfig
> @@ -165,6 +165,27 @@ config FW_LOADER_USER_HELPER_FALLBACK
>
>           If you are unsure about this, say N here.
>
> +config WANT_DEV_COREDUMP
> +       bool
> +       help
> +         Drivers should "select" this option if they desire to use the
> +         device coredump mechanism.
> +
> +config DISABLE_DEV_COREDUMP
> +       bool "Disable device coredump" if EXPERT
> +       help
> +         Disable the device coredump mechanism despite drivers wanting to
> +         use it; this allows for more sensitive systems or systems that
> +         don't want to ever access the information to not have the code,
> +         nor keep any data.
> +
> +         If unsure, say N.
> +
> +config DEV_COREDUMP
> +       bool
> +       default y if WANT_DEV_COREDUMP
> +       depends on !DISABLE_DEV_COREDUMP

How about the following to avoid negative options:

config DEV_COREDUMP
       bool "Enable device coredump" if EXPERT
       default y if WANT_DEV_COREDUMP
       help
         Enable the device coredump mechanism for drivers wanting to
         use it. Disabling allows for more sensitive systems or systems that
         don't want to ever access the information to not have the code,
         nor keep any data.

         If unsure, say Y.



Jonas

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [RFC v2] device coredump: add new device coredump class
  2014-09-05  9:40 ` Jonas Gorski
@ 2014-09-05  9:56     ` Johannes Berg
  0 siblings, 0 replies; 16+ messages in thread
From: Johannes Berg @ 2014-09-05  9:56 UTC (permalink / raw)
  To: Jonas Gorski
  Cc: linux-kernel, Greg Kroah-Hartman, Daniel Vetter,
	Emmanuel Grumbach, Luciano Coelho, Kalle Valo, dri-devel,
	linux-wireless

On Fri, 2014-09-05 at 11:40 +0200, Jonas Gorski wrote:
> On Fri, Sep 5, 2014 at 10:50 AM, Johannes Berg
> <johannes@sipsolutions.net> wrote:
> > From: Johannes Berg <johannes.berg@intel.com>
> 
> Can't you just send from the correct address? ;p

Not easily :)

> How about the following to avoid negative options:
> 
> config DEV_COREDUMP
>        bool "Enable device coredump" if EXPERT
>        default y if WANT_DEV_COREDUMP
>        help
>          Enable the device coredump mechanism for drivers wanting to
>          use it. Disabling allows for more sensitive systems or systems that
>          don't want to ever access the information to not have the code,
>          nor keep any data.
> 
>          If unsure, say Y.

Yeah, that seems reasonable. I guess I did the negative option because I
was thinking about the negative case ("I really don't want this!!!
111!!") :)

johannes


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [RFC v2] device coredump: add new device coredump class
@ 2014-09-05  9:56     ` Johannes Berg
  0 siblings, 0 replies; 16+ messages in thread
From: Johannes Berg @ 2014-09-05  9:56 UTC (permalink / raw)
  To: Jonas Gorski
  Cc: Emmanuel Grumbach, Greg Kroah-Hartman, Kalle Valo,
	linux-wireless, linux-kernel, dri-devel, Daniel Vetter,
	Luciano Coelho

On Fri, 2014-09-05 at 11:40 +0200, Jonas Gorski wrote:
> On Fri, Sep 5, 2014 at 10:50 AM, Johannes Berg
> <johannes@sipsolutions.net> wrote:
> > From: Johannes Berg <johannes.berg@intel.com>
> 
> Can't you just send from the correct address? ;p

Not easily :)

> How about the following to avoid negative options:
> 
> config DEV_COREDUMP
>        bool "Enable device coredump" if EXPERT
>        default y if WANT_DEV_COREDUMP
>        help
>          Enable the device coredump mechanism for drivers wanting to
>          use it. Disabling allows for more sensitive systems or systems that
>          don't want to ever access the information to not have the code,
>          nor keep any data.
> 
>          If unsure, say Y.

Yeah, that seems reasonable. I guess I did the negative option because I
was thinking about the negative case ("I really don't want this!!!
111!!") :)

johannes

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [RFC v2] device coredump: add new device coredump class
  2014-09-05  8:50 [RFC v2] device coredump: add new device coredump class Johannes Berg
@ 2014-09-05 22:13   ` Greg Kroah-Hartman
  2014-09-05 22:13   ` Greg Kroah-Hartman
                     ` (2 subsequent siblings)
  3 siblings, 0 replies; 16+ messages in thread
From: Greg Kroah-Hartman @ 2014-09-05 22:13 UTC (permalink / raw)
  To: Johannes Berg
  Cc: linux-kernel, Daniel Vetter, Emmanuel Grumbach, Luciano Coelho,
	Kalle Valo, dri-devel, linux-wireless

On Fri, Sep 05, 2014 at 10:50:54AM +0200, Johannes Berg wrote:
> From: Johannes Berg <johannes.berg@intel.com>
> 
> Many devices run firmware and/or complex hardware, and most of that
> can have bugs. When it misbehaves, however, it is often much harder
> to debug than software running on the host.
> 
> Introduce a "device coredump" mechanism to allow dumping internal
> device/firmware state through a generalized mechanism. As devices
> are different and information needed can vary accordingly, this
> doesn't prescribe a file format - it just provides mechanism to
> get data to be able to capture it in a generalized way (e.g. in
> distributions.)
> 
> Note that generalized capturing of such data may result in privacy
> issues, so users generally need to be involved. In order to allow
> certain users/system integrators/... to disable the feature at all,
> introduce a Kconfig option to override the drivers that would like
> to have the feature.
> 
> For now, this provides two ways of dumping data:
>  1) with a vmalloc'ed area, that is then given to the subsystem
>     and freed after retrieval or timeout
>  2) with a generalized reader/free function method
> 
> We could/should add more options, e.g. a list of pages, since the
> vmalloc area is very limited on some architectures.
> 
> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
> ---
>  MAINTAINERS                 |   7 ++
>  drivers/base/Kconfig        |  21 ++++
>  drivers/base/Makefile       |   1 +
>  drivers/base/devcoredump.c  | 256 ++++++++++++++++++++++++++++++++++++++++++++
>  include/linux/devcoredump.h |  35 ++++++
>  5 files changed, 320 insertions(+)
>  create mode 100644 drivers/base/devcoredump.c
>  create mode 100644 include/linux/devcoredump.h
> 
> diff --git a/MAINTAINERS b/MAINTAINERS
> index 2f85f55c8fb8..8e31d053fb14 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -2847,6 +2847,13 @@ T:	git git://git.kernel.org/pub/scm/linux/kernel/git/balbi/usb.git
>  S:	Maintained
>  F:	drivers/usb/dwc3/
>  
> +DEVICE COREDUMP (DEV_COREDUMP)
> +M:	Johannes Berg <johannes@sipsolutions.net>
> +L:	linux-kernel@vger.kernel.org
> +S:	Maintained
> +F:	drivers/base/devcoredump.c
> +F:	include/linux/devcoredump.h
> +
>  DEVICE FREQUENCY (DEVFREQ)
>  M:	MyungJoo Ham <myungjoo.ham@samsung.com>
>  M:	Kyungmin Park <kyungmin.park@samsung.com>
> diff --git a/drivers/base/Kconfig b/drivers/base/Kconfig
> index 4e7f0ff83ae7..134f763d90fd 100644
> --- a/drivers/base/Kconfig
> +++ b/drivers/base/Kconfig
> @@ -165,6 +165,27 @@ config FW_LOADER_USER_HELPER_FALLBACK
>  
>  	  If you are unsure about this, say N here.
>  
> +config WANT_DEV_COREDUMP
> +	bool
> +	help
> +	  Drivers should "select" this option if they desire to use the
> +	  device coredump mechanism.
> +
> +config DISABLE_DEV_COREDUMP
> +	bool "Disable device coredump" if EXPERT
> +	help
> +	  Disable the device coredump mechanism despite drivers wanting to
> +	  use it; this allows for more sensitive systems or systems that
> +	  don't want to ever access the information to not have the code,
> +	  nor keep any data.
> +
> +	  If unsure, say N.
> +
> +config DEV_COREDUMP
> +	bool
> +	default y if WANT_DEV_COREDUMP
> +	depends on !DISABLE_DEV_COREDUMP
> +
>  config DEBUG_DRIVER
>  	bool "Driver Core verbose debug messages"
>  	depends on DEBUG_KERNEL
> diff --git a/drivers/base/Makefile b/drivers/base/Makefile
> index 4aab26ec0292..6922cd6850a2 100644
> --- a/drivers/base/Makefile
> +++ b/drivers/base/Makefile
> @@ -21,6 +21,7 @@ obj-$(CONFIG_SYS_HYPERVISOR) += hypervisor.o
>  obj-$(CONFIG_REGMAP)	+= regmap/
>  obj-$(CONFIG_SOC_BUS) += soc.o
>  obj-$(CONFIG_PINCTRL) += pinctrl.o
> +obj-$(CONFIG_DEV_COREDUMP) += devcoredump.o
>  
>  ccflags-$(CONFIG_DEBUG_DRIVER) := -DDEBUG
>  
> diff --git a/drivers/base/devcoredump.c b/drivers/base/devcoredump.c
> new file mode 100644
> index 000000000000..d6ecf13a5903
> --- /dev/null
> +++ b/drivers/base/devcoredump.c
> @@ -0,0 +1,256 @@
> +/*
> + * This file is provided under the GPLv2 license.
> + *
> + * GPL LICENSE SUMMARY
> + *
> + * Copyright(c) 2014 Intel Mobile Communications GmbH
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of version 2 of the GNU General Public License as
> + * published by the Free Software Foundation.
> + *
> + * This program is distributed in the hope that it will be useful, but
> + * WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> + * General Public License for more details.
> + *
> + * The full GNU General Public License is included in this distribution
> + * in the file called COPYING.
> + *
> + * Contact Information:
> + *  Intel Linux Wireless <ilw@linux.intel.com>
> + * Intel Corporation, 5200 N.E. Elam Young Parkway, Hillsboro, OR 97124-6497
> + */
> +#include <linux/module.h>
> +#include <linux/device.h>
> +#include <linux/devcoredump.h>
> +#include <linux/list.h>
> +#include <linux/slab.h>
> +#include <linux/fs.h>
> +#include <linux/workqueue.h>
> +
> +MODULE_AUTHOR("Johannes Berg <johannes@sipsolutions.net>");
> +MODULE_DESCRIPTION("Device Coredump support");
> +MODULE_LICENSE("GPL");

As you only allow Y or N as build options, it's not really a "module" :)

> +/* if data isn't read by userspace after 5 minutes then delete it */
> +#define DEVCD_TIMEOUT	(HZ * 60 * 5)
> +
> +struct devcd_entry {
> +	struct device devcd_dev;
> +	const void *data;
> +	size_t datalen;
> +	struct module *owner;
> +	ssize_t (*read)(char *buffer, loff_t offset, size_t count,
> +			const void *data, size_t datalen);
> +	void (*free)(const void *data);
> +	struct delayed_work del_wk;
> +	struct device *failing_dev;
> +};
> +
> +static struct devcd_entry *dev_to_devcd(struct device *dev)
> +{
> +	return container_of(dev, struct devcd_entry, devcd_dev);
> +}
> +
> +static void devcd_dev_release(struct device *dev)
> +{
> +	struct devcd_entry *devcd = dev_to_devcd(dev);
> +
> +	devcd->free(devcd->data);
> +	module_put(devcd->owner);
> +
> +	/*
> +	 * this seems racy, but I don't see a notifier or such on
> +	 * a struct device to know when it goes away?
> +	 */
> +	if (devcd->failing_dev->kobj.sd)
> +		sysfs_delete_link(&devcd->failing_dev->kobj, &dev->kobj,
> +				  "dev_coredump");

What is this link?  It should "just go away" if this:

> +	put_device(devcd->failing_dev);

was the last put_device() call on the failing_dev, right?  So you
shouldn't need to make this call to sysfs_delete_link().

> +	kfree(devcd);
> +}
> +
> +static void devcd_del(struct work_struct *wk)
> +{
> +	struct devcd_entry *devcd;
> +
> +	devcd = container_of(wk, struct devcd_entry, del_wk.work);
> +
> +	device_del(&devcd->devcd_dev);
> +	put_device(&devcd->devcd_dev);
> +}
> +
> +static ssize_t devcd_data_read(struct file *filp, struct kobject *kobj,
> +			       struct bin_attribute *bin_attr,
> +			       char *buffer, loff_t offset, size_t count)
> +{
> +	struct device *dev = kobj_to_dev(kobj);
> +	struct devcd_entry *devcd = dev_to_devcd(dev);
> +
> +	return devcd->read(buffer, offset, count, devcd->data, devcd->datalen);
> +}
> +
> +static ssize_t devcd_data_write(struct file *filp, struct kobject *kobj,
> +				struct bin_attribute *bin_attr,
> +				char *buffer, loff_t offset, size_t count)
> +{
> +	struct device *dev = kobj_to_dev(kobj);
> +	struct devcd_entry *devcd = dev_to_devcd(dev);
> +
> +	mod_delayed_work(system_wq, &devcd->del_wk, 0);
> +
> +	return count;
> +}
> +
> +static struct bin_attribute devcd_attr_data = {
> +	.attr = { .name = "data", .mode = S_IRUSR | S_IWUSR, },
> +	.size = 0,
> +	.read = devcd_data_read,
> +	.write = devcd_data_write,
> +};
> +
> +static struct bin_attribute *devcd_dev_bin_attrs[] = {
> +	&devcd_attr_data, NULL,
> +};
> +
> +static const struct attribute_group devcd_dev_group = {
> +	.bin_attrs = devcd_dev_bin_attrs,
> +};
> +
> +static const struct attribute_group *devcd_dev_groups[] = {
> +	&devcd_dev_group, NULL,
> +};
> +
> +static struct class devcd_class = {
> +	.name		= "devcoredump",
> +	.owner		= THIS_MODULE,
> +	.dev_release	= devcd_dev_release,
> +	.dev_groups	= devcd_dev_groups,
> +};
> +
> +static ssize_t devcd_readv(char *buffer, loff_t offset, size_t count,
> +			   const void *data, size_t datalen)
> +{
> +	if (offset > datalen)
> +		return -EINVAL;
> +
> +	if (offset + count > datalen)
> +		count = datalen - offset;
> +
> +	if (count)
> +		memcpy(buffer, ((u8 *)data) + offset, count);
> +
> +	return count;
> +}
> +
> +/**
> + * dev_coredumpv - create firmware coredump with vmalloc data
> + * @dev: the struct device for the crashed device
> + * @data: vmalloc data containing the firmware coredump
> + * @datalen: length of the data
> + * @gfp: allocation flags
> + */
> +void dev_coredumpv(struct device *dev, const void *data, size_t datalen,
> +		   gfp_t gfp)
> +{
> +	dev_coredumpm(dev, NULL, data, datalen, gfp, devcd_readv, vfree);
> +}
> +EXPORT_SYMBOL(dev_coredumpv);

driver core exports are all EXPORT_SYMBOL_GPL() please.

> +static int devcd_match_failing(struct device *dev, const void *failing)
> +{
> +	struct devcd_entry *devcd = dev_to_devcd(dev);
> +
> +	return devcd->failing_dev == failing;
> +}
> +
> +/**
> + * dev_coredumpm - create firmware coredump with read/free methods
> + * @dev: the struct device for the crashed device
> + * @data: data cookie for the @read/@free functions
> + * @datalen: length of the data
> + * @gfp: allocation flags
> + * @read: function to read from the given buffer
> + * @free: function to free the given buffer
> + */

You forgot @owner in this list.

> +void dev_coredumpm(struct device *dev, struct module *owner,
> +		   const void *data, size_t datalen, gfp_t gfp,
> +		   ssize_t (*read)(char *buffer, loff_t offset, size_t count,
> +				   const void *data, size_t datalen),
> +		   void (*free)(const void *data))
> +{
> +	static atomic_t devcd_count = ATOMIC_INIT(0);
> +	struct devcd_entry *devcd;
> +	struct device *existing;
> +
> +	existing = class_find_device(&devcd_class, NULL, dev,
> +				     devcd_match_failing);
> +	if (existing) {
> +		put_device(existing);
> +		return;
> +	}

I thought multiple dumps per "device" would throw away the older ones?
It's fine if you don't, but you might want to document this behavior in
the kerneldoc for the function.

Other than those very minor things, looks good to me, want to resend it
cleaned up and without the "RFC" in the Subject?

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [RFC v2] device coredump: add new device coredump class
@ 2014-09-05 22:13   ` Greg Kroah-Hartman
  0 siblings, 0 replies; 16+ messages in thread
From: Greg Kroah-Hartman @ 2014-09-05 22:13 UTC (permalink / raw)
  To: Johannes Berg
  Cc: Emmanuel Grumbach, Daniel Vetter, Kalle Valo, linux-wireless,
	linux-kernel, dri-devel, Luciano Coelho

On Fri, Sep 05, 2014 at 10:50:54AM +0200, Johannes Berg wrote:
> From: Johannes Berg <johannes.berg@intel.com>
> 
> Many devices run firmware and/or complex hardware, and most of that
> can have bugs. When it misbehaves, however, it is often much harder
> to debug than software running on the host.
> 
> Introduce a "device coredump" mechanism to allow dumping internal
> device/firmware state through a generalized mechanism. As devices
> are different and information needed can vary accordingly, this
> doesn't prescribe a file format - it just provides mechanism to
> get data to be able to capture it in a generalized way (e.g. in
> distributions.)
> 
> Note that generalized capturing of such data may result in privacy
> issues, so users generally need to be involved. In order to allow
> certain users/system integrators/... to disable the feature at all,
> introduce a Kconfig option to override the drivers that would like
> to have the feature.
> 
> For now, this provides two ways of dumping data:
>  1) with a vmalloc'ed area, that is then given to the subsystem
>     and freed after retrieval or timeout
>  2) with a generalized reader/free function method
> 
> We could/should add more options, e.g. a list of pages, since the
> vmalloc area is very limited on some architectures.
> 
> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
> ---
>  MAINTAINERS                 |   7 ++
>  drivers/base/Kconfig        |  21 ++++
>  drivers/base/Makefile       |   1 +
>  drivers/base/devcoredump.c  | 256 ++++++++++++++++++++++++++++++++++++++++++++
>  include/linux/devcoredump.h |  35 ++++++
>  5 files changed, 320 insertions(+)
>  create mode 100644 drivers/base/devcoredump.c
>  create mode 100644 include/linux/devcoredump.h
> 
> diff --git a/MAINTAINERS b/MAINTAINERS
> index 2f85f55c8fb8..8e31d053fb14 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -2847,6 +2847,13 @@ T:	git git://git.kernel.org/pub/scm/linux/kernel/git/balbi/usb.git
>  S:	Maintained
>  F:	drivers/usb/dwc3/
>  
> +DEVICE COREDUMP (DEV_COREDUMP)
> +M:	Johannes Berg <johannes@sipsolutions.net>
> +L:	linux-kernel@vger.kernel.org
> +S:	Maintained
> +F:	drivers/base/devcoredump.c
> +F:	include/linux/devcoredump.h
> +
>  DEVICE FREQUENCY (DEVFREQ)
>  M:	MyungJoo Ham <myungjoo.ham@samsung.com>
>  M:	Kyungmin Park <kyungmin.park@samsung.com>
> diff --git a/drivers/base/Kconfig b/drivers/base/Kconfig
> index 4e7f0ff83ae7..134f763d90fd 100644
> --- a/drivers/base/Kconfig
> +++ b/drivers/base/Kconfig
> @@ -165,6 +165,27 @@ config FW_LOADER_USER_HELPER_FALLBACK
>  
>  	  If you are unsure about this, say N here.
>  
> +config WANT_DEV_COREDUMP
> +	bool
> +	help
> +	  Drivers should "select" this option if they desire to use the
> +	  device coredump mechanism.
> +
> +config DISABLE_DEV_COREDUMP
> +	bool "Disable device coredump" if EXPERT
> +	help
> +	  Disable the device coredump mechanism despite drivers wanting to
> +	  use it; this allows for more sensitive systems or systems that
> +	  don't want to ever access the information to not have the code,
> +	  nor keep any data.
> +
> +	  If unsure, say N.
> +
> +config DEV_COREDUMP
> +	bool
> +	default y if WANT_DEV_COREDUMP
> +	depends on !DISABLE_DEV_COREDUMP
> +
>  config DEBUG_DRIVER
>  	bool "Driver Core verbose debug messages"
>  	depends on DEBUG_KERNEL
> diff --git a/drivers/base/Makefile b/drivers/base/Makefile
> index 4aab26ec0292..6922cd6850a2 100644
> --- a/drivers/base/Makefile
> +++ b/drivers/base/Makefile
> @@ -21,6 +21,7 @@ obj-$(CONFIG_SYS_HYPERVISOR) += hypervisor.o
>  obj-$(CONFIG_REGMAP)	+= regmap/
>  obj-$(CONFIG_SOC_BUS) += soc.o
>  obj-$(CONFIG_PINCTRL) += pinctrl.o
> +obj-$(CONFIG_DEV_COREDUMP) += devcoredump.o
>  
>  ccflags-$(CONFIG_DEBUG_DRIVER) := -DDEBUG
>  
> diff --git a/drivers/base/devcoredump.c b/drivers/base/devcoredump.c
> new file mode 100644
> index 000000000000..d6ecf13a5903
> --- /dev/null
> +++ b/drivers/base/devcoredump.c
> @@ -0,0 +1,256 @@
> +/*
> + * This file is provided under the GPLv2 license.
> + *
> + * GPL LICENSE SUMMARY
> + *
> + * Copyright(c) 2014 Intel Mobile Communications GmbH
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of version 2 of the GNU General Public License as
> + * published by the Free Software Foundation.
> + *
> + * This program is distributed in the hope that it will be useful, but
> + * WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> + * General Public License for more details.
> + *
> + * The full GNU General Public License is included in this distribution
> + * in the file called COPYING.
> + *
> + * Contact Information:
> + *  Intel Linux Wireless <ilw@linux.intel.com>
> + * Intel Corporation, 5200 N.E. Elam Young Parkway, Hillsboro, OR 97124-6497
> + */
> +#include <linux/module.h>
> +#include <linux/device.h>
> +#include <linux/devcoredump.h>
> +#include <linux/list.h>
> +#include <linux/slab.h>
> +#include <linux/fs.h>
> +#include <linux/workqueue.h>
> +
> +MODULE_AUTHOR("Johannes Berg <johannes@sipsolutions.net>");
> +MODULE_DESCRIPTION("Device Coredump support");
> +MODULE_LICENSE("GPL");

As you only allow Y or N as build options, it's not really a "module" :)

> +/* if data isn't read by userspace after 5 minutes then delete it */
> +#define DEVCD_TIMEOUT	(HZ * 60 * 5)
> +
> +struct devcd_entry {
> +	struct device devcd_dev;
> +	const void *data;
> +	size_t datalen;
> +	struct module *owner;
> +	ssize_t (*read)(char *buffer, loff_t offset, size_t count,
> +			const void *data, size_t datalen);
> +	void (*free)(const void *data);
> +	struct delayed_work del_wk;
> +	struct device *failing_dev;
> +};
> +
> +static struct devcd_entry *dev_to_devcd(struct device *dev)
> +{
> +	return container_of(dev, struct devcd_entry, devcd_dev);
> +}
> +
> +static void devcd_dev_release(struct device *dev)
> +{
> +	struct devcd_entry *devcd = dev_to_devcd(dev);
> +
> +	devcd->free(devcd->data);
> +	module_put(devcd->owner);
> +
> +	/*
> +	 * this seems racy, but I don't see a notifier or such on
> +	 * a struct device to know when it goes away?
> +	 */
> +	if (devcd->failing_dev->kobj.sd)
> +		sysfs_delete_link(&devcd->failing_dev->kobj, &dev->kobj,
> +				  "dev_coredump");

What is this link?  It should "just go away" if this:

> +	put_device(devcd->failing_dev);

was the last put_device() call on the failing_dev, right?  So you
shouldn't need to make this call to sysfs_delete_link().

> +	kfree(devcd);
> +}
> +
> +static void devcd_del(struct work_struct *wk)
> +{
> +	struct devcd_entry *devcd;
> +
> +	devcd = container_of(wk, struct devcd_entry, del_wk.work);
> +
> +	device_del(&devcd->devcd_dev);
> +	put_device(&devcd->devcd_dev);
> +}
> +
> +static ssize_t devcd_data_read(struct file *filp, struct kobject *kobj,
> +			       struct bin_attribute *bin_attr,
> +			       char *buffer, loff_t offset, size_t count)
> +{
> +	struct device *dev = kobj_to_dev(kobj);
> +	struct devcd_entry *devcd = dev_to_devcd(dev);
> +
> +	return devcd->read(buffer, offset, count, devcd->data, devcd->datalen);
> +}
> +
> +static ssize_t devcd_data_write(struct file *filp, struct kobject *kobj,
> +				struct bin_attribute *bin_attr,
> +				char *buffer, loff_t offset, size_t count)
> +{
> +	struct device *dev = kobj_to_dev(kobj);
> +	struct devcd_entry *devcd = dev_to_devcd(dev);
> +
> +	mod_delayed_work(system_wq, &devcd->del_wk, 0);
> +
> +	return count;
> +}
> +
> +static struct bin_attribute devcd_attr_data = {
> +	.attr = { .name = "data", .mode = S_IRUSR | S_IWUSR, },
> +	.size = 0,
> +	.read = devcd_data_read,
> +	.write = devcd_data_write,
> +};
> +
> +static struct bin_attribute *devcd_dev_bin_attrs[] = {
> +	&devcd_attr_data, NULL,
> +};
> +
> +static const struct attribute_group devcd_dev_group = {
> +	.bin_attrs = devcd_dev_bin_attrs,
> +};
> +
> +static const struct attribute_group *devcd_dev_groups[] = {
> +	&devcd_dev_group, NULL,
> +};
> +
> +static struct class devcd_class = {
> +	.name		= "devcoredump",
> +	.owner		= THIS_MODULE,
> +	.dev_release	= devcd_dev_release,
> +	.dev_groups	= devcd_dev_groups,
> +};
> +
> +static ssize_t devcd_readv(char *buffer, loff_t offset, size_t count,
> +			   const void *data, size_t datalen)
> +{
> +	if (offset > datalen)
> +		return -EINVAL;
> +
> +	if (offset + count > datalen)
> +		count = datalen - offset;
> +
> +	if (count)
> +		memcpy(buffer, ((u8 *)data) + offset, count);
> +
> +	return count;
> +}
> +
> +/**
> + * dev_coredumpv - create firmware coredump with vmalloc data
> + * @dev: the struct device for the crashed device
> + * @data: vmalloc data containing the firmware coredump
> + * @datalen: length of the data
> + * @gfp: allocation flags
> + */
> +void dev_coredumpv(struct device *dev, const void *data, size_t datalen,
> +		   gfp_t gfp)
> +{
> +	dev_coredumpm(dev, NULL, data, datalen, gfp, devcd_readv, vfree);
> +}
> +EXPORT_SYMBOL(dev_coredumpv);

driver core exports are all EXPORT_SYMBOL_GPL() please.

> +static int devcd_match_failing(struct device *dev, const void *failing)
> +{
> +	struct devcd_entry *devcd = dev_to_devcd(dev);
> +
> +	return devcd->failing_dev == failing;
> +}
> +
> +/**
> + * dev_coredumpm - create firmware coredump with read/free methods
> + * @dev: the struct device for the crashed device
> + * @data: data cookie for the @read/@free functions
> + * @datalen: length of the data
> + * @gfp: allocation flags
> + * @read: function to read from the given buffer
> + * @free: function to free the given buffer
> + */

You forgot @owner in this list.

> +void dev_coredumpm(struct device *dev, struct module *owner,
> +		   const void *data, size_t datalen, gfp_t gfp,
> +		   ssize_t (*read)(char *buffer, loff_t offset, size_t count,
> +				   const void *data, size_t datalen),
> +		   void (*free)(const void *data))
> +{
> +	static atomic_t devcd_count = ATOMIC_INIT(0);
> +	struct devcd_entry *devcd;
> +	struct device *existing;
> +
> +	existing = class_find_device(&devcd_class, NULL, dev,
> +				     devcd_match_failing);
> +	if (existing) {
> +		put_device(existing);
> +		return;
> +	}

I thought multiple dumps per "device" would throw away the older ones?
It's fine if you don't, but you might want to document this behavior in
the kerneldoc for the function.

Other than those very minor things, looks good to me, want to resend it
cleaned up and without the "RFC" in the Subject?

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [RFC v2] device coredump: add new device coredump class
@ 2014-09-07  7:42   ` Arik Nemtsov
  0 siblings, 0 replies; 16+ messages in thread
From: Arik Nemtsov @ 2014-09-07  7:42 UTC (permalink / raw)
  To: Johannes Berg
  Cc: linux-kernel, Greg Kroah-Hartman, Daniel Vetter,
	Emmanuel Grumbach, Luciano Coelho, Kalle Valo, dri-devel,
	linux-wireless

On Fri, Sep 5, 2014 at 11:50 AM, Johannes Berg
<johannes@sipsolutions.net> wrote:
> From: Johannes Berg <johannes.berg@intel.com>
>
> Many devices run firmware and/or complex hardware, and most of that
> can have bugs. When it misbehaves, however, it is often much harder
> to debug than software running on the host.
>

Very nice concept IMO :)

[....]
> +void dev_coredumpm(struct device *dev, struct module *owner,
> +                  const void *data, size_t datalen, gfp_t gfp,
> +                  ssize_t (*read)(char *buffer, loff_t offset, size_t count,
> +                                  const void *data, size_t datalen),
> +                  void (*free)(const void *data))
> +{
> +       static atomic_t devcd_count = ATOMIC_INIT(0);
> +       struct devcd_entry *devcd;
> +       struct device *existing;
> +
> +       existing = class_find_device(&devcd_class, NULL, dev,
> +                                    devcd_match_failing);
> +       if (existing) {
> +               put_device(existing);
> +               return;
> +       }
> +
> +       if (!try_module_get(owner))
> +               return;
> +
> +       devcd = kzalloc(sizeof(*devcd), gfp);
> +       if (!devcd)
> +               goto put_module;
> +
> +       devcd->owner = owner;
> +       devcd->data = data;
> +       devcd->datalen = datalen;
> +       devcd->read = read;
> +       devcd->free = free;
> +       devcd->failing_dev = get_device(dev);
> +
> +       device_initialize(&devcd->devcd_dev);
> +
> +       dev_set_name(&devcd->devcd_dev, "devcd%d",
> +                    atomic_inc_return(&devcd_count));
> +       devcd->devcd_dev.class = &devcd_class;
> +
> +       if (device_add(&devcd->devcd_dev))
> +               goto put_device;
> +
> +       if (sysfs_create_link(&devcd->devcd_dev.kobj, &dev->kobj,
> +                             "failing_device"))
> +               /* nothing - symlink will be missing */;
> +
> +       if (sysfs_create_link(&dev->kobj, &devcd->devcd_dev.kobj,
> +                             "dev_coredump"))
> +               /* nothing - symlink will be missing */;

Aren't you confusing the compiler here and above, doing the
INIT_DELAYED_WORK only when link creation failed? I would have thought
some braces are necessary..

> +
> +       INIT_DELAYED_WORK(&devcd->del_wk, devcd_del);
> +       schedule_delayed_work(&devcd->del_wk, DEVCD_TIMEOUT);
> +
> +       return;
> + put_device:
> +       put_device(&devcd->devcd_dev);
> + put_module:
> +       module_put(owner);
> +}
> +EXPORT_SYMBOL(dev_coredumpm);

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [RFC v2] device coredump: add new device coredump class
@ 2014-09-07  7:42   ` Arik Nemtsov
  0 siblings, 0 replies; 16+ messages in thread
From: Arik Nemtsov @ 2014-09-07  7:42 UTC (permalink / raw)
  To: Johannes Berg
  Cc: linux-kernel-u79uwXL29TY76Z2rM5mHXA, Greg Kroah-Hartman,
	Daniel Vetter, Emmanuel Grumbach, Luciano Coelho, Kalle Valo,
	dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
	linux-wireless-u79uwXL29TY76Z2rM5mHXA

On Fri, Sep 5, 2014 at 11:50 AM, Johannes Berg
<johannes-cdvu00un1VgdHxzADdlk8Q@public.gmane.org> wrote:
> From: Johannes Berg <johannes.berg-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
>
> Many devices run firmware and/or complex hardware, and most of that
> can have bugs. When it misbehaves, however, it is often much harder
> to debug than software running on the host.
>

Very nice concept IMO :)

[....]
> +void dev_coredumpm(struct device *dev, struct module *owner,
> +                  const void *data, size_t datalen, gfp_t gfp,
> +                  ssize_t (*read)(char *buffer, loff_t offset, size_t count,
> +                                  const void *data, size_t datalen),
> +                  void (*free)(const void *data))
> +{
> +       static atomic_t devcd_count = ATOMIC_INIT(0);
> +       struct devcd_entry *devcd;
> +       struct device *existing;
> +
> +       existing = class_find_device(&devcd_class, NULL, dev,
> +                                    devcd_match_failing);
> +       if (existing) {
> +               put_device(existing);
> +               return;
> +       }
> +
> +       if (!try_module_get(owner))
> +               return;
> +
> +       devcd = kzalloc(sizeof(*devcd), gfp);
> +       if (!devcd)
> +               goto put_module;
> +
> +       devcd->owner = owner;
> +       devcd->data = data;
> +       devcd->datalen = datalen;
> +       devcd->read = read;
> +       devcd->free = free;
> +       devcd->failing_dev = get_device(dev);
> +
> +       device_initialize(&devcd->devcd_dev);
> +
> +       dev_set_name(&devcd->devcd_dev, "devcd%d",
> +                    atomic_inc_return(&devcd_count));
> +       devcd->devcd_dev.class = &devcd_class;
> +
> +       if (device_add(&devcd->devcd_dev))
> +               goto put_device;
> +
> +       if (sysfs_create_link(&devcd->devcd_dev.kobj, &dev->kobj,
> +                             "failing_device"))
> +               /* nothing - symlink will be missing */;
> +
> +       if (sysfs_create_link(&dev->kobj, &devcd->devcd_dev.kobj,
> +                             "dev_coredump"))
> +               /* nothing - symlink will be missing */;

Aren't you confusing the compiler here and above, doing the
INIT_DELAYED_WORK only when link creation failed? I would have thought
some braces are necessary..

> +
> +       INIT_DELAYED_WORK(&devcd->del_wk, devcd_del);
> +       schedule_delayed_work(&devcd->del_wk, DEVCD_TIMEOUT);
> +
> +       return;
> + put_device:
> +       put_device(&devcd->devcd_dev);
> + put_module:
> +       module_put(owner);
> +}
> +EXPORT_SYMBOL(dev_coredumpm);
--
To unsubscribe from this list: send the line "unsubscribe linux-wireless" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [RFC v2] device coredump: add new device coredump class
  2014-09-05 22:13   ` Greg Kroah-Hartman
@ 2014-09-07  9:37     ` Johannes Berg
  -1 siblings, 0 replies; 16+ messages in thread
From: Johannes Berg @ 2014-09-07  9:37 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: linux-kernel, Daniel Vetter, Emmanuel Grumbach, Luciano Coelho,
	Kalle Valo, dri-devel, linux-wireless

On Fri, 2014-09-05 at 15:13 -0700, Greg Kroah-Hartman wrote:

> > +MODULE_AUTHOR("Johannes Berg <johannes@sipsolutions.net>");
> > +MODULE_DESCRIPTION("Device Coredump support");
> > +MODULE_LICENSE("GPL");
> 
> As you only allow Y or N as build options, it's not really a "module" :)

Umm, yeah. I went back and forth on whether this should be allowed to be
modular, but then decided it wasn't big enough to be worth it.

> > +	/*
> > +	 * this seems racy, but I don't see a notifier or such on
> > +	 * a struct device to know when it goes away?
> > +	 */
> > +	if (devcd->failing_dev->kobj.sd)
> > +		sysfs_delete_link(&devcd->failing_dev->kobj, &dev->kobj,
> > +				  "dev_coredump");
> 
> What is this link?  It should "just go away" if this:
> 
> > +	put_device(devcd->failing_dev);
> 
> was the last put_device() call on the failing_dev, right?  So you
> shouldn't need to make this call to sysfs_delete_link().

Oh, thanks, I'll try that. I did something slightly different first and
ended up with dead symlinks, but in that case I think I was actually
removing the bus/class of the device while the device was still alive or
something - that was a big mess.

> > +void dev_coredumpm(struct device *dev, struct module *owner,
> > +		   const void *data, size_t datalen, gfp_t gfp,
> > +		   ssize_t (*read)(char *buffer, loff_t offset, size_t count,
> > +				   const void *data, size_t datalen),
> > +		   void (*free)(const void *data))
> > +{
> > +	static atomic_t devcd_count = ATOMIC_INIT(0);
> > +	struct devcd_entry *devcd;
> > +	struct device *existing;
> > +
> > +	existing = class_find_device(&devcd_class, NULL, dev,
> > +				     devcd_match_failing);
> > +	if (existing) {
> > +		put_device(existing);
> > +		return;
> > +	}
> 
> I thought multiple dumps per "device" would throw away the older ones?
> It's fine if you don't, but you might want to document this behavior in
> the kerneldoc for the function.

I ... umm ... Yeah. I need to free the new one here. I actually wanted
to keep the first one because it seems likely that the driver would
attempt some sort of recovery and then if it happens to crash the device
again it's probably not very helpful to see the last crash.

But I definitely need to free the new one and document the behaviour.
Maybe if somebody needs something else we can make it configurable, but
for now I think all potential users have indicated that they'd prefer
keeping the first dump.

> Other than those very minor things, looks good to me, want to resend it
> cleaned up and without the "RFC" in the Subject?

Tomorrow :)

johannes


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [RFC v2] device coredump: add new device coredump class
@ 2014-09-07  9:37     ` Johannes Berg
  0 siblings, 0 replies; 16+ messages in thread
From: Johannes Berg @ 2014-09-07  9:37 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Emmanuel Grumbach, Daniel Vetter, Kalle Valo, linux-wireless,
	linux-kernel, dri-devel, Luciano Coelho

On Fri, 2014-09-05 at 15:13 -0700, Greg Kroah-Hartman wrote:

> > +MODULE_AUTHOR("Johannes Berg <johannes@sipsolutions.net>");
> > +MODULE_DESCRIPTION("Device Coredump support");
> > +MODULE_LICENSE("GPL");
> 
> As you only allow Y or N as build options, it's not really a "module" :)

Umm, yeah. I went back and forth on whether this should be allowed to be
modular, but then decided it wasn't big enough to be worth it.

> > +	/*
> > +	 * this seems racy, but I don't see a notifier or such on
> > +	 * a struct device to know when it goes away?
> > +	 */
> > +	if (devcd->failing_dev->kobj.sd)
> > +		sysfs_delete_link(&devcd->failing_dev->kobj, &dev->kobj,
> > +				  "dev_coredump");
> 
> What is this link?  It should "just go away" if this:
> 
> > +	put_device(devcd->failing_dev);
> 
> was the last put_device() call on the failing_dev, right?  So you
> shouldn't need to make this call to sysfs_delete_link().

Oh, thanks, I'll try that. I did something slightly different first and
ended up with dead symlinks, but in that case I think I was actually
removing the bus/class of the device while the device was still alive or
something - that was a big mess.

> > +void dev_coredumpm(struct device *dev, struct module *owner,
> > +		   const void *data, size_t datalen, gfp_t gfp,
> > +		   ssize_t (*read)(char *buffer, loff_t offset, size_t count,
> > +				   const void *data, size_t datalen),
> > +		   void (*free)(const void *data))
> > +{
> > +	static atomic_t devcd_count = ATOMIC_INIT(0);
> > +	struct devcd_entry *devcd;
> > +	struct device *existing;
> > +
> > +	existing = class_find_device(&devcd_class, NULL, dev,
> > +				     devcd_match_failing);
> > +	if (existing) {
> > +		put_device(existing);
> > +		return;
> > +	}
> 
> I thought multiple dumps per "device" would throw away the older ones?
> It's fine if you don't, but you might want to document this behavior in
> the kerneldoc for the function.

I ... umm ... Yeah. I need to free the new one here. I actually wanted
to keep the first one because it seems likely that the driver would
attempt some sort of recovery and then if it happens to crash the device
again it's probably not very helpful to see the last crash.

But I definitely need to free the new one and document the behaviour.
Maybe if somebody needs something else we can make it configurable, but
for now I think all potential users have indicated that they'd prefer
keeping the first dump.

> Other than those very minor things, looks good to me, want to resend it
> cleaned up and without the "RFC" in the Subject?

Tomorrow :)

johannes

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [RFC v2] device coredump: add new device coredump class
  2014-09-05  8:50 [RFC v2] device coredump: add new device coredump class Johannes Berg
@ 2014-09-08  8:36   ` Arend van Spriel
  2014-09-05 22:13   ` Greg Kroah-Hartman
                     ` (2 subsequent siblings)
  3 siblings, 0 replies; 16+ messages in thread
From: Arend van Spriel @ 2014-09-08  8:36 UTC (permalink / raw)
  To: Johannes Berg
  Cc: linux-kernel, Greg Kroah-Hartman, Daniel Vetter,
	Emmanuel Grumbach, Luciano Coelho, Kalle Valo, dri-devel,
	linux-wireless

On 09/05/14 10:50, Johannes Berg wrote:
> From: Johannes Berg<johannes.berg@intel.com>
>
> Many devices run firmware and/or complex hardware, and most of that
> can have bugs. When it misbehaves, however, it is often much harder
> to debug than software running on the host.
>
> Introduce a "device coredump" mechanism to allow dumping internal
> device/firmware state through a generalized mechanism. As devices
> are different and information needed can vary accordingly, this
> doesn't prescribe a file format - it just provides mechanism to
> get data to be able to capture it in a generalized way (e.g. in
> distributions.)

Although it can be found in the patch it would not hurt to point out 
where the coredump can be found in sysfs in this patch description.

[...]

> +static struct class devcd_class = {
> +	.name		= "devcoredump",
> +	.owner		= THIS_MODULE,
> +	.dev_release	= devcd_dev_release,
> +	.dev_groups	= devcd_dev_groups,
> +};

[...]

> +/**
> + * dev_coredumpm - create firmware coredump with read/free methods
> + * @dev: the struct device for the crashed device
> + * @data: data cookie for the @read/@free functions
> + * @datalen: length of the data
> + * @gfp: allocation flags
> + * @read: function to read from the given buffer
> + * @free: function to free the given buffer
> + */
> +void dev_coredumpm(struct device *dev, struct module *owner,
> +		   const void *data, size_t datalen, gfp_t gfp,
> +		   ssize_t (*read)(char *buffer, loff_t offset, size_t count,
> +				   const void *data, size_t datalen),
> +		   void (*free)(const void *data))
> +{
> +	static atomic_t devcd_count = ATOMIC_INIT(0);
> +	struct devcd_entry *devcd;
> +	struct device *existing;
> +
> +	existing = class_find_device(&devcd_class, NULL, dev,
> +				     devcd_match_failing);
> +	if (existing) {
> +		put_device(existing);
> +		return;
> +	}
> +
> +	if (!try_module_get(owner))
> +		return;
> +
> +	devcd = kzalloc(sizeof(*devcd), gfp);
> +	if (!devcd)
> +		goto put_module;
> +
> +	devcd->owner = owner;
> +	devcd->data = data;
> +	devcd->datalen = datalen;
> +	devcd->read = read;
> +	devcd->free = free;
> +	devcd->failing_dev = get_device(dev);
> +
> +	device_initialize(&devcd->devcd_dev);
> +
> +	dev_set_name(&devcd->devcd_dev, "devcd%d",
> +		     atomic_inc_return(&devcd_count));
> +	devcd->devcd_dev.class =&devcd_class;
> +
> +	if (device_add(&devcd->devcd_dev))
> +		goto put_device;
> +
> +	if (sysfs_create_link(&devcd->devcd_dev.kobj,&dev->kobj,
> +			      "failing_device"))
> +		/* nothing - symlink will be missing */;
> +
> +	if (sysfs_create_link(&dev->kobj,&devcd->devcd_dev.kobj,
> +			      "dev_coredump"))
> +		/* nothing - symlink will be missing */;

The class is called "devcoredump" so you may want to be consistent here 
and get rid of the underscore.

Regards,
Arend

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [RFC v2] device coredump: add new device coredump class
@ 2014-09-08  8:36   ` Arend van Spriel
  0 siblings, 0 replies; 16+ messages in thread
From: Arend van Spriel @ 2014-09-08  8:36 UTC (permalink / raw)
  To: Johannes Berg
  Cc: linux-kernel, Greg Kroah-Hartman, Daniel Vetter,
	Emmanuel Grumbach, Luciano Coelho, Kalle Valo, dri-devel,
	linux-wireless

On 09/05/14 10:50, Johannes Berg wrote:
> From: Johannes Berg<johannes.berg@intel.com>
>
> Many devices run firmware and/or complex hardware, and most of that
> can have bugs. When it misbehaves, however, it is often much harder
> to debug than software running on the host.
>
> Introduce a "device coredump" mechanism to allow dumping internal
> device/firmware state through a generalized mechanism. As devices
> are different and information needed can vary accordingly, this
> doesn't prescribe a file format - it just provides mechanism to
> get data to be able to capture it in a generalized way (e.g. in
> distributions.)

Although it can be found in the patch it would not hurt to point out 
where the coredump can be found in sysfs in this patch description.

[...]

> +static struct class devcd_class = {
> +	.name		= "devcoredump",
> +	.owner		= THIS_MODULE,
> +	.dev_release	= devcd_dev_release,
> +	.dev_groups	= devcd_dev_groups,
> +};

[...]

> +/**
> + * dev_coredumpm - create firmware coredump with read/free methods
> + * @dev: the struct device for the crashed device
> + * @data: data cookie for the @read/@free functions
> + * @datalen: length of the data
> + * @gfp: allocation flags
> + * @read: function to read from the given buffer
> + * @free: function to free the given buffer
> + */
> +void dev_coredumpm(struct device *dev, struct module *owner,
> +		   const void *data, size_t datalen, gfp_t gfp,
> +		   ssize_t (*read)(char *buffer, loff_t offset, size_t count,
> +				   const void *data, size_t datalen),
> +		   void (*free)(const void *data))
> +{
> +	static atomic_t devcd_count = ATOMIC_INIT(0);
> +	struct devcd_entry *devcd;
> +	struct device *existing;
> +
> +	existing = class_find_device(&devcd_class, NULL, dev,
> +				     devcd_match_failing);
> +	if (existing) {
> +		put_device(existing);
> +		return;
> +	}
> +
> +	if (!try_module_get(owner))
> +		return;
> +
> +	devcd = kzalloc(sizeof(*devcd), gfp);
> +	if (!devcd)
> +		goto put_module;
> +
> +	devcd->owner = owner;
> +	devcd->data = data;
> +	devcd->datalen = datalen;
> +	devcd->read = read;
> +	devcd->free = free;
> +	devcd->failing_dev = get_device(dev);
> +
> +	device_initialize(&devcd->devcd_dev);
> +
> +	dev_set_name(&devcd->devcd_dev, "devcd%d",
> +		     atomic_inc_return(&devcd_count));
> +	devcd->devcd_dev.class =&devcd_class;
> +
> +	if (device_add(&devcd->devcd_dev))
> +		goto put_device;
> +
> +	if (sysfs_create_link(&devcd->devcd_dev.kobj,&dev->kobj,
> +			      "failing_device"))
> +		/* nothing - symlink will be missing */;
> +
> +	if (sysfs_create_link(&dev->kobj,&devcd->devcd_dev.kobj,
> +			      "dev_coredump"))
> +		/* nothing - symlink will be missing */;

The class is called "devcoredump" so you may want to be consistent here 
and get rid of the underscore.

Regards,
Arend

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [RFC v2] device coredump: add new device coredump class
  2014-09-05 22:13   ` Greg Kroah-Hartman
@ 2014-09-08  8:38     ` Johannes Berg
  -1 siblings, 0 replies; 16+ messages in thread
From: Johannes Berg @ 2014-09-08  8:38 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: linux-kernel, Daniel Vetter, Emmanuel Grumbach, Luciano Coelho,
	Kalle Valo, dri-devel, linux-wireless

On Fri, 2014-09-05 at 15:13 -0700, Greg Kroah-Hartman wrote:

> > +	/*
> > +	 * this seems racy, but I don't see a notifier or such on
> > +	 * a struct device to know when it goes away?
> > +	 */
> > +	if (devcd->failing_dev->kobj.sd)
> > +		sysfs_delete_link(&devcd->failing_dev->kobj, &dev->kobj,
> > +				  "dev_coredump");
> 
> What is this link?  It should "just go away" if this:
> 
> > +	put_device(devcd->failing_dev);
> 
> was the last put_device() call on the failing_dev, right?  So you
> shouldn't need to make this call to sysfs_delete_link().

I looked at this again, and it's the other way around. This is the link
that Daniel requested, from the original device to the one that's being
freed. For whatever reason though, symlinks don't automatically go away
when freed:

(with a test patch that makes "mac80211-hwsim" crash whenever we have
radar)

# echo 1 > /sys/kernel/debug/ieee80211/phy0/hwsim/dfs_simulate_radar
# ls /sys/class/devcoredump/
devcd1
# ls /sys/class/devcoredump/devcd1/
data            failing_device/ power/          subsystem/      uevent
# ls -l /sys/class/mac80211_hwsim/hwsim0/dev_coredump
lrwxrwxrwx 1 root root 0 Sep  8 08:34 /sys/class/mac80211_hwsim/hwsim0/dev_coredump -> ../../devcoredump/devcd1
# echo > /sys/class/devcoredump/devcd1/data
# ls /sys/class/devcoredump/
# ls -l /sys/class/mac80211_hwsim/hwsim0/dev_coredump
lrwxrwxrwx 1 root root 0 Sep  8 08:34 /sys/class/mac80211_hwsim/hwsim0/dev_coredump -> ../../devcoredump/devcd1

(but the link is now dead)

Maybe I'm creating the links in the wrong way so they don't
automatically get removed when the struct device is released?

johannes


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [RFC v2] device coredump: add new device coredump class
@ 2014-09-08  8:38     ` Johannes Berg
  0 siblings, 0 replies; 16+ messages in thread
From: Johannes Berg @ 2014-09-08  8:38 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Emmanuel Grumbach, Daniel Vetter, Kalle Valo, linux-wireless,
	linux-kernel, dri-devel, Luciano Coelho

On Fri, 2014-09-05 at 15:13 -0700, Greg Kroah-Hartman wrote:

> > +	/*
> > +	 * this seems racy, but I don't see a notifier or such on
> > +	 * a struct device to know when it goes away?
> > +	 */
> > +	if (devcd->failing_dev->kobj.sd)
> > +		sysfs_delete_link(&devcd->failing_dev->kobj, &dev->kobj,
> > +				  "dev_coredump");
> 
> What is this link?  It should "just go away" if this:
> 
> > +	put_device(devcd->failing_dev);
> 
> was the last put_device() call on the failing_dev, right?  So you
> shouldn't need to make this call to sysfs_delete_link().

I looked at this again, and it's the other way around. This is the link
that Daniel requested, from the original device to the one that's being
freed. For whatever reason though, symlinks don't automatically go away
when freed:

(with a test patch that makes "mac80211-hwsim" crash whenever we have
radar)

# echo 1 > /sys/kernel/debug/ieee80211/phy0/hwsim/dfs_simulate_radar
# ls /sys/class/devcoredump/
devcd1
# ls /sys/class/devcoredump/devcd1/
data            failing_device/ power/          subsystem/      uevent
# ls -l /sys/class/mac80211_hwsim/hwsim0/dev_coredump
lrwxrwxrwx 1 root root 0 Sep  8 08:34 /sys/class/mac80211_hwsim/hwsim0/dev_coredump -> ../../devcoredump/devcd1
# echo > /sys/class/devcoredump/devcd1/data
# ls /sys/class/devcoredump/
# ls -l /sys/class/mac80211_hwsim/hwsim0/dev_coredump
lrwxrwxrwx 1 root root 0 Sep  8 08:34 /sys/class/mac80211_hwsim/hwsim0/dev_coredump -> ../../devcoredump/devcd1

(but the link is now dead)

Maybe I'm creating the links in the wrong way so they don't
automatically get removed when the struct device is released?

johannes

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [RFC v2] device coredump: add new device coredump class
  2014-09-08  8:38     ` Johannes Berg
@ 2014-09-12 16:40       ` Greg Kroah-Hartman
  -1 siblings, 0 replies; 16+ messages in thread
From: Greg Kroah-Hartman @ 2014-09-12 16:40 UTC (permalink / raw)
  To: Johannes Berg
  Cc: linux-kernel, Daniel Vetter, Emmanuel Grumbach, Luciano Coelho,
	Kalle Valo, dri-devel, linux-wireless

On Mon, Sep 08, 2014 at 10:38:07AM +0200, Johannes Berg wrote:
> On Fri, 2014-09-05 at 15:13 -0700, Greg Kroah-Hartman wrote:
> 
> > > +	/*
> > > +	 * this seems racy, but I don't see a notifier or such on
> > > +	 * a struct device to know when it goes away?
> > > +	 */
> > > +	if (devcd->failing_dev->kobj.sd)
> > > +		sysfs_delete_link(&devcd->failing_dev->kobj, &dev->kobj,
> > > +				  "dev_coredump");
> > 
> > What is this link?  It should "just go away" if this:
> > 
> > > +	put_device(devcd->failing_dev);
> > 
> > was the last put_device() call on the failing_dev, right?  So you
> > shouldn't need to make this call to sysfs_delete_link().
> 
> I looked at this again, and it's the other way around. This is the link
> that Daniel requested, from the original device to the one that's being
> freed. For whatever reason though, symlinks don't automatically go away
> when freed:
> 
> (with a test patch that makes "mac80211-hwsim" crash whenever we have
> radar)
> 
> # echo 1 > /sys/kernel/debug/ieee80211/phy0/hwsim/dfs_simulate_radar
> # ls /sys/class/devcoredump/
> devcd1
> # ls /sys/class/devcoredump/devcd1/
> data            failing_device/ power/          subsystem/      uevent
> # ls -l /sys/class/mac80211_hwsim/hwsim0/dev_coredump
> lrwxrwxrwx 1 root root 0 Sep  8 08:34 /sys/class/mac80211_hwsim/hwsim0/dev_coredump -> ../../devcoredump/devcd1
> # echo > /sys/class/devcoredump/devcd1/data
> # ls /sys/class/devcoredump/
> # ls -l /sys/class/mac80211_hwsim/hwsim0/dev_coredump
> lrwxrwxrwx 1 root root 0 Sep  8 08:34 /sys/class/mac80211_hwsim/hwsim0/dev_coredump -> ../../devcoredump/devcd1
> 
> (but the link is now dead)
> 
> Maybe I'm creating the links in the wrong way so they don't
> automatically get removed when the struct device is released?

No, you are correct, sorry for the noise, I thought the symlink was the
other way around for some reason.

greg k-h

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [RFC v2] device coredump: add new device coredump class
@ 2014-09-12 16:40       ` Greg Kroah-Hartman
  0 siblings, 0 replies; 16+ messages in thread
From: Greg Kroah-Hartman @ 2014-09-12 16:40 UTC (permalink / raw)
  To: Johannes Berg
  Cc: Emmanuel Grumbach, Daniel Vetter, Kalle Valo, linux-wireless,
	linux-kernel, dri-devel, Luciano Coelho

On Mon, Sep 08, 2014 at 10:38:07AM +0200, Johannes Berg wrote:
> On Fri, 2014-09-05 at 15:13 -0700, Greg Kroah-Hartman wrote:
> 
> > > +	/*
> > > +	 * this seems racy, but I don't see a notifier or such on
> > > +	 * a struct device to know when it goes away?
> > > +	 */
> > > +	if (devcd->failing_dev->kobj.sd)
> > > +		sysfs_delete_link(&devcd->failing_dev->kobj, &dev->kobj,
> > > +				  "dev_coredump");
> > 
> > What is this link?  It should "just go away" if this:
> > 
> > > +	put_device(devcd->failing_dev);
> > 
> > was the last put_device() call on the failing_dev, right?  So you
> > shouldn't need to make this call to sysfs_delete_link().
> 
> I looked at this again, and it's the other way around. This is the link
> that Daniel requested, from the original device to the one that's being
> freed. For whatever reason though, symlinks don't automatically go away
> when freed:
> 
> (with a test patch that makes "mac80211-hwsim" crash whenever we have
> radar)
> 
> # echo 1 > /sys/kernel/debug/ieee80211/phy0/hwsim/dfs_simulate_radar
> # ls /sys/class/devcoredump/
> devcd1
> # ls /sys/class/devcoredump/devcd1/
> data            failing_device/ power/          subsystem/      uevent
> # ls -l /sys/class/mac80211_hwsim/hwsim0/dev_coredump
> lrwxrwxrwx 1 root root 0 Sep  8 08:34 /sys/class/mac80211_hwsim/hwsim0/dev_coredump -> ../../devcoredump/devcd1
> # echo > /sys/class/devcoredump/devcd1/data
> # ls /sys/class/devcoredump/
> # ls -l /sys/class/mac80211_hwsim/hwsim0/dev_coredump
> lrwxrwxrwx 1 root root 0 Sep  8 08:34 /sys/class/mac80211_hwsim/hwsim0/dev_coredump -> ../../devcoredump/devcd1
> 
> (but the link is now dead)
> 
> Maybe I'm creating the links in the wrong way so they don't
> automatically get removed when the struct device is released?

No, you are correct, sorry for the noise, I thought the symlink was the
other way around for some reason.

greg k-h

^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2014-09-12 16:40 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-09-05  8:50 [RFC v2] device coredump: add new device coredump class Johannes Berg
2014-09-05  9:40 ` Jonas Gorski
2014-09-05  9:56   ` Johannes Berg
2014-09-05  9:56     ` Johannes Berg
2014-09-05 22:13 ` Greg Kroah-Hartman
2014-09-05 22:13   ` Greg Kroah-Hartman
2014-09-07  9:37   ` Johannes Berg
2014-09-07  9:37     ` Johannes Berg
2014-09-08  8:38   ` Johannes Berg
2014-09-08  8:38     ` Johannes Berg
2014-09-12 16:40     ` Greg Kroah-Hartman
2014-09-12 16:40       ` Greg Kroah-Hartman
2014-09-07  7:42 ` Arik Nemtsov
2014-09-07  7:42   ` Arik Nemtsov
2014-09-08  8:36 ` Arend van Spriel
2014-09-08  8:36   ` Arend van Spriel

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.