* [RFC][PATCH 0/5] tracing: Add new file system tracefs
@ 2015-01-21 17:19 Steven Rostedt
  2015-01-21 17:19 ` [RFC][PATCH 1/5] tracefs: Add new tracefs file system Steven Rostedt
                   ` (6 more replies)
  0 siblings, 7 replies; 26+ messages in thread
From: Steven Rostedt @ 2015-01-21 17:19 UTC (permalink / raw)
  To: linux-kernel; +Cc: Al Viro, Greg Kroah-Hartman, Ingo Molnar, Andrew Morton


There have been complaints that tracing is tied too much to debugfs,
as there are systems that would like to perform tracing, but do
not mount debugfs for security reasons. That is because any subsystem
may use debugfs for debugging, and these interfaces are not always
tested for security.

Creating a new tracefs, to which the tracing directory is now attached,
lets system admins access the tracing directory without the need to
mount debugfs.

Another advantage is that debugfs does not support the system calls
for mkdir and rmdir. Tracing uses these system calls to create new
instances for sub buffers. This was done by a hack that hijacked the
dentry ops from the "instances" debugfs dentry and replaced them with
ops that could work.

Instead of using this hack, tracefs can provide a proper interface to
allow the tracing system to have a mkdir and rmdir feature.
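
From user space, the mkdir/rmdir interface described above could be
exercised roughly as follows. This is only a sketch: the helper names
trace_instance_add() and trace_instance_del() are hypothetical, and the
tracing directory passed in (for example /sys/kernel/debug/tracing) is an
assumption about where the caller has the tracing directory mounted.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Build "<tracing_dir>/instances/<name>" into buf.
 * Returns 0 on success, -1 if the path does not fit.
 */
int trace_instance_path(char *buf, size_t len,
			const char *tracing_dir, const char *name)
{
	int n;

	assert(buf && tracing_dir && name);
	n = snprintf(buf, len, "%s/instances/%s", tracing_dir, name);
	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/* Create an instance with mkdir(2). Returns 0 or a negative errno. */
int trace_instance_add(const char *tracing_dir, const char *name)
{
	char path[4096];

	if (trace_instance_path(path, sizeof(path), tracing_dir, name))
		return -ENAMETOOLONG;
	return mkdir(path, 0750) ? -errno : 0;
}

/* Remove an instance with rmdir(2). Returns 0 or a negative errno. */
int trace_instance_del(const char *tracing_dir, const char *name)
{
	char path[4096];

	if (trace_instance_path(path, sizeof(path), tracing_dir, name))
		return -ENAMETOOLONG;
	return rmdir(path) ? -errno : 0;
}
```

A caller would use trace_instance_add("/sys/kernel/debug/tracing", "foo")
and later trace_instance_del() with the same arguments. Both return a
negative errno on failure so the caller can tell -EEXIST from -ENOENT.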

To maintain backward compatibility with older tools that expect that
the tracing directory is mounted with debugfs, the tracing directory
is still created under debugfs and tracefs is automatically mounted
there.

Finally, when tracefs is enabled, a new directory called
/sys/kernel/tracing is created. This will be the new location where
system admins may mount tracefs if they are not using debugfs.
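
A tool that wants to know whether a given directory is really backed by
tracefs could compare the statfs(2) file system type against the
TRACEFS_MAGIC value added by patch 1 (0x74726163). The following is a
sketch only; is_tracefs() is a hypothetical helper name, and the magic is
defined locally rather than taken from a header.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/vfs.h>

#ifndef TRACEFS_MAGIC
#define TRACEFS_MAGIC 0x74726163	/* value from patch 1, magic.h hunk */
#endif

/*
 * Returns 1 if path lives on a tracefs mount, 0 if it lives on some
 * other file system, or a negative errno if statfs(2) fails.
 */
int is_tracefs(const char *path)
{
	struct statfs st;

	assert(path != NULL);
	if (statfs(path, &st))
		return -errno;
	return (unsigned long)st.f_type == (unsigned long)TRACEFS_MAGIC;
}
```

For example, is_tracefs("/sys/kernel/tracing") would be expected to
return 1 once tracefs is mounted there, and 0 while it is still an empty
sysfs directory.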


Steven Rostedt (Red Hat) (5):
      tracefs: Add new tracefs file system
      tracing: Convert the tracing facility over to use tracefs
      tracing: Automatically mount tracefs on debugfs/tracing
      tracing: Have mkdir and rmdir be part of tracefs
      tracefs: Add directory /sys/kernel/tracing

----
 fs/Makefile                          |   1 +
 fs/tracefs/Makefile                  |   4 +
 fs/tracefs/inode.c                   | 649 +++++++++++++++++++++++++++++++++++
 include/uapi/linux/magic.h           |   2 +
 kernel/trace/ftrace.c                |  22 +-
 kernel/trace/trace.c                 | 178 +++++-----
 kernel/trace/trace.h                 |   2 +-
 kernel/trace/trace_events.c          |  32 +-
 kernel/trace/trace_functions_graph.c |   7 +-
 kernel/trace/trace_kprobe.c          |  10 +-
 kernel/trace/trace_probe.h           |   2 +-
 kernel/trace/trace_stat.c            |  10 +-
 scripts/tags.sh                      |   2 +-
 13 files changed, 789 insertions(+), 132 deletions(-)


* [RFC][PATCH 1/5] tracefs: Add new tracefs file system
  2015-01-21 17:19 [RFC][PATCH 0/5] tracing: Add new file system tracefs Steven Rostedt
@ 2015-01-21 17:19 ` Steven Rostedt
  2015-01-21 18:30   ` Steven Rostedt
  2015-01-21 17:19 ` [RFC][PATCH 2/5] tracing: Convert the tracing facility over to use tracefs Steven Rostedt
                   ` (5 subsequent siblings)
  6 siblings, 1 reply; 26+ messages in thread
From: Steven Rostedt @ 2015-01-21 17:19 UTC (permalink / raw)
  To: linux-kernel; +Cc: Al Viro, Greg Kroah-Hartman, Ingo Molnar, Andrew Morton

[-- Attachment #1: 0001-tracefs-Add-new-tracefs-file-system.patch --]
[-- Type: text/plain, Size: 16972 bytes --]

From: "Steven Rostedt (Red Hat)" <rostedt@goodmis.org>

Add a separate file system to handle the tracing directory. Currently it
is part of debugfs, but that is starting to show its limits.

One thing is that in order to access the tracing infrastructure, you need
to mount debugfs. As that includes debugging from all sorts of sub systems
in the kernel, it is not considered advisable to mount such an all encompassing
debugging system.

Having the tracing system in its own file system gives access to the tracing
sub system without needing to include all other systems.

Another problem with tracing using the debugfs system is that the instances
use mkdir to create sub buffers. debugfs does not support mkdir from userspace
so to implement it, special hacks were used. By controlling the file system
that the tracing infrastructure uses, this can be properly done without hacks.

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
---
 fs/Makefile                |   1 +
 fs/tracefs/Makefile        |   4 +
 fs/tracefs/inode.c         | 561 +++++++++++++++++++++++++++++++++++++++++++++
 include/uapi/linux/magic.h |   2 +
 4 files changed, 568 insertions(+)
 create mode 100644 fs/tracefs/Makefile
 create mode 100644 fs/tracefs/inode.c

diff --git a/fs/Makefile b/fs/Makefile
index bedff48e8fdc..d244b8d973ac 100644
--- a/fs/Makefile
+++ b/fs/Makefile
@@ -118,6 +118,7 @@ obj-$(CONFIG_HOSTFS)		+= hostfs/
 obj-$(CONFIG_HPPFS)		+= hppfs/
 obj-$(CONFIG_CACHEFILES)	+= cachefiles/
 obj-$(CONFIG_DEBUG_FS)		+= debugfs/
+obj-$(CONFIG_TRACING)		+= tracefs/
 obj-$(CONFIG_OCFS2_FS)		+= ocfs2/
 obj-$(CONFIG_BTRFS_FS)		+= btrfs/
 obj-$(CONFIG_GFS2_FS)           += gfs2/
diff --git a/fs/tracefs/Makefile b/fs/tracefs/Makefile
new file mode 100644
index 000000000000..82fa35b656c4
--- /dev/null
+++ b/fs/tracefs/Makefile
@@ -0,0 +1,4 @@
+tracefs-objs	:= inode.o
+
+obj-$(CONFIG_TRACING)	+= tracefs.o
+
diff --git a/fs/tracefs/inode.c b/fs/tracefs/inode.c
new file mode 100644
index 000000000000..d243f670d461
--- /dev/null
+++ b/fs/tracefs/inode.c
@@ -0,0 +1,561 @@
+/*
+ *  inode.c - part of tracefs, a pseudo file system for activating tracing
+ *
+ * Based on debugfs by: Greg Kroah-Hartman <greg@kroah.com>
+ *
+ *  Copyright (C) 2014 Red Hat Inc, author: Steven Rostedt <srostedt@redhat.com>
+ *
+ *	This program is free software; you can redistribute it and/or
+ *	modify it under the terms of the GNU General Public License version
+ *	2 as published by the Free Software Foundation.
+ *
+ * tracefs is the file system that is used by the tracing infrastructure.
+ *
+ */
+
+#include <linux/module.h>
+#include <linux/fs.h>
+#include <linux/mount.h>
+#include <linux/namei.h>
+#include <linux/tracefs.h>
+#include <linux/fsnotify.h>
+#include <linux/seq_file.h>
+#include <linux/parser.h>
+#include <linux/magic.h>
+#include <linux/slab.h>
+
+#define TRACEFS_DEFAULT_MODE	0700
+
+static struct vfsmount *tracefs_mount;
+static int tracefs_mount_count;
+static bool tracefs_registered;
+
+static ssize_t default_read_file(struct file *file, char __user *buf,
+				 size_t count, loff_t *ppos)
+{
+	return 0;
+}
+
+static ssize_t default_write_file(struct file *file, const char __user *buf,
+				   size_t count, loff_t *ppos)
+{
+	return count;
+}
+
+const struct file_operations tracefs_file_operations = {
+	.read =		default_read_file,
+	.write =	default_write_file,
+	.open =		simple_open,
+	.llseek =	noop_llseek,
+};
+
+static struct inode *tracefs_get_inode(struct super_block *sb, umode_t mode, dev_t dev,
+				      void *data, const struct file_operations *fops)
+
+{
+	struct inode *inode = new_inode(sb);
+
+	if (inode) {
+		inode->i_ino = get_next_ino();
+		inode->i_mode = mode;
+		inode->i_atime = inode->i_mtime = inode->i_ctime = CURRENT_TIME;
+		switch (mode & S_IFMT) {
+		default:
+			init_special_inode(inode, mode, dev);
+			break;
+		case S_IFREG:
+			inode->i_fop = fops ? fops : &tracefs_file_operations;
+			inode->i_private = data;
+			break;
+		case S_IFDIR:
+			inode->i_op = &simple_dir_inode_operations;
+			inode->i_fop = &simple_dir_operations;
+
+			/* directory inodes start off with i_nlink == 2
+			 * (for "." entry) */
+			inc_nlink(inode);
+			break;
+		}
+	}
+	return inode;
+}
+
+/* SMP-safe */
+static int tracefs_mknod(struct inode *dir, struct dentry *dentry,
+			 umode_t mode, dev_t dev, void *data,
+			 const struct file_operations *fops)
+{
+	struct inode *inode;
+	int error = -EPERM;
+
+	if (dentry->d_inode)
+		return -EEXIST;
+
+	inode = tracefs_get_inode(dir->i_sb, mode, dev, data, fops);
+	if (inode) {
+		d_instantiate(dentry, inode);
+		dget(dentry);
+		error = 0;
+	}
+	return error;
+}
+
+static int tracefs_mkdir(struct inode *dir, struct dentry *dentry, umode_t mode)
+{
+	int res;
+
+	mode = (mode & (S_IRWXUGO | S_ISVTX)) | S_IFDIR;
+	res = tracefs_mknod(dir, dentry, mode, 0, NULL, NULL);
+	if (!res) {
+		inc_nlink(dir);
+		fsnotify_mkdir(dir, dentry);
+	}
+	return res;
+}
+
+static int tracefs_create(struct inode *dir, struct dentry *dentry, umode_t mode,
+			  void *data, const struct file_operations *fops)
+{
+	int res;
+
+	mode = (mode & S_IALLUGO) | S_IFREG;
+	res = tracefs_mknod(dir, dentry, mode, 0, data, fops);
+	if (!res)
+		fsnotify_create(dir, dentry);
+	return res;
+}
+
+struct tracefs_mount_opts {
+	kuid_t uid;
+	kgid_t gid;
+	umode_t mode;
+};
+
+enum {
+	Opt_uid,
+	Opt_gid,
+	Opt_mode,
+	Opt_err
+};
+
+static const match_table_t tokens = {
+	{Opt_uid, "uid=%u"},
+	{Opt_gid, "gid=%u"},
+	{Opt_mode, "mode=%o"},
+	{Opt_err, NULL}
+};
+
+struct tracefs_fs_info {
+	struct tracefs_mount_opts mount_opts;
+};
+
+static int tracefs_parse_options(char *data, struct tracefs_mount_opts *opts)
+{
+	substring_t args[MAX_OPT_ARGS];
+	int option;
+	int token;
+	kuid_t uid;
+	kgid_t gid;
+	char *p;
+
+	opts->mode = TRACEFS_DEFAULT_MODE;
+
+	while ((p = strsep(&data, ",")) != NULL) {
+		if (!*p)
+			continue;
+
+		token = match_token(p, tokens, args);
+		switch (token) {
+		case Opt_uid:
+			if (match_int(&args[0], &option))
+				return -EINVAL;
+			uid = make_kuid(current_user_ns(), option);
+			if (!uid_valid(uid))
+				return -EINVAL;
+			opts->uid = uid;
+			break;
+		case Opt_gid:
+			if (match_int(&args[0], &option))
+				return -EINVAL;
+			gid = make_kgid(current_user_ns(), option);
+			if (!gid_valid(gid))
+				return -EINVAL;
+			opts->gid = gid;
+			break;
+		case Opt_mode:
+			if (match_octal(&args[0], &option))
+				return -EINVAL;
+			opts->mode = option & S_IALLUGO;
+			break;
+		/*
+		 * We might like to report bad mount options here;
+		 * but traditionally tracefs has ignored all mount options
+		 */
+		}
+	}
+
+	return 0;
+}
+
+static int tracefs_apply_options(struct super_block *sb)
+{
+	struct tracefs_fs_info *fsi = sb->s_fs_info;
+	struct inode *inode = sb->s_root->d_inode;
+	struct tracefs_mount_opts *opts = &fsi->mount_opts;
+
+	inode->i_mode &= ~S_IALLUGO;
+	inode->i_mode |= opts->mode;
+
+	inode->i_uid = opts->uid;
+	inode->i_gid = opts->gid;
+
+	return 0;
+}
+
+static int tracefs_remount(struct super_block *sb, int *flags, char *data)
+{
+	int err;
+	struct tracefs_fs_info *fsi = sb->s_fs_info;
+
+	sync_filesystem(sb);
+	err = tracefs_parse_options(data, &fsi->mount_opts);
+	if (err)
+		goto fail;
+
+	tracefs_apply_options(sb);
+
+fail:
+	return err;
+}
+
+static int tracefs_show_options(struct seq_file *m, struct dentry *root)
+{
+	struct tracefs_fs_info *fsi = root->d_sb->s_fs_info;
+	struct tracefs_mount_opts *opts = &fsi->mount_opts;
+
+	if (!uid_eq(opts->uid, GLOBAL_ROOT_UID))
+		seq_printf(m, ",uid=%u",
+			   from_kuid_munged(&init_user_ns, opts->uid));
+	if (!gid_eq(opts->gid, GLOBAL_ROOT_GID))
+		seq_printf(m, ",gid=%u",
+			   from_kgid_munged(&init_user_ns, opts->gid));
+	if (opts->mode != TRACEFS_DEFAULT_MODE)
+		seq_printf(m, ",mode=%o", opts->mode);
+
+	return 0;
+}
+
+static const struct super_operations tracefs_super_operations = {
+	.statfs		= simple_statfs,
+	.remount_fs	= tracefs_remount,
+	.show_options	= tracefs_show_options,
+};
+
+static int trace_fill_super(struct super_block *sb, void *data, int silent)
+{
+	static struct tree_descr trace_files[] = {{""}};
+	struct tracefs_fs_info *fsi;
+	int err;
+
+	save_mount_options(sb, data);
+
+	fsi = kzalloc(sizeof(struct tracefs_fs_info), GFP_KERNEL);
+	sb->s_fs_info = fsi;
+	if (!fsi) {
+		err = -ENOMEM;
+		goto fail;
+	}
+
+	err = tracefs_parse_options(data, &fsi->mount_opts);
+	if (err)
+		goto fail;
+
+	err  =  simple_fill_super(sb, TRACEFS_MAGIC, trace_files);
+	if (err)
+		goto fail;
+
+	sb->s_op = &tracefs_super_operations;
+
+	tracefs_apply_options(sb);
+
+	return 0;
+
+fail:
+	kfree(fsi);
+	sb->s_fs_info = NULL;
+	return err;
+}
+
+static struct dentry *trace_mount(struct file_system_type *fs_type,
+			int flags, const char *dev_name,
+			void *data)
+{
+	return mount_single(fs_type, flags, data, trace_fill_super);
+}
+
+static struct file_system_type trace_fs_type = {
+	.owner =	THIS_MODULE,
+	.name =		"tracefs",
+	.mount =	trace_mount,
+	.kill_sb =	kill_litter_super,
+};
+MODULE_ALIAS_FS("tracefs");
+
+static struct dentry *__create_file(const char *name, umode_t mode,
+				    struct dentry *parent, void *data,
+				    const struct file_operations *fops)
+{
+	struct dentry *dentry = NULL;
+	int error;
+
+	pr_debug("tracefs: creating file '%s'\n",name);
+
+	error = simple_pin_fs(&trace_fs_type, &tracefs_mount,
+			      &tracefs_mount_count);
+	if (error)
+		goto exit;
+
+	/* If the parent is not specified, we create it in the root.
+	 * We need the root dentry to do this, which is in the super
+	 * block. A pointer to that is in the struct vfsmount that we
+	 * have around.
+	 */
+	if (!parent)
+		parent = tracefs_mount->mnt_root;
+
+	mutex_lock(&parent->d_inode->i_mutex);
+	dentry = lookup_one_len(name, parent, strlen(name));
+	if (!IS_ERR(dentry)) {
+		switch (mode & S_IFMT) {
+		case S_IFDIR:
+			error = tracefs_mkdir(parent->d_inode, dentry, mode);
+
+			break;
+		default:
+			error = tracefs_create(parent->d_inode, dentry, mode,
+					       data, fops);
+			break;
+		}
+		dput(dentry);
+	} else
+		error = PTR_ERR(dentry);
+	mutex_unlock(&parent->d_inode->i_mutex);
+
+	if (error) {
+		dentry = NULL;
+		simple_release_fs(&tracefs_mount, &tracefs_mount_count);
+	}
+exit:
+	return dentry;
+}
+
+/**
+ * tracefs_create_file - create a file in the tracefs filesystem
+ * @name: a pointer to a string containing the name of the file to create.
+ * @mode: the permission that the file should have.
+ * @parent: a pointer to the parent dentry for this file.  This should be a
+ *          directory dentry if set.  If this parameter is NULL, then the
+ *          file will be created in the root of the tracefs filesystem.
+ * @data: a pointer to something that the caller will want to get to later
+ *        on.  The inode.i_private pointer will point to this value on
+ *        the open() call.
+ * @fops: a pointer to a struct file_operations that should be used for
+ *        this file.
+ *
+ * This is the basic "create a file" function for tracefs.  It allows for a
+ * wide range of flexibility in creating a file, or a directory (if you want
+ * to create a directory, the tracefs_create_dir() function is
+ * recommended to be used instead.)
+ *
+ * This function will return a pointer to a dentry if it succeeds.  This
+ * pointer must be passed to the tracefs_remove() function when the file is
+ * to be removed (no automatic cleanup happens if your module is unloaded,
+ * you are responsible here.)  If an error occurs, %NULL will be returned.
+ *
+ * If tracefs is not enabled in the kernel, the value -%ENODEV will be
+ * returned.
+ */
+struct dentry *tracefs_create_file(const char *name, umode_t mode,
+				   struct dentry *parent, void *data,
+				   const struct file_operations *fops)
+{
+	switch (mode & S_IFMT) {
+	case S_IFREG:
+	case 0:
+		break;
+	default:
+		BUG();
+	}
+
+	return __create_file(name, mode, parent, data, fops);
+}
+
+/**
+ * tracefs_create_dir - create a directory in the tracefs filesystem
+ * @name: a pointer to a string containing the name of the directory to
+ *        create.
+ * @parent: a pointer to the parent dentry for this file.  This should be a
+ *          directory dentry if set.  If this parameter is NULL, then the
+ *          directory will be created in the root of the tracefs filesystem.
+ *
+ * This function creates a directory in tracefs with the given name.
+ *
+ * This function will return a pointer to a dentry if it succeeds.  This
+ * pointer must be passed to the tracefs_remove() function when the file is
+ * to be removed. If an error occurs, %NULL will be returned.
+ *
+ * If tracing is not enabled in the kernel, the value -%ENODEV will be
+ * returned.
+ */
+struct dentry *tracefs_create_dir(const char *name, struct dentry *parent)
+{
+	return __create_file(name, S_IFDIR | S_IRWXU | S_IRUGO | S_IXUGO,
+				   parent, NULL, NULL);
+}
+
+static inline int tracefs_positive(struct dentry *dentry)
+{
+	return dentry->d_inode && !d_unhashed(dentry);
+}
+
+static int __tracefs_remove(struct dentry *dentry, struct dentry *parent)
+{
+	int ret = 0;
+
+	if (tracefs_positive(dentry)) {
+		if (dentry->d_inode) {
+			dget(dentry);
+			switch (dentry->d_inode->i_mode & S_IFMT) {
+			case S_IFDIR:
+				ret = simple_rmdir(parent->d_inode, dentry);
+				break;
+			default:
+				simple_unlink(parent->d_inode, dentry);
+				break;
+			}
+			if (!ret)
+				d_delete(dentry);
+			dput(dentry);
+		}
+	}
+	return ret;
+}
+
+/**
+ * tracefs_remove - removes a file or directory from the tracefs filesystem
+ * @dentry: a pointer to the dentry of the file or directory to be
+ *          removed.
+ *
+ * This function removes a file or directory in tracefs that was previously
+ * created with a call to another tracefs function (like
+ * tracefs_create_file() or variants thereof.)
+ */
+void tracefs_remove(struct dentry *dentry)
+{
+	struct dentry *parent;
+	int ret;
+
+	if (IS_ERR_OR_NULL(dentry))
+		return;
+
+	parent = dentry->d_parent;
+	if (!parent || !parent->d_inode)
+		return;
+
+	mutex_lock(&parent->d_inode->i_mutex);
+	ret = __tracefs_remove(dentry, parent);
+	mutex_unlock(&parent->d_inode->i_mutex);
+	if (!ret)
+		simple_release_fs(&tracefs_mount, &tracefs_mount_count);
+}
+
+/**
+ * tracefs_remove_recursive - recursively removes a directory
+ * @dentry: a pointer to the dentry of the directory to be removed.
+ *
+ * This function recursively removes a directory tree in tracefs that
+ * was previously created with a call to another tracefs function
+ * (like tracefs_create_file() or variants thereof.)
+ */
+void tracefs_remove_recursive(struct dentry *dentry)
+{
+	struct dentry *child, *parent;
+
+	if (IS_ERR_OR_NULL(dentry))
+		return;
+
+	parent = dentry->d_parent;
+	if (!parent || !parent->d_inode)
+		return;
+
+	parent = dentry;
+ down:
+	mutex_lock(&parent->d_inode->i_mutex);
+ loop:
+	/*
+	 * The parent->d_subdirs is protected by the d_lock. Outside that
+	 * lock, the child can be unlinked and set to be freed which can
+	 * use the d_u.d_child as the rcu head and corrupt this list.
+	 */
+	spin_lock(&parent->d_lock);
+	list_for_each_entry(child, &parent->d_subdirs, d_child) {
+		if (!tracefs_positive(child))
+			continue;
+
+		/* perhaps simple_empty(child) makes more sense */
+		if (!list_empty(&child->d_subdirs)) {
+			spin_unlock(&parent->d_lock);
+			mutex_unlock(&parent->d_inode->i_mutex);
+			parent = child;
+			goto down;
+		}
+
+		spin_unlock(&parent->d_lock);
+
+		if (!__tracefs_remove(child, parent))
+			simple_release_fs(&tracefs_mount, &tracefs_mount_count);
+
+		/*
+		 * The parent->d_lock protects against child from unlinking
+		 * from d_subdirs. When releasing the parent->d_lock we can
+		 * no longer trust that the next pointer is valid.
+		 * Restart the loop. We'll skip this one with the
+		 * tracefs_positive() check.
+		 */
+		goto loop;
+	}
+	spin_unlock(&parent->d_lock);
+
+	mutex_unlock(&parent->d_inode->i_mutex);
+	child = parent;
+	parent = parent->d_parent;
+	mutex_lock(&parent->d_inode->i_mutex);
+
+	if (child != dentry)
+		/* go up */
+		goto loop;
+
+	if (!__tracefs_remove(child, parent))
+		simple_release_fs(&tracefs_mount, &tracefs_mount_count);
+	mutex_unlock(&parent->d_inode->i_mutex);
+}
+
+/**
+ * tracefs_initialized - Tells whether tracefs has been registered
+ */
+bool tracefs_initialized(void)
+{
+	return tracefs_registered;
+}
+
+static int __init tracefs_init(void)
+{
+	int retval;
+
+	retval = register_filesystem(&trace_fs_type);
+	if (!retval)
+		tracefs_registered = true;
+
+	return retval;
+}
+core_initcall(tracefs_init);
diff --git a/include/uapi/linux/magic.h b/include/uapi/linux/magic.h
index 7d664ea85ebd..7b1425a6b370 100644
--- a/include/uapi/linux/magic.h
+++ b/include/uapi/linux/magic.h
@@ -58,6 +58,8 @@
 
 #define STACK_END_MAGIC		0x57AC6E9D
 
+#define TRACEFS_MAGIC          0x74726163
+
 #define V9FS_MAGIC		0x01021997
 
 #define BDEVFS_MAGIC            0x62646576
-- 
2.1.4
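
The mount options accepted by the patch above are uid=, gid= and mode=
(octal), as listed in its match table. A user-space caller might build
the option string and mount tracefs along these lines. This is a sketch
under assumptions: both helper names are hypothetical, the "tracefs"
source string passed to mount(2) is an arbitrary label, and mounting
requires CAP_SYS_ADMIN.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <sys/mount.h>
#include <sys/types.h>

/*
 * Format "uid=<u>,gid=<u>,mode=<o>" into buf, masking mode the same way
 * the patch does (S_IALLUGO, i.e. 07777).
 * Returns the string length, or -1 if buf is too small.
 */
int tracefs_format_opts(char *buf, size_t len,
			uid_t uid, gid_t gid, mode_t mode)
{
	int n;

	assert(buf != NULL);
	n = snprintf(buf, len, "uid=%u,gid=%u,mode=%o",
		     (unsigned int)uid, (unsigned int)gid,
		     (unsigned int)(mode & 07777));
	return (n < 0 || (size_t)n >= len) ? -1 : n;
}

/* Mount tracefs on target with the given options. Returns 0 or -errno. */
int tracefs_mount_at(const char *target, const char *opts)
{
	return mount("tracefs", target, "tracefs", 0, opts) ? -errno : 0;
}
```

With uid 1000, gid 1000 and mode 0750, the formatter produces
"uid=1000,gid=1000,mode=750", which can then be handed to
tracefs_mount_at("/sys/kernel/tracing", buf).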




* [RFC][PATCH 2/5] tracing: Convert the tracing facility over to use tracefs
  2015-01-21 17:19 [RFC][PATCH 0/5] tracing: Add new file system tracefs Steven Rostedt
  2015-01-21 17:19 ` [RFC][PATCH 1/5] tracefs: Add new tracefs file system Steven Rostedt
@ 2015-01-21 17:19 ` Steven Rostedt
  2015-01-21 17:32   ` Steven Rostedt
  2015-01-21 17:19 ` [RFC][PATCH 3/5] tracing: Automatically mount tracefs on debugfs/tracing Steven Rostedt
                   ` (4 subsequent siblings)
  6 siblings, 1 reply; 26+ messages in thread
From: Steven Rostedt @ 2015-01-21 17:19 UTC (permalink / raw)
  To: linux-kernel; +Cc: Al Viro, Greg Kroah-Hartman, Ingo Molnar, Andrew Morton

[-- Attachment #1: 0002-tracing-Convert-the-tracing-facility-over-to-use-tra.patch --]
[-- Type: text/plain, Size: 21031 bytes --]

From: "Steven Rostedt (Red Hat)" <rostedt@goodmis.org>

debugfs was fine for the tracing facility as a quick way to get
an interface. Now that tracing has matured, it should separate itself
from debugfs such that it can be mounted separately without needing
to mount all of debugfs with it. That is, users resist using tracing
because it requires mounting debugfs. Having tracing have its own file
system lets users get the features of tracing without needing to bring
in the rest of the kernel's debug infrastructure.

Another reason for tracefs is that debugfs does not support mkdir.
Currently, to create instances, one does a mkdir in the tracing/instances
directory. This is implemented via a hack that forces debugfs to do something
it was not intended to do. By converting over to tracefs, this hack can
be removed and mkdir can be properly implemented. This patch does not
address this yet, but it lays the groundwork for that to be done.

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
---
 kernel/trace/ftrace.c                | 22 ++++++-------
 kernel/trace/trace.c                 | 63 ++++++++++++++++++++++--------------
 kernel/trace/trace.h                 |  2 +-
 kernel/trace/trace_events.c          | 32 +++++++++---------
 kernel/trace/trace_functions_graph.c |  7 ++--
 kernel/trace/trace_kprobe.c          | 10 +++---
 kernel/trace/trace_probe.h           |  2 +-
 kernel/trace/trace_stat.c            | 10 +++---
 scripts/tags.sh                      |  2 +-
 9 files changed, 81 insertions(+), 69 deletions(-)

diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c
index 80c9d34540dd..e3596de88fc1 100644
--- a/kernel/trace/ftrace.c
+++ b/kernel/trace/ftrace.c
@@ -18,7 +18,7 @@
 #include <linux/kallsyms.h>
 #include <linux/seq_file.h>
 #include <linux/suspend.h>
-#include <linux/debugfs.h>
+#include <linux/tracefs.h>
 #include <linux/hardirq.h>
 #include <linux/kthread.h>
 #include <linux/uaccess.h>
@@ -1008,7 +1008,7 @@ static struct tracer_stat function_stats __initdata = {
 	.stat_show	= function_stat_show
 };
 
-static __init void ftrace_profile_debugfs(struct dentry *d_tracer)
+static __init void ftrace_profile_tracefs(struct dentry *d_tracer)
 {
 	struct ftrace_profile_stat *stat;
 	struct dentry *entry;
@@ -1044,15 +1044,15 @@ static __init void ftrace_profile_debugfs(struct dentry *d_tracer)
 		}
 	}
 
-	entry = debugfs_create_file("function_profile_enabled", 0644,
+	entry = tracefs_create_file("function_profile_enabled", 0644,
 				    d_tracer, NULL, &ftrace_profile_fops);
 	if (!entry)
-		pr_warning("Could not create debugfs "
+		pr_warning("Could not create tracefs "
 			   "'function_profile_enabled' entry\n");
 }
 
 #else /* CONFIG_FUNCTION_PROFILER */
-static __init void ftrace_profile_debugfs(struct dentry *d_tracer)
+static __init void ftrace_profile_tracefs(struct dentry *d_tracer)
 {
 }
 #endif /* CONFIG_FUNCTION_PROFILER */
@@ -4653,7 +4653,7 @@ void ftrace_destroy_filter_files(struct ftrace_ops *ops)
 	mutex_unlock(&ftrace_lock);
 }
 
-static __init int ftrace_init_dyn_debugfs(struct dentry *d_tracer)
+static __init int ftrace_init_dyn_tracefs(struct dentry *d_tracer)
 {
 
 	trace_create_file("available_filter_functions", 0444,
@@ -4961,7 +4961,7 @@ static int __init ftrace_nodyn_init(void)
 }
 core_initcall(ftrace_nodyn_init);
 
-static inline int ftrace_init_dyn_debugfs(struct dentry *d_tracer) { return 0; }
+static inline int ftrace_init_dyn_tracefs(struct dentry *d_tracer) { return 0; }
 static inline void ftrace_startup_enable(int command) { }
 static inline void ftrace_startup_all(int command) { }
 /* Keep as macros so we do not need to define the commands */
@@ -5414,7 +5414,7 @@ static const struct file_operations ftrace_pid_fops = {
 	.release	= ftrace_pid_release,
 };
 
-static __init int ftrace_init_debugfs(void)
+static __init int ftrace_init_tracefs(void)
 {
 	struct dentry *d_tracer;
 
@@ -5422,16 +5422,16 @@ static __init int ftrace_init_debugfs(void)
 	if (IS_ERR(d_tracer))
 		return 0;
 
-	ftrace_init_dyn_debugfs(d_tracer);
+	ftrace_init_dyn_tracefs(d_tracer);
 
 	trace_create_file("set_ftrace_pid", 0644, d_tracer,
 			    NULL, &ftrace_pid_fops);
 
-	ftrace_profile_debugfs(d_tracer);
+	ftrace_profile_tracefs(d_tracer);
 
 	return 0;
 }
-fs_initcall(ftrace_init_debugfs);
+fs_initcall(ftrace_init_tracefs);
 
 /**
  * ftrace_kill - kill ftrace
diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
index acd27555dc5b..a51a00317abe 100644
--- a/kernel/trace/trace.c
+++ b/kernel/trace/trace.c
@@ -20,6 +20,7 @@
 #include <linux/notifier.h>
 #include <linux/irqflags.h>
 #include <linux/debugfs.h>
+#include <linux/tracefs.h>
 #include <linux/pagemap.h>
 #include <linux/hardirq.h>
 #include <linux/linkage.h>
@@ -5814,19 +5815,31 @@ static __init int register_snapshot_cmd(void)
 static inline __init int register_snapshot_cmd(void) { return 0; }
 #endif /* defined(CONFIG_TRACER_SNAPSHOT) && defined(CONFIG_DYNAMIC_FTRACE) */
 
+#define TRACE_TOP_DIR_ENTRY		((struct dentry *)1)
+
 struct dentry *tracing_init_dentry_tr(struct trace_array *tr)
 {
+	/* Top entry does not have a descriptor */
+	if (tr->dir == TRACE_TOP_DIR_ENTRY)
+		return NULL;
+
+	/* All sub buffers do */
 	if (tr->dir)
 		return tr->dir;
 
 	if (!debugfs_initialized())
 		return ERR_PTR(-ENODEV);
 
-	if (tr->flags & TRACE_ARRAY_FL_GLOBAL)
+	if (tr->flags & TRACE_ARRAY_FL_GLOBAL) {
 		tr->dir = debugfs_create_dir("tracing", NULL);
+		tr->dir = TRACE_TOP_DIR_ENTRY;
+		return NULL;
+	}
 
-	if (!tr->dir)
+	if (!tr->dir) {
 		pr_warn_once("Could not create debugfs directory 'tracing'\n");
+		return ERR_PTR(-ENOMEM);
+	}
 
 	return tr->dir;
 }
@@ -5847,10 +5860,10 @@ static struct dentry *tracing_dentry_percpu(struct trace_array *tr, int cpu)
 	if (IS_ERR(d_tracer))
 		return NULL;
 
-	tr->percpu_dir = debugfs_create_dir("per_cpu", d_tracer);
+	tr->percpu_dir = tracefs_create_dir("per_cpu", d_tracer);
 
 	WARN_ONCE(!tr->percpu_dir,
-		  "Could not create debugfs directory 'per_cpu/%d'\n", cpu);
+		  "Could not create tracefs directory 'per_cpu/%d'\n", cpu);
 
 	return tr->percpu_dir;
 }
@@ -5867,7 +5880,7 @@ trace_create_cpu_file(const char *name, umode_t mode, struct dentry *parent,
 }
 
 static void
-tracing_init_debugfs_percpu(struct trace_array *tr, long cpu)
+tracing_init_tracefs_percpu(struct trace_array *tr, long cpu)
 {
 	struct dentry *d_percpu = tracing_dentry_percpu(tr, cpu);
 	struct dentry *d_cpu;
@@ -5877,9 +5890,9 @@ tracing_init_debugfs_percpu(struct trace_array *tr, long cpu)
 		return;
 
 	snprintf(cpu_dir, 30, "cpu%ld", cpu);
-	d_cpu = debugfs_create_dir(cpu_dir, d_percpu);
+	d_cpu = tracefs_create_dir(cpu_dir, d_percpu);
 	if (!d_cpu) {
-		pr_warning("Could not create debugfs '%s' entry\n", cpu_dir);
+		pr_warning("Could not create tracefs '%s' entry\n", cpu_dir);
 		return;
 	}
 
@@ -6031,9 +6044,9 @@ struct dentry *trace_create_file(const char *name,
 {
 	struct dentry *ret;
 
-	ret = debugfs_create_file(name, mode, parent, data, fops);
+	ret = tracefs_create_file(name, mode, parent, data, fops);
 	if (!ret)
-		pr_warning("Could not create debugfs '%s' entry\n", name);
+		pr_warning("Could not create tracefs '%s' entry\n", name);
 
 	return ret;
 }
@@ -6050,9 +6063,9 @@ static struct dentry *trace_options_init_dentry(struct trace_array *tr)
 	if (IS_ERR(d_tracer))
 		return NULL;
 
-	tr->options = debugfs_create_dir("options", d_tracer);
+	tr->options = tracefs_create_dir("options", d_tracer);
 	if (!tr->options) {
-		pr_warning("Could not create debugfs directory 'options'\n");
+		pr_warning("Could not create tracefs directory 'options'\n");
 		return NULL;
 	}
 
@@ -6121,7 +6134,7 @@ destroy_trace_option_files(struct trace_option_dentry *topts)
 		return;
 
 	for (cnt = 0; topts[cnt].opt; cnt++)
-		debugfs_remove(topts[cnt].entry);
+		tracefs_remove(topts[cnt].entry);
 
 	kfree(topts);
 }
@@ -6210,7 +6223,7 @@ static const struct file_operations rb_simple_fops = {
 struct dentry *trace_instance_dir;
 
 static void
-init_tracer_debugfs(struct trace_array *tr, struct dentry *d_tracer);
+init_tracer_tracefs(struct trace_array *tr, struct dentry *d_tracer);
 
 static int
 allocate_trace_buffer(struct trace_array *tr, struct trace_buffer *buf, int size)
@@ -6326,17 +6339,17 @@ static int new_instance_create(const char *name)
 	if (allocate_trace_buffers(tr, trace_buf_size) < 0)
 		goto out_free_tr;
 
-	tr->dir = debugfs_create_dir(name, trace_instance_dir);
+	tr->dir = tracefs_create_dir(name, trace_instance_dir);
 	if (!tr->dir)
 		goto out_free_tr;
 
 	ret = event_trace_add_tracer(tr->dir, tr);
 	if (ret) {
-		debugfs_remove_recursive(tr->dir);
+		tracefs_remove_recursive(tr->dir);
 		goto out_free_tr;
 	}
 
-	init_tracer_debugfs(tr, tr->dir);
+	init_tracer_tracefs(tr, tr->dir);
 
 	list_add(&tr->list, &ftrace_trace_arrays);
 
@@ -6384,7 +6397,7 @@ static int instance_delete(const char *name)
 	tracing_set_nop(tr);
 	event_trace_del_tracer(tr);
 	ftrace_destroy_function_files(tr);
-	debugfs_remove_recursive(tr->dir);
+	tracefs_remove_recursive(tr->dir);
 	free_trace_buffers(tr);
 
 	kfree(tr->name);
@@ -6409,7 +6422,7 @@ static int instance_mkdir (struct inode *inode, struct dentry *dentry, umode_t m
 		return -ENOENT;
 
 	/*
-	 * The inode mutex is locked, but debugfs_create_dir() will also
+	 * The inode mutex is locked, but tracefs_create_dir() will also
 	 * take the mutex. As the instances directory can not be destroyed
 	 * or changed in any other way, it is safe to unlock it, and
 	 * let the dentry try. If two users try to make the same dir at
@@ -6439,7 +6452,7 @@ static int instance_rmdir(struct inode *inode, struct dentry *dentry)
 	mutex_unlock(&dentry->d_inode->i_mutex);
 
 	/*
-	 * The inode mutex is locked, but debugfs_create_dir() will also
+	 * The inode mutex is locked, but tracefs_create_dir() will also
 	 * take the mutex. As the instances directory can not be destroyed
 	 * or changed in any other way, it is safe to unlock it, and
 	 * let the dentry try. If two users try to make the same dir at
@@ -6464,7 +6477,7 @@ static const struct inode_operations instance_dir_inode_operations = {
 
 static __init void create_trace_instances(struct dentry *d_tracer)
 {
-	trace_instance_dir = debugfs_create_dir("instances", d_tracer);
+	trace_instance_dir = tracefs_create_dir("instances", d_tracer);
 	if (WARN_ON(!trace_instance_dir))
 		return;
 
@@ -6473,7 +6486,7 @@ static __init void create_trace_instances(struct dentry *d_tracer)
 }
 
 static void
-init_tracer_debugfs(struct trace_array *tr, struct dentry *d_tracer)
+init_tracer_tracefs(struct trace_array *tr, struct dentry *d_tracer)
 {
 	int cpu;
 
@@ -6527,11 +6540,11 @@ init_tracer_debugfs(struct trace_array *tr, struct dentry *d_tracer)
 #endif
 
 	for_each_tracing_cpu(cpu)
-		tracing_init_debugfs_percpu(tr, cpu);
+		tracing_init_tracefs_percpu(tr, cpu);
 
 }
 
-static __init int tracer_init_debugfs(void)
+static __init int tracer_init_tracefs(void)
 {
 	struct dentry *d_tracer;
 
@@ -6541,7 +6554,7 @@ static __init int tracer_init_debugfs(void)
 	if (IS_ERR(d_tracer))
 		return 0;
 
-	init_tracer_debugfs(&global_trace, d_tracer);
+	init_tracer_tracefs(&global_trace, d_tracer);
 
 	trace_create_file("tracing_thresh", 0644, d_tracer,
 			&global_trace, &tracing_thresh_fops);
@@ -6901,5 +6914,5 @@ __init static int clear_boot_tracer(void)
 	return 0;
 }
 
-fs_initcall(tracer_init_debugfs);
+fs_initcall(tracer_init_tracefs);
 late_initcall(clear_boot_tracer);
diff --git a/kernel/trace/trace.h b/kernel/trace/trace.h
index 0eddfeb05fee..ba1170cb4880 100644
--- a/kernel/trace/trace.h
+++ b/kernel/trace/trace.h
@@ -334,7 +334,7 @@ struct tracer_flags {
 
 
 /**
- * struct tracer - a specific tracer and its callbacks to interact with debugfs
+ * struct tracer - a specific tracer and its callbacks to interact with tracefs
  * @name: the name chosen to select it on the available_tracers file
  * @init: called when one switches to this tracer (echo name > current_tracer)
  * @reset: called when one switches to another tracer
diff --git a/kernel/trace/trace_events.c b/kernel/trace/trace_events.c
index 4ff8c1394017..e3b7782f904f 100644
--- a/kernel/trace/trace_events.c
+++ b/kernel/trace/trace_events.c
@@ -13,7 +13,7 @@
 #include <linux/workqueue.h>
 #include <linux/spinlock.h>
 #include <linux/kthread.h>
-#include <linux/debugfs.h>
+#include <linux/tracefs.h>
 #include <linux/uaccess.h>
 #include <linux/module.h>
 #include <linux/ctype.h>
@@ -480,7 +480,7 @@ static void remove_subsystem(struct ftrace_subsystem_dir *dir)
 		return;
 
 	if (!--dir->nr_events) {
-		debugfs_remove_recursive(dir->entry);
+		tracefs_remove_recursive(dir->entry);
 		list_del(&dir->list);
 		__put_system_dir(dir);
 	}
@@ -499,7 +499,7 @@ static void remove_event_file_dir(struct ftrace_event_file *file)
 		}
 		spin_unlock(&dir->d_lock);
 
-		debugfs_remove_recursive(dir);
+		tracefs_remove_recursive(dir);
 	}
 
 	list_del(&file->list);
@@ -1526,7 +1526,7 @@ event_subsystem_dir(struct trace_array *tr, const char *name,
 	} else
 		__get_system(system);
 
-	dir->entry = debugfs_create_dir(name, parent);
+	dir->entry = tracefs_create_dir(name, parent);
 	if (!dir->entry) {
 		pr_warn("Failed to create system directory %s\n", name);
 		__put_system(system);
@@ -1539,12 +1539,12 @@ event_subsystem_dir(struct trace_array *tr, const char *name,
 	dir->subsystem = system;
 	file->system = dir;
 
-	entry = debugfs_create_file("filter", 0644, dir->entry, dir,
+	entry = tracefs_create_file("filter", 0644, dir->entry, dir,
 				    &ftrace_subsystem_filter_fops);
 	if (!entry) {
 		kfree(system->filter);
 		system->filter = NULL;
-		pr_warn("Could not create debugfs '%s/filter' entry\n", name);
+		pr_warn("Could not create tracefs '%s/filter' entry\n", name);
 	}
 
 	trace_create_file("enable", 0644, dir->entry, dir,
@@ -1585,9 +1585,9 @@ event_create_dir(struct dentry *parent, struct ftrace_event_file *file)
 		d_events = parent;
 
 	name = ftrace_event_name(call);
-	file->dir = debugfs_create_dir(name, d_events);
+	file->dir = tracefs_create_dir(name, d_events);
 	if (!file->dir) {
-		pr_warn("Could not create debugfs '%s' directory\n", name);
+		pr_warn("Could not create tracefs '%s' directory\n", name);
 		return -1;
 	}
 
@@ -2228,7 +2228,7 @@ static inline int register_event_cmds(void) { return 0; }
 /*
  * The top level array has already had its ftrace_event_file
  * descriptors created in order to allow for early events to
- * be recorded. This function is called after the debugfs has been
+ * be recorded. This function is called after the tracefs has been
  * initialized, and we now have to create the files associated
  * to the events.
  */
@@ -2311,16 +2311,16 @@ create_event_toplevel_files(struct dentry *parent, struct trace_array *tr)
 	struct dentry *d_events;
 	struct dentry *entry;
 
-	entry = debugfs_create_file("set_event", 0644, parent,
+	entry = tracefs_create_file("set_event", 0644, parent,
 				    tr, &ftrace_set_event_fops);
 	if (!entry) {
-		pr_warn("Could not create debugfs 'set_event' entry\n");
+		pr_warn("Could not create tracefs 'set_event' entry\n");
 		return -ENOMEM;
 	}
 
-	d_events = debugfs_create_dir("events", parent);
+	d_events = tracefs_create_dir("events", parent);
 	if (!d_events) {
-		pr_warn("Could not create debugfs 'events' directory\n");
+		pr_warn("Could not create tracefs 'events' directory\n");
 		return -ENOMEM;
 	}
 
@@ -2412,7 +2412,7 @@ int event_trace_del_tracer(struct trace_array *tr)
 
 	down_write(&trace_event_sem);
 	__trace_remove_event_dirs(tr);
-	debugfs_remove_recursive(tr->event_dir);
+	tracefs_remove_recursive(tr->event_dir);
 	up_write(&trace_event_sem);
 
 	tr->event_dir = NULL;
@@ -2493,10 +2493,10 @@ static __init int event_trace_init(void)
 	if (IS_ERR(d_tracer))
 		return 0;
 
-	entry = debugfs_create_file("available_events", 0444, d_tracer,
+	entry = tracefs_create_file("available_events", 0444, d_tracer,
 				    tr, &ftrace_avail_fops);
 	if (!entry)
-		pr_warn("Could not create debugfs 'available_events' entry\n");
+		pr_warn("Could not create tracefs 'available_events' entry\n");
 
 	if (trace_define_common_fields())
 		pr_warn("tracing: Failed to allocate common fields");
diff --git a/kernel/trace/trace_functions_graph.c b/kernel/trace/trace_functions_graph.c
index 2d25ad1526bb..9cfea4c6d314 100644
--- a/kernel/trace/trace_functions_graph.c
+++ b/kernel/trace/trace_functions_graph.c
@@ -6,7 +6,6 @@
  * is Copyright (c) Steven Rostedt <srostedt@redhat.com>
  *
  */
-#include <linux/debugfs.h>
 #include <linux/uaccess.h>
 #include <linux/ftrace.h>
 #include <linux/slab.h>
@@ -151,7 +150,7 @@ ftrace_push_return_trace(unsigned long ret, unsigned long func, int *depth,
 	 * The curr_ret_stack is initialized to -1 and get increased
 	 * in this function.  So it can be less than -1 only if it was
 	 * filtered out via ftrace_graph_notrace_addr() which can be
-	 * set from set_graph_notrace file in debugfs by user.
+	 * set from set_graph_notrace file in tracefs by user.
 	 */
 	if (current->curr_ret_stack < -1)
 		return -EBUSY;
@@ -1432,7 +1431,7 @@ static const struct file_operations graph_depth_fops = {
 	.llseek		= generic_file_llseek,
 };
 
-static __init int init_graph_debugfs(void)
+static __init int init_graph_tracefs(void)
 {
 	struct dentry *d_tracer;
 
@@ -1445,7 +1444,7 @@ static __init int init_graph_debugfs(void)
 
 	return 0;
 }
-fs_initcall(init_graph_debugfs);
+fs_initcall(init_graph_tracefs);
 
 static __init int init_graph_trace(void)
 {
diff --git a/kernel/trace/trace_kprobe.c b/kernel/trace/trace_kprobe.c
index b4a00def88f5..c1c6655847c8 100644
--- a/kernel/trace/trace_kprobe.c
+++ b/kernel/trace/trace_kprobe.c
@@ -1310,7 +1310,7 @@ static int unregister_kprobe_event(struct trace_kprobe *tk)
 	return ret;
 }
 
-/* Make a debugfs interface for controlling probe points */
+/* Make a tracefs interface for controlling probe points */
 static __init int init_kprobe_trace(void)
 {
 	struct dentry *d_tracer;
@@ -1323,20 +1323,20 @@ static __init int init_kprobe_trace(void)
 	if (IS_ERR(d_tracer))
 		return 0;
 
-	entry = debugfs_create_file("kprobe_events", 0644, d_tracer,
+	entry = tracefs_create_file("kprobe_events", 0644, d_tracer,
 				    NULL, &kprobe_events_ops);
 
 	/* Event list interface */
 	if (!entry)
-		pr_warning("Could not create debugfs "
+		pr_warning("Could not create tracefs "
 			   "'kprobe_events' entry\n");
 
 	/* Profile interface */
-	entry = debugfs_create_file("kprobe_profile", 0444, d_tracer,
+	entry = tracefs_create_file("kprobe_profile", 0444, d_tracer,
 				    NULL, &kprobe_profile_ops);
 
 	if (!entry)
-		pr_warning("Could not create debugfs "
+		pr_warning("Could not create tracefs "
 			   "'kprobe_profile' entry\n");
 	return 0;
 }
diff --git a/kernel/trace/trace_probe.h b/kernel/trace/trace_probe.h
index 4f815fbce16d..19aff635841a 100644
--- a/kernel/trace/trace_probe.h
+++ b/kernel/trace/trace_probe.h
@@ -25,7 +25,7 @@
 #include <linux/seq_file.h>
 #include <linux/slab.h>
 #include <linux/smp.h>
-#include <linux/debugfs.h>
+#include <linux/tracefs.h>
 #include <linux/types.h>
 #include <linux/string.h>
 #include <linux/ctype.h>
diff --git a/kernel/trace/trace_stat.c b/kernel/trace/trace_stat.c
index 75e19e86c954..6cf935316769 100644
--- a/kernel/trace/trace_stat.c
+++ b/kernel/trace/trace_stat.c
@@ -12,7 +12,7 @@
 #include <linux/list.h>
 #include <linux/slab.h>
 #include <linux/rbtree.h>
-#include <linux/debugfs.h>
+#include <linux/tracefs.h>
 #include "trace_stat.h"
 #include "trace.h"
 
@@ -65,7 +65,7 @@ static void reset_stat_session(struct stat_session *session)
 
 static void destroy_session(struct stat_session *session)
 {
-	debugfs_remove(session->file);
+	tracefs_remove(session->file);
 	__reset_stat_session(session);
 	mutex_destroy(&session->stat_mutex);
 	kfree(session);
@@ -279,9 +279,9 @@ static int tracing_stat_init(void)
 	if (IS_ERR(d_tracing))
 		return 0;
 
-	stat_dir = debugfs_create_dir("trace_stat", d_tracing);
+	stat_dir = tracefs_create_dir("trace_stat", d_tracing);
 	if (!stat_dir)
-		pr_warning("Could not create debugfs "
+		pr_warning("Could not create tracefs "
 			   "'trace_stat' entry\n");
 	return 0;
 }
@@ -291,7 +291,7 @@ static int init_stat_file(struct stat_session *session)
 	if (!stat_dir && tracing_stat_init())
 		return -ENODEV;
 
-	session->file = debugfs_create_file(session->ts->name, 0644,
+	session->file = tracefs_create_file(session->ts->name, 0644,
 					    stat_dir,
 					    session, &tracing_stat_fops);
 	if (!session->file)
diff --git a/scripts/tags.sh b/scripts/tags.sh
index cdb491d84503..505231a09b07 100755
--- a/scripts/tags.sh
+++ b/scripts/tags.sh
@@ -228,7 +228,7 @@ exuberant()
 
 emacs()
 {
-	all_target_sources | xargs $1 -a                        \
+	all_target_sources | xargs $1 -a --no-members           \
 	--regex='/^\(ENTRY\|_GLOBAL\)(\([^)]*\)).*/\2/'         \
 	--regex='/^SYSCALL_DEFINE[0-9]?(\([^,)]*\).*/sys_\1/'   \
 	--regex='/^COMPAT_SYSCALL_DEFINE[0-9]?(\([^,)]*\).*/compat_sys_\1/' \
-- 
2.1.4




* [RFC][PATCH 3/5] tracing: Automatically mount tracefs on debugfs/tracing
  2015-01-21 17:19 [RFC][PATCH 0/5] tracing: Add new file system tracefs Steven Rostedt
  2015-01-21 17:19 ` [RFC][PATCH 1/5] tracefs: Add new tracefs file system Steven Rostedt
  2015-01-21 17:19 ` [RFC][PATCH 2/5] tracing: Convert the tracing facility over to use tracefs Steven Rostedt
@ 2015-01-21 17:19 ` Steven Rostedt
  2015-01-21 17:19 ` [RFC][PATCH 4/5] tracing: Have mkdir and rmdir be part of tracefs Steven Rostedt
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 26+ messages in thread
From: Steven Rostedt @ 2015-01-21 17:19 UTC (permalink / raw)
  To: linux-kernel
  Cc: Al Viro, Greg Kroah-Hartman, Ingo Molnar, Andrew Morton, Al Viro

[-- Attachment #1: 0003-tracing-Automatically-mount-tracefs-on-debugfs-traci.patch --]
[-- Type: text/plain, Size: 3737 bytes --]

From: "Steven Rostedt (Red Hat)" <rostedt@goodmis.org>

As tools currently rely on the tracing directory in debugfs, we cannot
just create a tracefs infrastructure and expect sysadmins to mount
the new tracefs to have their old tools work.

Instead, the debugfs tracing directory is still created and the tracefs
file system is mounted there when the debugfs filesystem is mounted.

The tracing infrastructure no longer updates the debugfs file system,
but instead interacts with the tracefs file system. To the user it still
appears as if nothing changed, except that the tracing system can now be
mounted on its own without needing all of debugfs!

Note, because debugfs_create_dir() happens to end up setting the
dentry->d_op, we can not use d_set_d_op() but must manually assign the
new op, that has automount set, to the dentry returned. This can be
racy, but since this happens during the initcall sequence on boot up,
there should be nothing that races with it.

Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
---
 kernel/trace/trace.c | 51 ++++++++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 50 insertions(+), 1 deletion(-)

diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
index a51a00317abe..e025e6f04bfa 100644
--- a/kernel/trace/trace.c
+++ b/kernel/trace/trace.c
@@ -32,6 +32,7 @@
 #include <linux/splice.h>
 #include <linux/kdebug.h>
 #include <linux/string.h>
+#include <linux/mount.h>
 #include <linux/rwsem.h>
 #include <linux/slab.h>
 #include <linux/ctype.h>
@@ -5815,10 +5816,35 @@ static __init int register_snapshot_cmd(void)
 static inline __init int register_snapshot_cmd(void) { return 0; }
 #endif /* defined(CONFIG_TRACER_SNAPSHOT) && defined(CONFIG_DYNAMIC_FTRACE) */
 
+static struct vfsmount *trace_automount(struct path *path)
+{
+	struct vfsmount *mnt;
+	struct file_system_type *type;
+
+	/*
+	 * To maintain backward compatibility for tools that mount
+	 * debugfs to get to the tracing facility, tracefs is automatically
+	 * mounted to the debugfs/tracing directory.
+	 */
+	type = get_fs_type("tracefs");
+	if (!type)
+		return NULL;
+	mnt = vfs_kern_mount(type, 0, "tracefs", NULL);
+	put_filesystem(type);
+	if (IS_ERR(mnt))
+		return NULL;
+	mntget(mnt);
+
+	return mnt;
+}
+
 #define TRACE_TOP_DIR_ENTRY		((struct dentry *)1)
 
 struct dentry *tracing_init_dentry_tr(struct trace_array *tr)
 {
+	static struct dentry_operations trace_ops;
+	struct dentry *traced;
+
 	/* Top entry does not have a descriptor */
 	if (tr->dir == TRACE_TOP_DIR_ENTRY)
 		return NULL;
@@ -5831,7 +5857,30 @@ struct dentry *tracing_init_dentry_tr(struct trace_array *tr)
 		return ERR_PTR(-ENODEV);
 
 	if (tr->flags & TRACE_ARRAY_FL_GLOBAL) {
-		tr->dir = debugfs_create_dir("tracing", NULL);
+		traced = debugfs_create_dir("tracing", NULL);
+		if (!traced)
+			return ERR_PTR(-ENOMEM);
+		/* copy the dentry ops and add an automount to it */
+		if (traced->d_op) {
+			/*
+			 * FIXME:
+			 * Currently debugfs sets the d_op by a side-effect
+			 * of calling simple_lookup(). Normally, we should
+			 * never change d_op of a dentry, but as this is
+			 * happening at boot up and shouldn't be racing with
+			 * any other users, this should be OK. But it is still
+			 * a hack, and needs to be properly done.
+			 */
+			trace_ops = *traced->d_op;
+			trace_ops.d_automount = trace_automount;
+			traced->d_flags |= DCACHE_NEED_AUTOMOUNT;
+			traced->d_op = &trace_ops;
+		} else {
+			/* Ideally, this is what should happen */
+			trace_ops = simple_dentry_operations;
+			trace_ops.d_automount = trace_automount;
+			d_set_d_op(traced, &trace_ops);
+		}
 		tr->dir = TRACE_TOP_DIR_ENTRY;
 		return NULL;
 	}
-- 
2.1.4
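
For readers following the d_op handling described in the commit message
of patch 3, here is a minimal userspace sketch of the same pattern: copy
an existing operations table, override one member, and point the object
at the copy. It is not kernel code. Every toy_* name and the
TOY_NEED_AUTOMOUNT flag are made up for illustration, and the two
branches of the patch are folded into one helper.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative stand-ins for dentry_operations and dentry. */
struct toy_dentry_ops {
	int (*d_delete)(const void *dentry);
	const char *(*d_automount)(const void *path);
};

struct toy_dentry {
	const struct toy_dentry_ops *d_op;
	unsigned int d_flags;
};

#define TOY_NEED_AUTOMOUNT 0x1u

static int toy_always_delete(const void *dentry)
{
	(void)dentry;
	return 1;
}

static const char *toy_automount(const void *path)
{
	(void)path;
	return "tracefs";
}

/* Default table, playing the role of simple_dentry_operations. */
static const struct toy_dentry_ops toy_simple_ops = {
	.d_delete = toy_always_delete,
};

/*
 * Copy whatever ops the dentry already has (or the defaults), add the
 * automount hook to the copy, and switch the dentry over to the copy.
 * The shared default table is never modified.
 */
void toy_add_automount(struct toy_dentry *d)
{
	static struct toy_dentry_ops trace_ops;

	assert(d != NULL);
	if (d->d_op)
		trace_ops = *d->d_op;
	else
		trace_ops = toy_simple_ops;
	trace_ops.d_automount = toy_automount;
	d->d_flags |= TOY_NEED_AUTOMOUNT;
	d->d_op = &trace_ops;
}
```

As in the patch, the copy lives in a static variable, so this only works
for a single dentry and relies on nothing else looking at d_op while it
is being swapped.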




* [RFC][PATCH 4/5] tracing: Have mkdir and rmdir be part of tracefs
  2015-01-21 17:19 [RFC][PATCH 0/5] tracing: Add new file system tracefs Steven Rostedt
                   ` (2 preceding siblings ...)
  2015-01-21 17:19 ` [RFC][PATCH 3/5] tracing: Automatically mount tracefs on debugfs/tracing Steven Rostedt
@ 2015-01-21 17:19 ` Steven Rostedt
  2015-01-21 18:31   ` Steven Rostedt
  2015-01-21 20:47   ` Steven Rostedt
  2015-01-21 17:19 ` [RFC][PATCH 5/5] tracefs: Add directory /sys/kernel/tracing Steven Rostedt
                   ` (2 subsequent siblings)
  6 siblings, 2 replies; 26+ messages in thread
From: Steven Rostedt @ 2015-01-21 17:19 UTC (permalink / raw)
  To: linux-kernel
  Cc: Al Viro, Greg Kroah-Hartman, Ingo Molnar, Andrew Morton, Al Viro

[-- Attachment #1: 0004-tracing-Have-mkdir-and-rmdir-be-part-of-tracefs.patch --]
[-- Type: text/plain, Size: 10874 bytes --]

From: "Steven Rostedt (Red Hat)" <rostedt@goodmis.org>

The tracing "instances" directory can create sub tracing buffers
with mkdir, and remove them with rmdir. As a mkdir will also create
all the files and directories that control the sub buffer, the locks
needed to be released before doing so to avoid deadlock. This method
was not very robust, and could potentially have a race somewhere due
to the locks being released in the middle of removing the directory.
But this was needed because debugfs did not provide a mkdir or rmdir
method from syscalls.

Now that tracing has been converted over to tracefs, the tracefs file
system can be modified to accommodate this feature. Instead of needing
to release the locks, keep them locked but add a way to flag that they
are locked and do not need to be locked again.

A struct tracefs_dir_ops is created that holds the methods to be called
for both mkdir and rmdir, as well as a pointer to let the tracefs subsystem
know that the current inode's lock is already held by the calling process.

The pointer holds the current owner of the lock, and this is checked when
creating new files or removing old ones, and if the pointer matches current,
then the lock is not taken to avoid the deadlock.

Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
---
 fs/tracefs/inode.c   | 103 +++++++++++++++++++++++++++++++++++++++++++++------
 kernel/trace/trace.c |  68 ++--------------------------------
 2 files changed, 96 insertions(+), 75 deletions(-)

diff --git a/fs/tracefs/inode.c b/fs/tracefs/inode.c
index d243f670d461..29632ce2c456 100644
--- a/fs/tracefs/inode.c
+++ b/fs/tracefs/inode.c
@@ -42,13 +42,62 @@ static ssize_t default_write_file(struct file *file, const char __user *buf,
 	return count;
 }
 
-const struct file_operations tracefs_file_operations = {
+static const struct file_operations tracefs_file_operations = {
 	.read =		default_read_file,
 	.write =	default_write_file,
 	.open =		simple_open,
 	.llseek =	noop_llseek,
 };
 
+static int tracefs_syscall_mkdir(struct inode *inode, struct dentry *dentry, umode_t mode)
+{
+	struct tracefs_dir_ops *ops = inode ? inode->i_private : NULL;
+	int ret;
+
+	if (!ops)
+		return -EPERM;
+
+	/*
+	 * The mkdir call can call the generic functions that create
+	 * the files within the tracefs system. Do not relock the parent,
+	 * it was locked by the caller of this function.
+	 */
+	ops->lock_owner = current;
+	ret = ops->mkdir(dentry->d_iname);
+	ops->lock_owner = NULL;
+
+	return ret;
+}
+
+static int tracefs_syscall_rmdir(struct inode *inode, struct dentry *dentry)
+{
+	struct tracefs_dir_ops *ops = inode->i_private;
+	int ret;
+
+	if (!ops)
+		return -EPERM;
+
+	/*
+	 * The rmdir call can call the generic functions that remove
+	 * the files within the tracefs system. Do not relock the parent,
+	 * it was locked by the caller of this function.
+	 */
+	ops->lock_owner = current;
+	/* The caller also locked the dentry, do not lock that again either */
+	dentry->d_inode->i_private = ops;
+	ret = ops->rmdir(dentry->d_iname);
+	dentry->d_inode->i_private = NULL;
+	ops->lock_owner = NULL;
+
+	return ret;
+}
+
+const struct inode_operations tracefs_dir_inode_operations = {
+	.lookup		= simple_lookup,
+	.mkdir		= tracefs_syscall_mkdir,
+	.rmdir		= tracefs_syscall_rmdir,
+};
+
 static struct inode *tracefs_get_inode(struct super_block *sb, umode_t mode, dev_t dev,
 				      void *data, const struct file_operations *fops)
 
@@ -68,7 +117,7 @@ static struct inode *tracefs_get_inode(struct super_block *sb, umode_t mode, dev
 			inode->i_private = data;
 			break;
 		case S_IFDIR:
-			inode->i_op = &simple_dir_inode_operations;
+			inode->i_op = &tracefs_dir_inode_operations;
 			inode->i_fop = &simple_dir_operations;
 
 			/* directory inodes start off with i_nlink == 2
@@ -125,6 +174,16 @@ static int tracefs_create(struct inode *dir, struct dentry *dentry, umode_t mode
 	return res;
 }
 
+void tracefs_add_dir_ops(struct dentry *dentry, struct tracefs_dir_ops *ops)
+{
+	struct inode *inode = dentry->d_inode;
+
+	if (!inode)
+		return;
+
+	inode->i_private = ops;
+}
+
 struct tracefs_mount_opts {
 	kuid_t uid;
 	kgid_t gid;
@@ -305,7 +364,9 @@ static struct dentry *__create_file(const char *name, umode_t mode,
 				    struct dentry *parent, void *data,
 				    const struct file_operations *fops)
 {
+	struct tracefs_dir_ops *ops;
 	struct dentry *dentry = NULL;
+	struct inode *parent_inode;
 	int error;
 
 	pr_debug("tracefs: creating file '%s'\n",name);
@@ -323,7 +384,12 @@ static struct dentry *__create_file(const char *name, umode_t mode,
 	if (!parent)
 		parent = tracefs_mount->mnt_root;
 
-	mutex_lock(&parent->d_inode->i_mutex);
+	parent_inode = parent->d_inode;
+	ops = parent_inode->i_private;
+
+	if (!ops || ops->lock_owner != current)
+		mutex_lock(&parent->d_inode->i_mutex);
+
 	dentry = lookup_one_len(name, parent, strlen(name));
 	if (!IS_ERR(dentry)) {
 		switch (mode & S_IFMT) {
@@ -339,7 +405,8 @@ static struct dentry *__create_file(const char *name, umode_t mode,
 		dput(dentry);
 	} else
 		error = PTR_ERR(dentry);
-	mutex_unlock(&parent->d_inode->i_mutex);
+	if (!ops || ops->lock_owner != current)
+		mutex_unlock(&parent->d_inode->i_mutex);
 
 	if (error) {
 		dentry = NULL;
@@ -452,7 +519,9 @@ static int __tracefs_remove(struct dentry *dentry, struct dentry *parent)
  */
 void tracefs_remove(struct dentry *dentry)
 {
+	struct tracefs_dir_ops *ops;
 	struct dentry *parent;
+	struct inode *inode;
 	int ret;
 
 	if (IS_ERR_OR_NULL(dentry))
@@ -462,9 +531,13 @@ void tracefs_remove(struct dentry *dentry)
 	if (!parent || !parent->d_inode)
 		return;
 
-	mutex_lock(&parent->d_inode->i_mutex);
+	inode = parent->d_inode;
+	ops = inode->i_private;
+	if (!ops || ops->lock_owner != current)
+		mutex_lock(&parent->d_inode->i_mutex);
 	ret = __tracefs_remove(dentry, parent);
-	mutex_unlock(&parent->d_inode->i_mutex);
+	if (!ops || ops->lock_owner != current)
+		mutex_unlock(&parent->d_inode->i_mutex);
 	if (!ret)
 		simple_release_fs(&tracefs_mount, &tracefs_mount_count);
 }
@@ -479,6 +552,7 @@ void tracefs_remove(struct dentry *dentry)
  */
 void tracefs_remove_recursive(struct dentry *dentry)
 {
+	struct tracefs_dir_ops *ops;
 	struct dentry *child, *parent;
 
 	if (IS_ERR_OR_NULL(dentry))
@@ -490,7 +564,9 @@ void tracefs_remove_recursive(struct dentry *dentry)
 
 	parent = dentry;
  down:
-	mutex_lock(&parent->d_inode->i_mutex);
+	ops = parent->d_inode->i_private;
+	if (!ops || ops->lock_owner != current)
+		mutex_lock(&parent->d_inode->i_mutex);
  loop:
 	/*
 	 * The parent->d_subdirs is protected by the d_lock. Outside that
@@ -505,7 +581,8 @@ void tracefs_remove_recursive(struct dentry *dentry)
 		/* perhaps simple_empty(child) makes more sense */
 		if (!list_empty(&child->d_subdirs)) {
 			spin_unlock(&parent->d_lock);
-			mutex_unlock(&parent->d_inode->i_mutex);
+			if (!ops || ops->lock_owner != current)
+				mutex_unlock(&parent->d_inode->i_mutex);
 			parent = child;
 			goto down;
 		}
@@ -526,10 +603,13 @@ void tracefs_remove_recursive(struct dentry *dentry)
 	}
 	spin_unlock(&parent->d_lock);
 
-	mutex_unlock(&parent->d_inode->i_mutex);
+	if (!ops || ops->lock_owner != current)
+		mutex_unlock(&parent->d_inode->i_mutex);
 	child = parent;
 	parent = parent->d_parent;
-	mutex_lock(&parent->d_inode->i_mutex);
+	ops = parent->d_inode->i_private;
+	if (!ops || ops->lock_owner != current)
+		mutex_lock(&parent->d_inode->i_mutex);
 
 	if (child != dentry)
 		/* go up */
@@ -537,7 +617,8 @@ void tracefs_remove_recursive(struct dentry *dentry)
 
 	if (!__tracefs_remove(child, parent))
 		simple_release_fs(&tracefs_mount, &tracefs_mount_count);
-	mutex_unlock(&parent->d_inode->i_mutex);
+	if (!ops || ops->lock_owner != current)
+		mutex_unlock(&parent->d_inode->i_mutex);
 }
 
 /**
diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
index e025e6f04bfa..78bf3007cfc2 100644
--- a/kernel/trace/trace.c
+++ b/kernel/trace/trace.c
@@ -6349,7 +6349,7 @@ static void free_trace_buffers(struct trace_array *tr)
 #endif
 }
 
-static int new_instance_create(const char *name)
+static int instance_mkdir(const char *name)
 {
 	struct trace_array *tr;
 	int ret;
@@ -6419,7 +6419,7 @@ static int new_instance_create(const char *name)
 
 }
 
-static int instance_delete(const char *name)
+static int instance_rmdir(const char *name)
 {
 	struct trace_array *tr;
 	int found = 0;
@@ -6460,66 +6460,7 @@ static int instance_delete(const char *name)
 	return ret;
 }
 
-static int instance_mkdir (struct inode *inode, struct dentry *dentry, umode_t mode)
-{
-	struct dentry *parent;
-	int ret;
-
-	/* Paranoid: Make sure the parent is the "instances" directory */
-	parent = hlist_entry(inode->i_dentry.first, struct dentry, d_u.d_alias);
-	if (WARN_ON_ONCE(parent != trace_instance_dir))
-		return -ENOENT;
-
-	/*
-	 * The inode mutex is locked, but tracefs_create_dir() will also
-	 * take the mutex. As the instances directory can not be destroyed
-	 * or changed in any other way, it is safe to unlock it, and
-	 * let the dentry try. If two users try to make the same dir at
-	 * the same time, then the new_instance_create() will determine the
-	 * winner.
-	 */
-	mutex_unlock(&inode->i_mutex);
-
-	ret = new_instance_create(dentry->d_iname);
-
-	mutex_lock(&inode->i_mutex);
-
-	return ret;
-}
-
-static int instance_rmdir(struct inode *inode, struct dentry *dentry)
-{
-	struct dentry *parent;
-	int ret;
-
-	/* Paranoid: Make sure the parent is the "instances" directory */
-	parent = hlist_entry(inode->i_dentry.first, struct dentry, d_u.d_alias);
-	if (WARN_ON_ONCE(parent != trace_instance_dir))
-		return -ENOENT;
-
-	/* The caller did a dget() on dentry */
-	mutex_unlock(&dentry->d_inode->i_mutex);
-
-	/*
-	 * The inode mutex is locked, but tracefs_create_dir() will also
-	 * take the mutex. As the instances directory can not be destroyed
-	 * or changed in any other way, it is safe to unlock it, and
-	 * let the dentry try. If two users try to make the same dir at
-	 * the same time, then the instance_delete() will determine the
-	 * winner.
-	 */
-	mutex_unlock(&inode->i_mutex);
-
-	ret = instance_delete(dentry->d_iname);
-
-	mutex_lock_nested(&inode->i_mutex, I_MUTEX_PARENT);
-	mutex_lock(&dentry->d_inode->i_mutex);
-
-	return ret;
-}
-
-static const struct inode_operations instance_dir_inode_operations = {
-	.lookup		= simple_lookup,
+static struct tracefs_dir_ops instance_dir_ops = {
 	.mkdir		= instance_mkdir,
 	.rmdir		= instance_rmdir,
 };
@@ -6530,8 +6471,7 @@ static __init void create_trace_instances(struct dentry *d_tracer)
 	if (WARN_ON(!trace_instance_dir))
 		return;
 
-	/* Hijack the dir inode operations, to allow mkdir */
-	trace_instance_dir->d_inode->i_op = &instance_dir_inode_operations;
+	tracefs_add_dir_ops(trace_instance_dir, &instance_dir_ops);
 }
 
 static void
-- 
2.1.4
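
The lock_owner check described in the commit message of patch 4 can be
tried out in userspace with a small single-threaded sketch. This is only
an illustration under simplifying assumptions: the toy_* names are
invented, the "lock" is a flag that reports -EDEADLK on a second acquire
instead of hanging, and the task identity is just a pointer the caller
passes in.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Non-recursive toy lock: a second acquire reports the deadlock. */
struct toy_lock { int held; };

static int toy_lock_acquire(struct toy_lock *l)
{
	if (l->held)
		return -EDEADLK;
	l->held = 1;
	return 0;
}

static void toy_lock_release(struct toy_lock *l)
{
	assert(l->held);
	l->held = 0;
}

/* Stand-in for struct tracefs_dir_ops; the mkdir/rmdir callbacks are omitted. */
struct toy_dir_ops {
	const void *lock_owner;	/* task that already holds the directory lock */
};

struct toy_dir {
	struct toy_lock lock;
	struct toy_dir_ops *ops;	/* plays the role of inode->i_private */
	int nr_files;
};

/* Same test as the patched __create_file(): skip the lock if we own it. */
static int toy_create_file(struct toy_dir *parent, const void *current_task)
{
	struct toy_dir_ops *ops = parent->ops;
	int need_lock = !ops || ops->lock_owner != current_task;
	int ret;

	if (need_lock) {
		ret = toy_lock_acquire(&parent->lock);
		if (ret)
			return ret;
	}
	parent->nr_files++;
	if (need_lock)
		toy_lock_release(&parent->lock);
	return 0;
}

/*
 * Models a mkdir syscall: the directory lock is taken first (the VFS
 * does this in the kernel), then a file is created inside the directory.
 * With mark_owner == 0 the owner is not recorded and the nested create
 * runs into the lock that is already held.
 */
int toy_syscall_mkdir(struct toy_dir *dir, const void *current_task, int mark_owner)
{
	int ret;

	assert(dir->ops != NULL);
	ret = toy_lock_acquire(&dir->lock);
	if (ret)
		return ret;
	if (mark_owner)
		dir->ops->lock_owner = current_task;
	ret = toy_create_file(dir, current_task);
	dir->ops->lock_owner = NULL;
	toy_lock_release(&dir->lock);
	return ret;
}
```

Calling toy_syscall_mkdir() with mark_owner set to 1 succeeds and adds
one file, while the same call with mark_owner set to 0 returns -EDEADLK,
which is the situation the old unlock/relock hack was working around.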




* [RFC][PATCH 5/5] tracefs: Add directory /sys/kernel/tracing
  2015-01-21 17:19 [RFC][PATCH 0/5] tracing: Add new file system tracefs Steven Rostedt
                   ` (3 preceding siblings ...)
  2015-01-21 17:19 ` [RFC][PATCH 4/5] tracing: Have mkdir and rmdir be part of tracefs Steven Rostedt
@ 2015-01-21 17:19 ` Steven Rostedt
  2015-01-21 17:32 ` [RFC][PATCH 0/5] tracing: Add new file system tracefs Steven Rostedt
  2015-01-21 23:00 ` Greg Kroah-Hartman
  6 siblings, 0 replies; 26+ messages in thread
From: Steven Rostedt @ 2015-01-21 17:19 UTC (permalink / raw)
  To: linux-kernel; +Cc: Al Viro, Greg Kroah-Hartman, Ingo Molnar, Andrew Morton

[-- Attachment #1: 0005-tracefs-Add-directory-sys-kernel-tracing.patch --]
[-- Type: text/plain, Size: 1126 bytes --]

From: "Steven Rostedt (Red Hat)" <rostedt@goodmis.org>

When tracefs is configured, have the directory /sys/kernel/tracing appear
just like /sys/kernel/debug appears when debugfs is configured.

This will give a consistent place for system admins to mount tracefs.

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
---
 fs/tracefs/inode.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/fs/tracefs/inode.c b/fs/tracefs/inode.c
index 29632ce2c456..e5d233aa4edd 100644
--- a/fs/tracefs/inode.c
+++ b/fs/tracefs/inode.c
@@ -16,6 +16,7 @@
 #include <linux/module.h>
 #include <linux/fs.h>
 #include <linux/mount.h>
+#include <linux/kobject.h>
 #include <linux/namei.h>
 #include <linux/tracefs.h>
 #include <linux/fsnotify.h>
@@ -629,10 +630,16 @@ bool tracefs_initialized(void)
 	return tracefs_registered;
 }
 
+static struct kobject *trace_kobj;
+
 static int __init tracefs_init(void)
 {
 	int retval;
 
+	trace_kobj = kobject_create_and_add("tracing", kernel_kobj);
+	if (!trace_kobj)
+		return -EINVAL;
+
 	retval = register_filesystem(&trace_fs_type);
 	if (!retval)
 		tracefs_registered = true;
-- 
2.1.4
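
Since patch 5 only provides the mount point, a tool that wants to mount
tracefs on /sys/kernel/tracing may first want to check that the kernel
registered the file system at all. Registered file systems are listed in
/proc/filesystems, one per line, with the type name after a tab and an
optional "nodev" in front of it. The helper below is a hedged sketch of
such a check on text in that format; fs_listed() is a hypothetical name,
not an existing API, and reading the file and doing the actual mount are
left to the caller.

```c
#include <assert.h>
#include <string.h>

/*
 * Return 1 if `name` appears as a file system type in text formatted
 * like /proc/filesystems ("nodev\ttracefs\n", "\text4\n"), else 0.
 */
int fs_listed(const char *listing, const char *name)
{
	size_t want = strlen(name);
	const char *line = listing;

	assert(want > 0);
	while (*line) {
		size_t line_len = strcspn(line, "\n");
		const char *type = memchr(line, '\t', line_len);

		/* The type name is everything after the tab on this line. */
		if (type) {
			type++;
			if ((size_t)(line + line_len - type) == want &&
			    memcmp(type, name, want) == 0)
				return 1;
		}
		line += line_len;
		if (*line == '\n')
			line++;
	}
	return 0;
}
```

The comparison is on the whole type name, so a listing that only
contains a longer name starting with "tracefs" does not match.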




* Re: [RFC][PATCH 2/5] tracing: Convert the tracing facility over to use tracefs
  2015-01-21 17:19 ` [RFC][PATCH 2/5] tracing: Convert the tracing facility over to use tracefs Steven Rostedt
@ 2015-01-21 17:32   ` Steven Rostedt
  0 siblings, 0 replies; 26+ messages in thread
From: Steven Rostedt @ 2015-01-21 17:32 UTC (permalink / raw)
  To: linux-kernel; +Cc: Al Viro, Greg Kroah-Hartman, Ingo Molnar, Andrew Morton

On Wed, 21 Jan 2015 12:19:55 -0500
Steven Rostedt <rostedt@goodmis.org> wrote:

>  scripts/tags.sh                      |  2 +-
>  9 files changed, 81 insertions(+), 69 deletions(-)
> 


> diff --git a/scripts/tags.sh b/scripts/tags.sh
> index cdb491d84503..505231a09b07 100755
> --- a/scripts/tags.sh
> +++ b/scripts/tags.sh
> @@ -228,7 +228,7 @@ exuberant()
>  
>  emacs()
>  {
> -	all_target_sources | xargs $1 -a                        \
> +	all_target_sources | xargs $1 -a --no-members           \
>  	--regex='/^\(ENTRY\|_GLOBAL\)(\([^)]*\)).*/\2/'         \
>  	--regex='/^SYSCALL_DEFINE[0-9]?(\([^,)]*\).*/sys_\1/'   \
>  	--regex='/^COMPAT_SYSCALL_DEFINE[0-9]?(\([^,)]*\).*/compat_sys_\1/' \

Oops! I applied my "fix make tags" patch to do a new "make TAGS" and
forgot to remove it when doing the "git commit -a -s".

Will remove this from the series.

-- Steve



* Re: [RFC][PATCH 0/5] tracing: Add new file system tracefs
  2015-01-21 17:19 [RFC][PATCH 0/5] tracing: Add new file system tracefs Steven Rostedt
                   ` (4 preceding siblings ...)
  2015-01-21 17:19 ` [RFC][PATCH 5/5] tracefs: Add directory /sys/kernel/tracing Steven Rostedt
@ 2015-01-21 17:32 ` Steven Rostedt
  2015-01-21 23:00 ` Greg Kroah-Hartman
  6 siblings, 0 replies; 26+ messages in thread
From: Steven Rostedt @ 2015-01-21 17:32 UTC (permalink / raw)
  To: linux-kernel; +Cc: Al Viro, Greg Kroah-Hartman, Ingo Molnar, Andrew Morton

On Wed, 21 Jan 2015 12:19:53 -0500
Steven Rostedt <rostedt@goodmis.org> wrote:


> ----
>  fs/Makefile                          |   1 +
>  fs/tracefs/Makefile                  |   4 +
>  fs/tracefs/inode.c                   | 649 +++++++++++++++++++++++++++++++++++
>  include/uapi/linux/magic.h           |   2 +
>  kernel/trace/ftrace.c                |  22 +-
>  kernel/trace/trace.c                 | 178 +++++-----
>  kernel/trace/trace.h                 |   2 +-
>  kernel/trace/trace_events.c          |  32 +-
>  kernel/trace/trace_functions_graph.c |   7 +-
>  kernel/trace/trace_kprobe.c          |  10 +-
>  kernel/trace/trace_probe.h           |   2 +-
>  kernel/trace/trace_stat.c            |  10 +-


>  scripts/tags.sh                      |   2 +-

Ignore this file. It was added by mistake (will remove in next version).

-- Steve

>  13 files changed, 789 insertions(+), 132 deletions(-)



* Re: [RFC][PATCH 1/5] tracefs: Add new tracefs file system
  2015-01-21 17:19 ` [RFC][PATCH 1/5] tracefs: Add new tracefs file system Steven Rostedt
@ 2015-01-21 18:30   ` Steven Rostedt
  0 siblings, 0 replies; 26+ messages in thread
From: Steven Rostedt @ 2015-01-21 18:30 UTC (permalink / raw)
  To: linux-kernel; +Cc: Al Viro, Greg Kroah-Hartman, Ingo Molnar, Andrew Morton

I need to learn how to use git better :-/

I forgot to git add include/linux/tracefs.h:

-- Steve

---
 include/linux/tracefs.h    |  41 ++++
 create mode 100644 include/linux/tracefs.h

diff --git a/include/linux/tracefs.h b/include/linux/tracefs.h
new file mode 100644
index 000000000000..23e04ce21749
--- /dev/null
+++ b/include/linux/tracefs.h
@@ -0,0 +1,41 @@
+/*
+ *  tracefs.h - a pseudo file system for activating tracing
+ *
+ * Based on debugfs by: 2004 Greg Kroah-Hartman <greg@kroah.com>
+ *
+ *  Copyright (C) 2014 Red Hat Inc, author: Steven Rostedt <srostedt@redhat.com>
+ *
+ *	This program is free software; you can redistribute it and/or
+ *	modify it under the terms of the GNU General Public License version
+ *	2 as published by the Free Software Foundation.
+ *
+ * tracefs is the file system that is used by the tracing infrastructure.
+ *
+ */
+
+#ifndef _TRACEFS_H_
+#define _TRACEFS_H_
+
+#include <linux/fs.h>
+#include <linux/seq_file.h>
+
+#include <linux/types.h>
+
+struct file_operations;
+
+#ifdef CONFIG_TRACING
+
+struct dentry *tracefs_create_file(const char *name, umode_t mode,
+				   struct dentry *parent, void *data,
+				   const struct file_operations *fops);
+
+struct dentry *tracefs_create_dir(const char *name, struct dentry *parent);
+
+void tracefs_remove(struct dentry *dentry);
+void tracefs_remove_recursive(struct dentry *dentry);
+
+bool tracefs_initialized(void);
+
+#endif /* CONFIG_TRACING */
+
+#endif


* Re: [RFC][PATCH 4/5] tracing: Have mkdir and rmdir be part of tracefs
  2015-01-21 17:19 ` [RFC][PATCH 4/5] tracing: Have mkdir and rmdir be part of tracefs Steven Rostedt
@ 2015-01-21 18:31   ` Steven Rostedt
  2015-01-21 20:47   ` Steven Rostedt
  1 sibling, 0 replies; 26+ messages in thread
From: Steven Rostedt @ 2015-01-21 18:31 UTC (permalink / raw)
  To: linux-kernel; +Cc: Al Viro, Greg Kroah-Hartman, Ingo Molnar, Andrew Morton


The update to the missing tracefs.h.

-- Steve

---
 include/linux/tracefs.h |   8 ++++

diff --git a/include/linux/tracefs.h b/include/linux/tracefs.h
index 23e04ce21749..f8c58ab18bca 100644
--- a/include/linux/tracefs.h
+++ b/include/linux/tracefs.h
@@ -34,6 +34,14 @@ struct dentry *tracefs_create_dir(const char *name, struct dentry *parent);
 void tracefs_remove(struct dentry *dentry);
 void tracefs_remove_recursive(struct dentry *dentry);
 
+struct tracefs_dir_ops {
+	int (*mkdir)(const char *name);
+	int (*rmdir)(const char *name);
+	struct task_struct *lock_owner;
+};
+
+void tracefs_add_dir_ops(struct dentry *dentry, struct tracefs_dir_ops *ops);
+
 bool tracefs_initialized(void);
 
 #endif /* CONFIG_TRACING */
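
The lock_owner field is described in the patch 4/5 changelog as a pointer
to the task that already holds the directory inode's lock: file creation
and removal compare it against current and skip taking the lock when the
two match. Below is a minimal user-space sketch of that check. All names
(mock_task, mock_inode, mock_dir_ops, dir_lock, dir_unlock) are
hypothetical, and a plain counter stands in for the kernel mutex. It is
not the code from fs/tracefs/inode.c.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct task_struct. */
struct mock_task {
	int pid;
};

/* Hypothetical stand-in for an inode; lock_depth is 1 while "locked". */
struct mock_inode {
	int lock_depth;
};

/* Mirrors only the lock_owner field of struct tracefs_dir_ops. */
struct mock_dir_ops {
	const struct mock_task *lock_owner;
};

/*
 * Take the directory lock unless the calling task is recorded as owner.
 * Returns true if this call took the lock and must release it later.
 */
bool dir_lock(struct mock_inode *dir, const struct mock_dir_ops *ops,
	      const struct mock_task *current_task)
{
	if (ops && ops->lock_owner == current_task)
		return false;		/* already held by us: do not relock */
	dir->lock_depth++;		/* a real mutex_lock() would go here */
	return true;
}

/* Release only what dir_lock() actually took. */
void dir_unlock(struct mock_inode *dir, bool took_lock)
{
	if (took_lock)
		dir->lock_depth--;
}
```

A caller keeps the return value of dir_lock() and hands it back to
dir_unlock(), so the path entered from mkdir never unlocks a mutex it did
not take.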


* Re: [RFC][PATCH 4/5] tracing: Have mkdir and rmdir be part of tracefs
  2015-01-21 17:19 ` [RFC][PATCH 4/5] tracing: Have mkdir and rmdir be part of tracefs Steven Rostedt
  2015-01-21 18:31   ` Steven Rostedt
@ 2015-01-21 20:47   ` Steven Rostedt
  1 sibling, 0 replies; 26+ messages in thread
From: Steven Rostedt @ 2015-01-21 20:47 UTC (permalink / raw)
  To: linux-kernel; +Cc: Al Viro, Greg Kroah-Hartman, Ingo Molnar, Andrew Morton

On Wed, 21 Jan 2015 12:19:57 -0500
Steven Rostedt <rostedt@goodmis.org> wrote:

> From: "Steven Rostedt (Red Hat)" <rostedt@goodmis.org>
> 
> The tracing "instances" directory can create sub tracing buffers
> with mkdir, and remove them with rmdir. As a mkdir will also create
> all the files and directories that control the sub buffer the locks
> needed to be released before doing so to avoid deadlock. This method
> was not very robust, and could potentially have a race somewhere due
> to the lock releasing within the removing of the directory. But this
> was needed because debugfs did not provide a mkdir or rmdir method
> from syscalls.
> 
> Now that tracing has been converted over to tracefs, the tracefs file
> system can be modified to accommodate this feature. Instead of needing
> to release the locks, keep them locked but add a way to flag that they
> are locked and do not need to be locked again.
> 
> A struct tracefs_dir_ops is created that holds the methods to be called
> for both mkdir and rmdir, as well as a pointer to let the tracefs subsystem
> know that the current inode's lock is already held by the calling process.
> 
> The pointer holds the current owner of the lock, and this is checked when
> creating new files or removing old ones, and if the pointer matches current,
> then the lock is not taken to avoid the deadlock.

Grumble, my tests triggered this bug with lockdep on:

 Possible unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock(&sb->s_type->i_mutex_key#6);
                               lock(trace_types_lock);
                               lock(&sb->s_type->i_mutex_key#6);
  lock(trace_types_lock);

 *** DEADLOCK ***


I need to take that trace_types_lock here, but it's true that that lock
is held when these mutexes are taken. I'll have to spend a bit more
time figuring out how to solve this :-/

-- Steve
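
The report above is an AB-BA inversion: one path takes the inode mutex and
then trace_types_lock, while another takes them in the opposite order.
Below is a minimal user-space sketch of the kind of ordering check that
flags this. The API (order_acquire, order_release) and the fixed number
of lock classes are hypothetical. It illustrates the idea only and is not
how lockdep is implemented.

```c
#include <stdbool.h>

#define MAX_CLASSES 8

/* before[a][b] is true once lock b was acquired while a was held. */
static bool before[MAX_CLASSES][MAX_CLASSES];

/* Locks currently held by one context, in acquisition order. */
struct held_locks {
	int ids[MAX_CLASSES];
	int count;
};

/*
 * Record that a context acquires lock class 'id' (0 <= id < MAX_CLASSES)
 * while holding the locks in 'held'. Returns false if this ordering
 * inverts one recorded earlier, i.e. a possible deadlock.
 */
bool order_acquire(struct held_locks *held, int id)
{
	bool ok = true;

	for (int i = 0; i < held->count; i++) {
		int h = held->ids[i];

		if (before[id][h])
			ok = false;	/* seen id -> h before, now h -> id */
		before[h][id] = true;
	}
	if (held->count < MAX_CLASSES)
		held->ids[held->count++] = id;
	return ok;
}

/* Drop the most recently acquired lock from the context. */
void order_release(struct held_locks *held)
{
	if (held->count > 0)
		held->count--;
}
```

Replaying the scenario, the CPU1 sequence (trace_types_lock, then the
inode mutex) is recorded first, and the CPU0 sequence (inode mutex, then
trace_types_lock) is then reported as an inversion even though no
deadlock actually occurred in that run.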


* Re: [RFC][PATCH 0/5] tracing: Add new file system tracefs
  2015-01-21 17:19 [RFC][PATCH 0/5] tracing: Add new file system tracefs Steven Rostedt
                   ` (5 preceding siblings ...)
  2015-01-21 17:32 ` [RFC][PATCH 0/5] tracing: Add new file system tracefs Steven Rostedt
@ 2015-01-21 23:00 ` Greg Kroah-Hartman
  2015-01-22  1:47   ` Steven Rostedt
  2015-01-22  4:23   ` Al Viro
  6 siblings, 2 replies; 26+ messages in thread
From: Greg Kroah-Hartman @ 2015-01-21 23:00 UTC (permalink / raw)
  To: Steven Rostedt; +Cc: linux-kernel, Al Viro, Ingo Molnar, Andrew Morton

On Wed, Jan 21, 2015 at 12:19:53PM -0500, Steven Rostedt wrote:
> 
> There has been complaints that tracing is tied too much to debugfs,
> as there are systems that would like to perform tracing, but do
> not mount debugfs for security reasons. That is because any subsystem
> may use debugfs for debugging, and these interfaces are not always
> tested for security.
> 
> Creating a new tracefs that the tracing directory will now be attached
> to allows system admins the ability to access the tracing directory
> without the need to mount debugfs.

Yeah!

Any chance you can use kernfs as your "basis" for this filesystem
instead of having to roll all of your own functions?  I'm slowly working
on moving debugfs to it, and it should save a lot of code there, as well
as fixing some "problems" we have in debugfs file lifetimes when things
are removed from the system.

And given the number of mistakes in this submission, I'll wait for a v2
before really reading the code :)

thanks,

greg k-h


* Re: [RFC][PATCH 0/5] tracing: Add new file system tracefs
  2015-01-21 23:00 ` Greg Kroah-Hartman
@ 2015-01-22  1:47   ` Steven Rostedt
  2015-01-22  3:07     ` Steven Rostedt
  2015-01-22  4:23   ` Al Viro
  1 sibling, 1 reply; 26+ messages in thread
From: Steven Rostedt @ 2015-01-22  1:47 UTC (permalink / raw)
  To: Greg Kroah-Hartman; +Cc: linux-kernel, Al Viro, Ingo Molnar, Andrew Morton

On Thu, 22 Jan 2015 07:00:07 +0800
Greg Kroah-Hartman <gregkh@linuxfoundation.org> wrote:

> Any chance you can use kernfs as your "basis" for this filesystem
> instead of having to roll all of your own functions?  I'm slowly
> working on moving debugfs to it, and it should save a lot of code
> there, as well as fixing some "problems" we have in debugfs file
> lifetimes when things are removed from the system.

Someone else told me about doing this too. I'll take a look at it.

> 
> And given the number of mistakes in this submission, I'll wait for a
> v2 before really reading the code :)

Well, if I switch to using kernfs, I expect we should ignore this
version as well, as I would think it would make this series obsolete.

Also, if that's the case, I guess I could keep all the "tracefs" code
in kernel/trace and not place it in the fs/ directory? Kind of like
what cgroups does.

-- Steve


* Re: [RFC][PATCH 0/5] tracing: Add new file system tracefs
  2015-01-22  1:47   ` Steven Rostedt
@ 2015-01-22  3:07     ` Steven Rostedt
  2015-01-22  3:18       ` Greg Kroah-Hartman
  0 siblings, 1 reply; 26+ messages in thread
From: Steven Rostedt @ 2015-01-22  3:07 UTC (permalink / raw)
  To: Greg Kroah-Hartman; +Cc: linux-kernel, Al Viro, Ingo Molnar, Andrew Morton

On Wed, 21 Jan 2015 20:47:25 -0500
Steven Rostedt <rostedt@goodmis.org> wrote:
> 
> Well, if I switch to using kernfs, I expect we should ignore this
> version as well, as I would think it would make this series obsolete.

Is there any documentation on kernfs? I can't find anything that makes
it any easier than what I already have.

-- Steve


* Re: [RFC][PATCH 0/5] tracing: Add new file system tracefs
  2015-01-22  3:07     ` Steven Rostedt
@ 2015-01-22  3:18       ` Greg Kroah-Hartman
  2015-01-22  3:51         ` Steven Rostedt
  0 siblings, 1 reply; 26+ messages in thread
From: Greg Kroah-Hartman @ 2015-01-22  3:18 UTC (permalink / raw)
  To: Steven Rostedt, tj; +Cc: linux-kernel, Al Viro, Ingo Molnar, Andrew Morton

On Wed, Jan 21, 2015 at 10:07:01PM -0500, Steven Rostedt wrote:
> On Wed, 21 Jan 2015 20:47:25 -0500
> Steven Rostedt <rostedt@goodmis.org> wrote:
> > 
> > Well, if I switch to using kernfs, I expect we should ignore this
> > version as well, as I would think it would make this series obsolete.
> 
> Is there any documentation on kernfs? I can't find anything that makes
> it any easier than what I already have.

Tejun would know best, he wrote it :)

What specifically are you looking for?  I think there's at least two
filesystems using it already, are they not good enough examples?

thanks,

greg k-h


* Re: [RFC][PATCH 0/5] tracing: Add new file system tracefs
  2015-01-22  3:18       ` Greg Kroah-Hartman
@ 2015-01-22  3:51         ` Steven Rostedt
  2015-01-22 12:32           ` Tejun Heo
  0 siblings, 1 reply; 26+ messages in thread
From: Steven Rostedt @ 2015-01-22  3:51 UTC (permalink / raw)
  To: Greg Kroah-Hartman; +Cc: tj, linux-kernel, Al Viro, Ingo Molnar, Andrew Morton

On Thu, 22 Jan 2015 11:18:19 +0800
Greg Kroah-Hartman <gregkh@linuxfoundation.org> wrote:


> Tejun would know best, he wrote it :)

Oh good, as Bugs Bunny would say "where's the doc?" (or was that
"what's up doc"?)

> 
> What specifically are you looking for?  I think there's at least two
> filesystems using it already, are they not good enough examples?
> 

Well, I see the two biggest users are sysfs and cgroups, where I never
understood how sysfs actually works, and cgroups, the filesystem is
very integrated with the usage of cgroups.

There doesn't seem to be any abi where one can relate to the vfs
system.

I'd like to keep the interface like debugfs had for tracefs, because
all of tracing depends on it, and it would require a full rewrite to
convert it to something that doesn't have the vfs type of paradigm, in
which case, tracefs would not be done for another decade.

That is, I need to create the following interface:

  tracefs_create_file()
  tracefs_create_dir()
  tracefs_remove()
  tracefs_remove_recursive()

and that's all I need for the filesystem. There doesn't seem to be any
documentation on kernfs about how to implement this.

Yes, I can study the code, but I was hoping that there was some
kernfs.txt that described how to create a new fs with it. It just saves
time if there was a document than having to read the code and perhaps
use it in a way it wasn't supposed to be used.

-- Steve


* Re: [RFC][PATCH 0/5] tracing: Add new file system tracefs
  2015-01-21 23:00 ` Greg Kroah-Hartman
  2015-01-22  1:47   ` Steven Rostedt
@ 2015-01-22  4:23   ` Al Viro
  2015-01-22  4:35     ` Steven Rostedt
  2015-01-22 12:26     ` Tejun Heo
  1 sibling, 2 replies; 26+ messages in thread
From: Al Viro @ 2015-01-22  4:23 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Steven Rostedt, linux-kernel, Ingo Molnar, Andrew Morton

On Thu, Jan 22, 2015 at 07:00:07AM +0800, Greg Kroah-Hartman wrote:
> On Wed, Jan 21, 2015 at 12:19:53PM -0500, Steven Rostedt wrote:
> > 
> > There has been complaints that tracing is tied too much to debugfs,
> > as there are systems that would like to perform tracing, but do
> > not mount debugfs for security reasons. That is because any subsystem
> > may use debugfs for debugging, and these interfaces are not always
> > tested for security.
> > 
> > Creating a new tracefs that the tracing directory will now be attached
> > to allows system admins the ability to access the tracing directory
> > without the need to mount debugfs.
> 
> Yeah!
> 
> Any chance you can use kernfs as your "basis" for this filesystem
> instead of having to roll all of your own functions?  I'm slowly working
> on moving debugfs to it, and it should save a lot of code there, as well
> as fixing some "problems" we have in debugfs file lifetimes when things
> are removed from the system.

I would recommend against that - kernfs is overburdened by their need to
accommodate cgroup weirdness.  IMO it's not a good model for anything, other
than an anti-hard-drugs poster ("don't shoot that shit, or you might end up
hallucinating _this_").


* Re: [RFC][PATCH 0/5] tracing: Add new file system tracefs
  2015-01-22  4:23   ` Al Viro
@ 2015-01-22  4:35     ` Steven Rostedt
  2015-01-22 12:49       ` Tejun Heo
  2015-01-22 12:26     ` Tejun Heo
  1 sibling, 1 reply; 26+ messages in thread
From: Steven Rostedt @ 2015-01-22  4:35 UTC (permalink / raw)
  To: Al Viro; +Cc: Greg Kroah-Hartman, linux-kernel, Ingo Molnar, Andrew Morton

On Thu, 22 Jan 2015 04:23:30 +0000
Al Viro <viro@ZenIV.linux.org.uk> wrote:

> I would recommend against that - kernfs is overburdened by their need
> to accommodate cgroup weirdness.  IMO it's not a good model for
> anything, other than an anti-hard-drugs poster ("don't shoot that
> shit, or you might end up hallucinating _this_").

OK, I'm not the only one that thought kernfs seemed to go all over the
place. I guess I now know why. It was more of a hook for cgroups. I can
understand why cgroups needed it, as I found that creating files from a
mkdir and removing them with rmdir causes some pain in vfs with
handling of locking. As that's what I'm working on overcoming now.

But I think I solved my issues (testing it now), and hopefully by
tomorrow, I'll have a V2 out.

-- Steve
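
For context, the mkdir/rmdir behaviour discussed here is what user space
sees on the "instances" directory: creating a directory there makes a new
trace buffer along with its control files, and removing it tears the
buffer down. Below is a small sketch from the user side. The helper names
(instance_create, instance_remove) are hypothetical. The base path is a
parameter because the mount point differs between systems (for example
/sys/kernel/debug/tracing/instances when tracing lives under debugfs).

```c
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Build "<base>/<name>" into buf; returns 0 on success, -1 if too long. */
static int instance_path(char *buf, size_t len, const char *base,
			 const char *name)
{
	int n = snprintf(buf, len, "%s/%s", base, name);

	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/* Create a tracing instance: a plain mkdir(2) in the instances directory. */
int instance_create(const char *base, const char *name)
{
	char path[4096];

	if (instance_path(path, sizeof(path), base, name) < 0)
		return -1;
	return mkdir(path, 0750);
}

/*
 * Remove a tracing instance with rmdir(2). On the tracing file system the
 * kernel removes the instance's control files itself; on an ordinary file
 * system the directory must already be empty.
 */
int instance_remove(const char *base, const char *name)
{
	char path[4096];

	if (instance_path(path, sizeof(path), base, name) < 0)
		return -1;
	return rmdir(path);
}
```

Both helpers return -1 with errno set by the underlying system call, so a
second instance_create() with the same name fails with EEXIST.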


* Re: [RFC][PATCH 0/5] tracing: Add new file system tracefs
  2015-01-22  4:23   ` Al Viro
  2015-01-22  4:35     ` Steven Rostedt
@ 2015-01-22 12:26     ` Tejun Heo
  1 sibling, 0 replies; 26+ messages in thread
From: Tejun Heo @ 2015-01-22 12:26 UTC (permalink / raw)
  To: Al Viro
  Cc: Greg Kroah-Hartman, Steven Rostedt, linux-kernel, Ingo Molnar,
	Andrew Morton

Hello, Al.

On Thu, Jan 22, 2015 at 04:23:30AM +0000, Al Viro wrote:
> I would recommend against that - kernfs is overburdened by their need to
> accommodate cgroup weirdness.  IMO it's not a good model for anything, other

That's not true.  The two big items where kernfs is complicated are the
custom revocation implementation and namespace support, both of which
come from its sysfs lineage.  The revocation implementation had an
update to better support cgroup but the new interface is more generic
than before and necessary if it wants to support self-removing files
along with the said revocation support.

One complexity actually added for cgroup is in the mount path because
cgroup needs to be able to create or match superblocks dynamically
during mount time and that doesn't really jive well with how
superblocks are managed in the vfs layer.  This part is ugly and
intentionally left that way.  This really should be cgroup specific (and
we can't shed it at the moment) and shouldn't be used elsewhere.
IIRC, you had some complaints about this and I wonder whether that's
what shaped your opinion, but this is entirely isolated.  Just use
kernfs_mount() and ignore the @new_fs_created param.

Overall, kernfs is basically the actual filesystem part of sysfs.
None of the fundamentals changed while separating kernfs out of
sysfs.  You may not like kernfs, but then wouldn't improving it be a
far better strategy over the long haul?  I really don't think we need
yet another pseudo vfs based filesystem in the kernel, and unless tracefs
is doing something crazy, kernfs should be able to serve as a common
foundation.

Thanks.

-- 
tejun


* Re: [RFC][PATCH 0/5] tracing: Add new file system tracefs
  2015-01-22  3:51         ` Steven Rostedt
@ 2015-01-22 12:32           ` Tejun Heo
  2015-01-22 12:33             ` Tejun Heo
  2015-01-22 14:32             ` Steven Rostedt
  0 siblings, 2 replies; 26+ messages in thread
From: Tejun Heo @ 2015-01-22 12:32 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Greg Kroah-Hartman, linux-kernel, Al Viro, Ingo Molnar, Andrew Morton

Hey, Steven.

On Wed, Jan 21, 2015 at 10:51:09PM -0500, Steven Rostedt wrote:
> > Tejun would know best, he wrote it :)
> 
> Oh good, as Bugs Bunny would say "where's the doc?" (or was that
> "what's up doc"?)

I didn't write any while extracting it out of sysfs.  Sorry about
that.  I should get to it.

> > What specifically are you looking for?  I think there's at least two
> > filesystems using it already, are they not good enough examples?
> 
> Well, I see the two biggest users are sysfs and cgroups, where I never
> understood how sysfs actually works, and cgroups, the filesystem is
> very integrated with the usage of cgroups.

Yes, cgroup's usage is rather complicated.  I wouldn't suggest it as a
good example.  sysfs's usage is a bit complicated too in part because
of the namespace support.

> There doesn't seem to be any abi where one can relate to the vfs
> system.
> 
> I'd like to keep the interface like debugfs had for tracefs, because
> all of tracing depends on it, and it would require a full rewrite to
> convert it to something that doesn't have the vfs type of paradigm, in
> which case, tracefs would not be done for another decade.
> 
> That is, I need to create the following interface:
> 
>   tracefs_create_file()

kernfs_create_file()

>   tracefs_create_dir()

kernfs_create_dir()

>   tracefs_remove()
>   tracefs_remove_recursive()

kernfs_remove[_by_name]() - recursive by default

> and that's all I need for the filesystem. There doesn't seem to be any
> documentation on kernfs about how to implement this.
> 
> Yes, I can study the code, but I was hoping that there was some
> kernfs.txt that described how to create a new fs with it. It just saves
> time if there was a document than having to read the code and perhaps
> use it in a way it wasn't supposed to be used.

Yeah yeah, I hear you.  I'll write one up.

Thanks.

-- 
tejun


* Re: [RFC][PATCH 0/5] tracing: Add new file system tracefs
  2015-01-22 12:32           ` Tejun Heo
@ 2015-01-22 12:33             ` Tejun Heo
  2015-01-22 14:32             ` Steven Rostedt
  1 sibling, 0 replies; 26+ messages in thread
From: Tejun Heo @ 2015-01-22 12:33 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Greg Kroah-Hartman, linux-kernel, Al Viro, Ingo Molnar, Andrew Morton

On Thu, Jan 22, 2015 at 07:32:12AM -0500, Tejun Heo wrote:
> > Yes, I can study the code, but I was hoping that there was some
> > kernfs.txt that described how to create a new fs with it. It just saves
> > time if there was a document than having to read the code and perhaps
> > use it in a way it wasn't supposed to be used.
> 
> Yeah yeah, I hear you.  I'll write one up.

But all the public APIs are well documented with docbook comments, so
for the time being, I'd recommend those as the starting point.

Thanks.

-- 
tejun


* Re: [RFC][PATCH 0/5] tracing: Add new file system tracefs
  2015-01-22  4:35     ` Steven Rostedt
@ 2015-01-22 12:49       ` Tejun Heo
  0 siblings, 0 replies; 26+ messages in thread
From: Tejun Heo @ 2015-01-22 12:49 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Al Viro, Greg Kroah-Hartman, linux-kernel, Ingo Molnar, Andrew Morton

On Wed, Jan 21, 2015 at 11:35:53PM -0500, Steven Rostedt wrote:
> On Thu, 22 Jan 2015 04:23:30 +0000
> Al Viro <viro@ZenIV.linux.org.uk> wrote:
> 
> > I would recommend against that - kernfs is overburdened by their need
> > to accommodate cgroup weirdness.  IMO it's not a good model for
> > anything, other than an anti-hard-drugs poster ("don't shoot that
> > shit, or you might end up hallucinating _this_").
> 
> OK, I'm not the only one that thought kernfs seemed to go all over the
> place. I guess I now know why. It was more of a hook for cgroups. I can

Again, not true at all.

> understand why cgroups needed it, as I found that creating files from a
> mkdir and removing them with rmdir causes some pain in vfs with
> handling of locking. As that's what I'm working on overcoming now.
> 
> But I think I solved my issues (testing it now), and hopefully by
> tomorrow, I'll have a V2 out.

I'd strongly recommend just using kernfs.  If you find something wrong
with it, let's fix it, please.

Thanks.

-- 
tejun


* Re: [RFC][PATCH 0/5] tracing: Add new file system tracefs
  2015-01-22 12:32           ` Tejun Heo
  2015-01-22 12:33             ` Tejun Heo
@ 2015-01-22 14:32             ` Steven Rostedt
  2015-01-22 14:55               ` Tejun Heo
  1 sibling, 1 reply; 26+ messages in thread
From: Steven Rostedt @ 2015-01-22 14:32 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Greg Kroah-Hartman, linux-kernel, Al Viro, Ingo Molnar, Andrew Morton

On Thu, 22 Jan 2015 07:32:12 -0500
Tejun Heo <tj@kernel.org> wrote:

> > That is, I need to create the following interface:
> > 
> >   tracefs_create_file()
> 
> kernfs_create_file()
> 
> >   tracefs_create_dir()
> 
> kernfs_create_dir()

The problem is that these do not return dentry. They return kernfs_node.

I see a kernfs_node_from_dentry() call but not the other way around.

Yes, the interface for tracefs is just these four functions, but then
the interaction of the kernfs versions use a completely different API.

Each of the created files expects to attach their own open, read,
write, and release functions. And yes, some even use the seq functions,
and they use it the vfs way. I do not intend on rewriting the users of
the debugfs file system. To use kernfs, it seems that I would need to
do that, and I don't have the time to make such a dramatic change to
the system. It will fall down on my TODO list and I probably won't get
to it for another decade.

I created tracefs with 700 lines of code and two files (inode.c and
tracefs.h), and for the users of tracefs, I just did
s/debugfs/tracefs/. If I can't make that substitution for the users,
that is a show stopper.

I don't see how I can use kernfs without it causing a lot of invasive
changes to the ftrace subsystem.

-- Steve
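
The objection above rests on the shape of the debugfs-style call: each
file is created with its own operations table and private data, and the
caller gets back a handle it later uses for removal. Below is a
user-space mock of that shape. All names (mock_file_ops, mock_node,
mock_create_file, mock_read, mock_remove, string_read) are hypothetical.
It stands in for the pattern only and does not reproduce the tracefs or
kernfs implementation.

```c
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>

/* Per-file callbacks, loosely modelled on the read side of file_operations. */
struct mock_file_ops {
	ssize_t (*read)(void *data, char *buf, size_t len);
};

/* Handle returned to the caller, playing the role of the dentry. */
struct mock_node {
	char name[32];
	void *data;
	const struct mock_file_ops *ops;
};

/* Create a file that carries its own ops and private data. */
struct mock_node *mock_create_file(const char *name, void *data,
				   const struct mock_file_ops *ops)
{
	struct mock_node *node = calloc(1, sizeof(*node));

	if (!node)
		return NULL;
	/* calloc() zeroed the buffer, so the name stays NUL-terminated. */
	strncpy(node->name, name, sizeof(node->name) - 1);
	node->data = data;
	node->ops = ops;
	return node;
}

/* Reads are dispatched to whatever the creator of the file supplied. */
ssize_t mock_read(struct mock_node *node, char *buf, size_t len)
{
	if (!node || !node->ops || !node->ops->read)
		return -1;
	return node->ops->read(node->data, buf, len);
}

/* Removal only needs the handle that creation returned. */
void mock_remove(struct mock_node *node)
{
	free(node);
}

/* Example callback: copy a NUL-terminated string kept in 'data'. */
ssize_t string_read(void *data, char *buf, size_t len)
{
	size_t n = strlen(data);

	if (n > len)
		n = len;
	memcpy(buf, data, n);
	return (ssize_t)n;
}
```

Because every caller supplies its own callbacks, renaming the creation
function leaves them untouched. That is why s/debugfs/tracefs/ is
mechanical, while moving to an API that returns a different handle type
and supplies its own read/write plumbing is not.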


> 
> >   tracefs_remove()
> >   tracefs_remove_recursive()
> 
> kernfs_remove[_by_name]() - recursive by default
> 
> > and that's all I need for the filesystem. There doesn't seem to be any
> > documentation on kernfs about how to implement this.
> > 
> > Yes, I can study the code, but I was hoping that there was some
> > kernfs.txt that described how to create a new fs with it. It just saves
> > time if there was a document than having to read the code and perhaps
> > use it in a way it wasn't supposed to be used.
> 
> Yeah yeah, I hear you.  I'll write one up.
> 
> Thanks.
> 



* Re: [RFC][PATCH 0/5] tracing: Add new file system tracefs
  2015-01-22 14:32             ` Steven Rostedt
@ 2015-01-22 14:55               ` Tejun Heo
  2015-01-22 15:15                 ` Steven Rostedt
  0 siblings, 1 reply; 26+ messages in thread
From: Tejun Heo @ 2015-01-22 14:55 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Greg Kroah-Hartman, linux-kernel, Al Viro, Ingo Molnar, Andrew Morton

Hey, Steven.

On Thu, Jan 22, 2015 at 09:32:49AM -0500, Steven Rostedt wrote:
> The problem is that these do not return dentry. They return kernfs_node.

Yeap, that's what represents a file or directory in kernfs.

> I see a kernfs_node_from_dentry() call but not the other way around.

Because dentries / inodes may or may not exist for a given kernfs
node.  kernfs_nodes are the backing information in a similar way to an
actual filesystem.

> Yes, the interface for tracefs is just these four functions, but then
> the interaction of the kernfs versions use a completely different API.
> 
> Each of the created files expects to attach their own open, read,
> write, and release functions. And yes, some even use the seq functions,
> and they use it the vfs way. I do not intend on rewriting the users of
> the debugfs file system. To use kernfs, it seems that I would need to
> do that, and I don't have the time to make such a dramatic change to
> the system. It will fall down on my TODO list and I probably won't get
> to it for another decade.

kernfs provides two sets of file operations.  One is seq_file based
and the other is direct read/write.  In both cases, bouncing data
between userland and kernel is handled by kernfs.  If you already have
existing read write ops implemented doing custom buffer handling and
direct userland memory access, it'll take some adaptation but for a
lot of cases this would consolidate duplicate code paths.

> I created tracefs with 700 lines of code and two files (inode.c and
> tracefs.h), and for the users of tracefs, I just did
> s/debugfs/tracefs/. If I can't make that substitution for the users,
> that is a show stopper.
> 
> I don't see how I can use kernfs without it causing a lot of invasive
> changes to the ftrace subsystem.

Converting an existing vfs based pseudo fs implementation over to
kernfs isn't trivial.  I mean, if that were trivial, why would kernfs
even exist?  kernfs is a layer which abstracts a large part of pseudo
filesystem which provides extra features like significantly lower
memory footprint with large number of nodes and revocation support in
a way that its users, for the most part, hopefully, only have to worry
about the content to provide to userland.

I frankly have no idea whether tracefs would be a good candidate for
kernfs usage but if you're looking for a mechanical one-to-one
conversion from vfs based implementation, that's not gonna work.

Thanks.

-- 
tejun


* Re: [RFC][PATCH 0/5] tracing: Add new file system tracefs
  2015-01-22 14:55               ` Tejun Heo
@ 2015-01-22 15:15                 ` Steven Rostedt
  2015-01-22 15:24                   ` Tejun Heo
  0 siblings, 1 reply; 26+ messages in thread
From: Steven Rostedt @ 2015-01-22 15:15 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Greg Kroah-Hartman, linux-kernel, Al Viro, Ingo Molnar, Andrew Morton

On Thu, 22 Jan 2015 09:55:47 -0500
Tejun Heo <tj@kernel.org> wrote:

> kernfs provides two sets of file operations.  One is seq_file based
> and the other is direct read/write.  In both cases, bouncing data
> between userland and kernel is handled by kernfs.  If you already have
> existing read write ops implemented doing custom buffer handling and
> direct userland memory access, it'll take some adaptation but for a
> lot of cases this would consolidate duplicate code paths.

Does it also handle splice? That's a key part of the tracing code.

Almost every tracing file is somewhat unique. When things can be
shared, they are, but there's not much generic code that can be shared.

> 
> > I created tracefs with 700 lines of code and two files (inode.c and
> > tracefs.h), and for the users of tracefs, I just did
> > s/debugfs/tracefs/. If I can't make that substitution for the users,
> > that is a show stopper.
> > 
> > I don't see how I can use kernfs without it causing a lot of invasive
> > changes to the ftrace subsystem.
> 
> Converting an existing vfs based pseudo fs implementation over to
> kernfs isn't trivial.  I mean, if that were trivial, why would kernfs
> even exist?  kernfs is a layer which abstracts a large part of pseudo
> filesystem which provides extra features like significantly lower
> memory footprint with large number of nodes and revocation support in
> a way that its users, for the most part, hopefully, only have to worry
> about the content to provide to userland.

Sounds like some of the tracing files could benefit from this. But I'm
not sure kernfs has all the necessary features I need.

> 
> I frankly have no idea whether tracefs would be a good candidate for
> kernfs usage but if you're looking for a mechanical one-to-one
> conversion from vfs based implementation, that's not gonna work.

OK, thanks. Perhaps if tracing was still new I could have tried to go
with kernfs. But as debugfs was such a simple to use interface, it let
me concentrate more on the complexities of tracing itself instead of
spending time coming up with a complex interface.

If a one-to-one conversion from vfs is not gonna work, I'm going to be
interested in seeing how debugfs will be converted.

Anyway, I think I'm convinced that kernfs is not yet the way to go. I'm
going to continue on with my current path.

Thanks,

-- Steve


* Re: [RFC][PATCH 0/5] tracing: Add new file system tracefs
  2015-01-22 15:15                 ` Steven Rostedt
@ 2015-01-22 15:24                   ` Tejun Heo
  0 siblings, 0 replies; 26+ messages in thread
From: Tejun Heo @ 2015-01-22 15:24 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Greg Kroah-Hartman, linux-kernel, Al Viro, Ingo Molnar, Andrew Morton

On Thu, Jan 22, 2015 at 10:15:30AM -0500, Steven Rostedt wrote:
> > kernfs provides two sets of file operations.  One is seq_file based
> > and the other is direct read/write.  In both cases, bouncing data
> > between userland and kernel is handled by kernfs.  If you already have
> > existing read write ops implemented doing custom buffer handling and
> > direct userland memory access, it'll take some adaptation but for a
> > lot of cases this would consolidate duplicate code paths.
> 
> Does it also handle splice? That's a key part of the tracing code.

It doesn't yet.  We can add it as a part of kernfs_syscall_ops tho
which exists to support these specialized bypass operations.  kernfs
doesn't do much with these.  It just passes over the calls to the
registered callbacks.

Thanks.

-- 
tejun


