linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [for-next-3.11][PATCH 0/5] tracing: Final fixes for the race between open event file and deleting event
@ 2013-08-02  2:49 Steven Rostedt
  2013-08-02  2:49 ` [for-next-3.11][PATCH 1/5] debugfs: debugfs_remove_recursive() must not rely on list_empty(d_subdirs) Steven Rostedt
                   ` (5 more replies)
  0 siblings, 6 replies; 8+ messages in thread
From: Steven Rostedt @ 2013-08-02  2:49 UTC (permalink / raw)
  To: linux-kernel; +Cc: Oleg Nesterov, Masami Hiramatsu, Ingo Molnar, Andrew Morton

Oleg,

Is all the races that we are aware of between accessing event files and
deleting events covered by these patches?  I think I have them all.
Was there any patches that I missed, as there were a lot of threads
and lots of patches sent out, but not all were considered final.

I think I got the main ones that we decided on, but I'm not 100% sure as
my INBOX is overrun by too many activities. It may be a few months before
I go through them all.

I haven't pushed this to my for-next branch as I'm waiting for some feedback
from others in my queue. But I did push it to my ftrace/urgent branch
if you want to look at what will be going there. That branch may rebase
but I wanted to get these patches tested by Fengguang before sending
them to next. And I try not to rebase my for-next branch.

Thanks,

-- Steve


Oleg Nesterov (2):
      debugfs: debugfs_remove_recursive() must not rely on list_empty(d_subdirs)
      tracing: trace_remove_event_call() should fail if call/file is in use

Steven Rostedt (Red Hat) (3):
      tracing: Add comment to describe special break case in probe_remove_event_call()
      tracing/kprobes: Fail to unregister if probe event files are in use
      tracing/uprobes: Fail to unregister if probe event files are in use

----
 fs/debugfs/inode.c           |   69 ++++++++++++++----------------------------
 include/linux/ftrace_event.h |    2 +-
 kernel/trace/trace_events.c  |   41 +++++++++++++++++++++++--
 kernel/trace/trace_kprobe.c  |   21 +++++++++----
 kernel/trace/trace_uprobe.c  |   51 +++++++++++++++++++++++--------
 5 files changed, 115 insertions(+), 69 deletions(-)

^ permalink raw reply	[flat|nested] 8+ messages in thread

* [for-next-3.11][PATCH 1/5] debugfs: debugfs_remove_recursive() must not rely on list_empty(d_subdirs)
  2013-08-02  2:49 [for-next-3.11][PATCH 0/5] tracing: Final fixes for the race between open event file and deleting event Steven Rostedt
@ 2013-08-02  2:49 ` Steven Rostedt
  2013-08-02  2:49 ` [for-next-3.11][PATCH 2/5] tracing: trace_remove_event_call() should fail if call/file is in use Steven Rostedt
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 8+ messages in thread
From: Steven Rostedt @ 2013-08-02  2:49 UTC (permalink / raw)
  To: linux-kernel
  Cc: Oleg Nesterov, Masami Hiramatsu, Ingo Molnar, Andrew Morton,
	Greg Kroah-Hartman

[-- Attachment #1: 0001-debugfs-debugfs_remove_recursive-must-not-rely-on-li.patch --]
[-- Type: text/plain, Size: 4524 bytes --]

From: Oleg Nesterov <oleg@redhat.com>

debugfs_remove_recursive() is wrong,

1. it wrongly assumes that !list_empty(d_subdirs) means that this
   dir should be removed.

   This is not that bad by itself, but:

2. if d_subdirs does not becomes empty after __debugfs_remove()
   it gives up and silently fails, it doesn't even try to remove
   other entries.

   However ->d_subdirs can be non-empty because it still has the
   already deleted !debugfs_positive() entries.

3. simple_release_fs() is called even if __debugfs_remove() fails.

Suppose we have

	dir1/
		dir2/
			file2
		file1

and someone opens dir1/dir2/file2.

Now, debugfs_remove_recursive(dir1/dir2) succeeds, and dir1/dir2 goes
away.

But debugfs_remove_recursive(dir1) silently fails and doesn't remove
this directory. Because it tries to delete (the already deleted)
dir1/dir2/file2 again and then fails due to "Avoid infinite loop"
logic.

Test-case:

	#!/bin/sh

	cd /sys/kernel/debug/tracing
	echo 'p:probe/sigprocmask sigprocmask' >> kprobe_events
	sleep 1000 < events/probe/sigprocmask/id &
	echo -n >| kprobe_events

	[ -d events/probe ] && echo "ERR!! failed to rm probe"

And after that it is not possible to create another probe entry.

With this patch debugfs_remove_recursive() skips !debugfs_positive()
files although this is not strictly needed. The most important change
is that it does not try to make ->d_subdirs empty, it simply scans
the whole list(s) recursively and removes as much as possible.

Link: http://lkml.kernel.org/r/20130726151256.GC19472@redhat.com

Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
---
 fs/debugfs/inode.c |   69 +++++++++++++++++-----------------------------------
 1 file changed, 22 insertions(+), 47 deletions(-)

diff --git a/fs/debugfs/inode.c b/fs/debugfs/inode.c
index 4888cb3..c7c83ff 100644
--- a/fs/debugfs/inode.c
+++ b/fs/debugfs/inode.c
@@ -533,8 +533,7 @@ EXPORT_SYMBOL_GPL(debugfs_remove);
  */
 void debugfs_remove_recursive(struct dentry *dentry)
 {
-	struct dentry *child;
-	struct dentry *parent;
+	struct dentry *child, *next, *parent;
 
 	if (IS_ERR_OR_NULL(dentry))
 		return;
@@ -544,61 +543,37 @@ void debugfs_remove_recursive(struct dentry *dentry)
 		return;
 
 	parent = dentry;
+ down:
 	mutex_lock(&parent->d_inode->i_mutex);
+	list_for_each_entry_safe(child, next, &parent->d_subdirs, d_u.d_child) {
+		if (!debugfs_positive(child))
+			continue;
 
-	while (1) {
-		/*
-		 * When all dentries under "parent" has been removed,
-		 * walk up the tree until we reach our starting point.
-		 */
-		if (list_empty(&parent->d_subdirs)) {
-			mutex_unlock(&parent->d_inode->i_mutex);
-			if (parent == dentry)
-				break;
-			parent = parent->d_parent;
-			mutex_lock(&parent->d_inode->i_mutex);
-		}
-		child = list_entry(parent->d_subdirs.next, struct dentry,
-				d_u.d_child);
- next_sibling:
-
-		/*
-		 * If "child" isn't empty, walk down the tree and
-		 * remove all its descendants first.
-		 */
+		/* perhaps simple_empty(child) makes more sense */
 		if (!list_empty(&child->d_subdirs)) {
 			mutex_unlock(&parent->d_inode->i_mutex);
 			parent = child;
-			mutex_lock(&parent->d_inode->i_mutex);
-			continue;
+			goto down;
 		}
-		__debugfs_remove(child, parent);
-		if (parent->d_subdirs.next == &child->d_u.d_child) {
-			/*
-			 * Try the next sibling.
-			 */
-			if (child->d_u.d_child.next != &parent->d_subdirs) {
-				child = list_entry(child->d_u.d_child.next,
-						   struct dentry,
-						   d_u.d_child);
-				goto next_sibling;
-			}
-
-			/*
-			 * Avoid infinite loop if we fail to remove
-			 * one dentry.
-			 */
-			mutex_unlock(&parent->d_inode->i_mutex);
-			break;
-		}
-		simple_release_fs(&debugfs_mount, &debugfs_mount_count);
+ up:
+		if (!__debugfs_remove(child, parent))
+			simple_release_fs(&debugfs_mount, &debugfs_mount_count);
 	}
 
-	parent = dentry->d_parent;
+	mutex_unlock(&parent->d_inode->i_mutex);
+	child = parent;
+	parent = parent->d_parent;
 	mutex_lock(&parent->d_inode->i_mutex);
-	__debugfs_remove(dentry, parent);
+
+	if (child != dentry) {
+		next = list_entry(child->d_u.d_child.next, struct dentry,
+					d_u.d_child);
+		goto up;
+	}
+
+	if (!__debugfs_remove(child, parent))
+		simple_release_fs(&debugfs_mount, &debugfs_mount_count);
 	mutex_unlock(&parent->d_inode->i_mutex);
-	simple_release_fs(&debugfs_mount, &debugfs_mount_count);
 }
 EXPORT_SYMBOL_GPL(debugfs_remove_recursive);
 
-- 
1.7.10.4



^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [for-next-3.11][PATCH 2/5] tracing: trace_remove_event_call() should fail if call/file is in use
  2013-08-02  2:49 [for-next-3.11][PATCH 0/5] tracing: Final fixes for the race between open event file and deleting event Steven Rostedt
  2013-08-02  2:49 ` [for-next-3.11][PATCH 1/5] debugfs: debugfs_remove_recursive() must not rely on list_empty(d_subdirs) Steven Rostedt
@ 2013-08-02  2:49 ` Steven Rostedt
  2013-08-02  2:49 ` [for-next-3.11][PATCH 3/5] tracing: Add comment to describe special break case in probe_remove_event_call() Steven Rostedt
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 8+ messages in thread
From: Steven Rostedt @ 2013-08-02  2:49 UTC (permalink / raw)
  To: linux-kernel; +Cc: Oleg Nesterov, Masami Hiramatsu, Ingo Molnar, Andrew Morton

[-- Attachment #1: 0002-tracing-trace_remove_event_call-should-fail-if-call-.patch --]
[-- Type: text/plain, Size: 3333 bytes --]

From: Oleg Nesterov <oleg@redhat.com>

Change trace_remove_event_call(call) to return the error if this
call is active. This is what the callers assume but can't verify
outside of the tracing locks. Both trace_kprobe.c/trace_uprobe.c
need the additional changes, unregister_trace_probe() should abort
if trace_remove_event_call() fails.

The caller is going to free this call/file so we must ensure that
nobody can use them after trace_remove_event_call() succeeds.
debugfs should be fine after the previous changes and event_remove()
does TRACE_REG_UNREGISTER, but still there are 2 reasons why we need
the additional checks:

- There could be a perf_event(s) attached to this tp_event, so the
  patch checks ->perf_refcount.

- TRACE_REG_UNREGISTER can be suppressed by FTRACE_EVENT_FL_SOFT_MODE,
  so we simply check FTRACE_EVENT_FL_ENABLED protected by event_mutex.

Link: http://lkml.kernel.org/r/20130729175033.GB26284@redhat.com

Reviewed-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
---
 include/linux/ftrace_event.h |    2 +-
 kernel/trace/trace_events.c  |   35 +++++++++++++++++++++++++++++++++--
 2 files changed, 34 insertions(+), 3 deletions(-)

diff --git a/include/linux/ftrace_event.h b/include/linux/ftrace_event.h
index 4372658..f98ab06 100644
--- a/include/linux/ftrace_event.h
+++ b/include/linux/ftrace_event.h
@@ -332,7 +332,7 @@ extern int trace_define_field(struct ftrace_event_call *call, const char *type,
 			      const char *name, int offset, int size,
 			      int is_signed, int filter_type);
 extern int trace_add_event_call(struct ftrace_event_call *call);
-extern void trace_remove_event_call(struct ftrace_event_call *call);
+extern int trace_remove_event_call(struct ftrace_event_call *call);
 
 #define is_signed_type(type)	(((type)(-1)) < (type)1)
 
diff --git a/kernel/trace/trace_events.c b/kernel/trace/trace_events.c
index a67c913..ec04836 100644
--- a/kernel/trace/trace_events.c
+++ b/kernel/trace/trace_events.c
@@ -1713,16 +1713,47 @@ static void __trace_remove_event_call(struct ftrace_event_call *call)
 	destroy_preds(call);
 }
 
+static int probe_remove_event_call(struct ftrace_event_call *call)
+{
+	struct trace_array *tr;
+	struct ftrace_event_file *file;
+
+#ifdef CONFIG_PERF_EVENTS
+	if (call->perf_refcount)
+		return -EBUSY;
+#endif
+	do_for_each_event_file(tr, file) {
+		if (file->event_call != call)
+			continue;
+		/*
+		 * We can't rely on ftrace_event_enable_disable(enable => 0)
+		 * we are going to do, FTRACE_EVENT_FL_SOFT_MODE can suppress
+		 * TRACE_REG_UNREGISTER.
+		 */
+		if (file->flags & FTRACE_EVENT_FL_ENABLED)
+			return -EBUSY;
+		break;
+	} while_for_each_event_file();
+
+	__trace_remove_event_call(call);
+
+	return 0;
+}
+
 /* Remove an event_call */
-void trace_remove_event_call(struct ftrace_event_call *call)
+int trace_remove_event_call(struct ftrace_event_call *call)
 {
+	int ret;
+
 	mutex_lock(&trace_types_lock);
 	mutex_lock(&event_mutex);
 	down_write(&trace_event_sem);
-	__trace_remove_event_call(call);
+	ret = probe_remove_event_call(call);
 	up_write(&trace_event_sem);
 	mutex_unlock(&event_mutex);
 	mutex_unlock(&trace_types_lock);
+
+	return ret;
 }
 
 #define for_each_event(event, start, end)			\
-- 
1.7.10.4



^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [for-next-3.11][PATCH 3/5] tracing: Add comment to describe special break case in probe_remove_event_call()
  2013-08-02  2:49 [for-next-3.11][PATCH 0/5] tracing: Final fixes for the race between open event file and deleting event Steven Rostedt
  2013-08-02  2:49 ` [for-next-3.11][PATCH 1/5] debugfs: debugfs_remove_recursive() must not rely on list_empty(d_subdirs) Steven Rostedt
  2013-08-02  2:49 ` [for-next-3.11][PATCH 2/5] tracing: trace_remove_event_call() should fail if call/file is in use Steven Rostedt
@ 2013-08-02  2:49 ` Steven Rostedt
  2013-08-02  2:49 ` [for-next-3.11][PATCH 4/5] tracing/kprobes: Fail to unregister if probe event files are in use Steven Rostedt
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 8+ messages in thread
From: Steven Rostedt @ 2013-08-02  2:49 UTC (permalink / raw)
  To: linux-kernel; +Cc: Oleg Nesterov, Masami Hiramatsu, Ingo Molnar, Andrew Morton

[-- Attachment #1: 0003-tracing-Add-comment-to-describe-special-break-case-i.patch --]
[-- Type: text/plain, Size: 1256 bytes --]

From: "Steven Rostedt (Red Hat)" <rostedt@goodmis.org>

The "break" used in the do_for_each_event_file() is used as an optimization
as the loop is really a double loop. The loop searches all event files
for each trace_array. There's only one matching event file per trace_array
and after we find the event file for the trace_array, the break is used
to jump to the next trace_array and start the search there.

As this is not a standard way of using "break" in C code, it requires
a comment right before the break to let people know what is going on.

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
---
 kernel/trace/trace_events.c |    6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/kernel/trace/trace_events.c b/kernel/trace/trace_events.c
index ec04836..29a7ebc 100644
--- a/kernel/trace/trace_events.c
+++ b/kernel/trace/trace_events.c
@@ -1732,6 +1732,12 @@ static int probe_remove_event_call(struct ftrace_event_call *call)
 		 */
 		if (file->flags & FTRACE_EVENT_FL_ENABLED)
 			return -EBUSY;
+		/*
+		 * The do_for_each_event_file_safe() is
+		 * a double loop. After finding the call for this
+		 * trace_array, we use break to jump to the next
+		 * trace_array.
+		 */
 		break;
 	} while_for_each_event_file();
 
-- 
1.7.10.4



^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [for-next-3.11][PATCH 4/5] tracing/kprobes: Fail to unregister if probe event files are in use
  2013-08-02  2:49 [for-next-3.11][PATCH 0/5] tracing: Final fixes for the race between open event file and deleting event Steven Rostedt
                   ` (2 preceding siblings ...)
  2013-08-02  2:49 ` [for-next-3.11][PATCH 3/5] tracing: Add comment to describe special break case in probe_remove_event_call() Steven Rostedt
@ 2013-08-02  2:49 ` Steven Rostedt
  2013-08-02  2:49 ` [for-next-3.11][PATCH 5/5] tracing/uprobes: " Steven Rostedt
  2013-08-02 13:45 ` [for-next-3.11][PATCH 0/5] tracing: Final fixes for the race between open event file and deleting event Oleg Nesterov
  5 siblings, 0 replies; 8+ messages in thread
From: Steven Rostedt @ 2013-08-02  2:49 UTC (permalink / raw)
  To: linux-kernel; +Cc: Oleg Nesterov, Masami Hiramatsu, Ingo Molnar, Andrew Morton

[-- Attachment #1: 0004-tracing-kprobes-Fail-to-unregister-if-probe-event-fi.patch --]
[-- Type: text/plain, Size: 5729 bytes --]

From: "Steven Rostedt (Red Hat)" <rostedt@goodmis.org>

When a probe is being removed, it cleans up the event files that correspond
to the probe. But there is a race between writing to one of these files
and deleting the probe. This is especially true for the "enable" file.

	CPU 0				CPU 1
	-----				-----

				  fd = open("enable",O_WRONLY);

  probes_open()
  release_all_trace_probes()
  unregister_trace_probe()
  if (trace_probe_is_enabled(tp))
	return -EBUSY

				   write(fd, "1", 1)
				   __ftrace_set_clr_event()
				   call->class->reg()
				    (kprobe_register)
				     enable_trace_probe(tp)

  __unregister_trace_probe(tp);
  list_del(&tp->list)
  unregister_probe_event(tp) <-- fails!
  free_trace_probe(tp)

				   write(fd, "0", 1)
				   __ftrace_set_clr_event()
				   call->class->unreg
				    (kprobe_register)
				    disable_trace_probe(tp) <-- BOOM!

A test program was written that used two threads to simulate the
above scenario adding a nanosleep() interval to change the timings
and after several thousand runs, it was able to trigger this bug
and crash:

BUG: unable to handle kernel paging request at 00000005000000f9
IP: [<ffffffff810dee70>] probes_open+0x3b/0xa7
PGD 7808a067 PUD 0
Oops: 0000 [#1] PREEMPT SMP
Dumping ftrace buffer:
---------------------------------
Modules linked in: ipt_MASQUERADE sunrpc ip6t_REJECT nf_conntrack_ipv6
CPU: 1 PID: 2070 Comm: test-kprobe-rem Not tainted 3.11.0-rc3-test+ #47
Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./To be filled by O.E.M., BIOS SDBLI944.86P 05/08/2007
task: ffff880077756440 ti: ffff880076e52000 task.ti: ffff880076e52000
RIP: 0010:[<ffffffff810dee70>]  [<ffffffff810dee70>] probes_open+0x3b/0xa7
RSP: 0018:ffff880076e53c38  EFLAGS: 00010203
RAX: 0000000500000001 RBX: ffff88007844f440 RCX: 0000000000000003
RDX: 0000000000000003 RSI: 0000000000000003 RDI: ffff880076e52000
RBP: ffff880076e53c58 R08: ffff880076e53bd8 R09: 0000000000000000
R10: ffff880077756440 R11: 0000000000000006 R12: ffffffff810dee35
R13: ffff880079250418 R14: 0000000000000000 R15: ffff88007844f450
FS:  00007f87a276f700(0000) GS:ffff88007d480000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 00000005000000f9 CR3: 0000000077262000 CR4: 00000000000007e0
Stack:
 ffff880076e53c58 ffffffff81219ea0 ffff88007844f440 ffffffff810dee35
 ffff880076e53ca8 ffffffff81130f78 ffff8800772986c0 ffff8800796f93a0
 ffffffff81d1b5d8 ffff880076e53e04 0000000000000000 ffff88007844f440
Call Trace:
 [<ffffffff81219ea0>] ? security_file_open+0x2c/0x30
 [<ffffffff810dee35>] ? unregister_trace_probe+0x4b/0x4b
 [<ffffffff81130f78>] do_dentry_open+0x162/0x226
 [<ffffffff81131186>] finish_open+0x46/0x54
 [<ffffffff8113f30b>] do_last+0x7f6/0x996
 [<ffffffff8113cc6f>] ? inode_permission+0x42/0x44
 [<ffffffff8113f6dd>] path_openat+0x232/0x496
 [<ffffffff8113fc30>] do_filp_open+0x3a/0x8a
 [<ffffffff8114ab32>] ? __alloc_fd+0x168/0x17a
 [<ffffffff81131f4e>] do_sys_open+0x70/0x102
 [<ffffffff8108f06e>] ? trace_hardirqs_on_caller+0x160/0x197
 [<ffffffff81131ffe>] SyS_open+0x1e/0x20
 [<ffffffff81522742>] system_call_fastpath+0x16/0x1b
Code: e5 41 54 53 48 89 f3 48 83 ec 10 48 23 56 78 48 39 c2 75 6c 31 f6 48 c7
RIP  [<ffffffff810dee70>] probes_open+0x3b/0xa7
 RSP <ffff880076e53c38>
CR2: 00000005000000f9
---[ end trace 35f17d68fc569897 ]---

The unregister_trace_probe() must be done first, and if it fails it must
fail the removal of the kprobe.

Several changes have already been made by Oleg Nesterov and Masami Hiramatsu
to allow moving the unregister_probe_event() before the removal of
the probe and exit the function if it fails. This prevents the tp
structure from being used after it is freed.

Link: http://lkml.kernel.org/r/20130704034038.819592356@goodmis.org

Acked-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
---
 kernel/trace/trace_kprobe.c |   21 +++++++++++++++------
 1 file changed, 15 insertions(+), 6 deletions(-)

diff --git a/kernel/trace/trace_kprobe.c b/kernel/trace/trace_kprobe.c
index 3811487..243f683 100644
--- a/kernel/trace/trace_kprobe.c
+++ b/kernel/trace/trace_kprobe.c
@@ -95,7 +95,7 @@ static __kprobes bool trace_probe_is_on_module(struct trace_probe *tp)
 }
 
 static int register_probe_event(struct trace_probe *tp);
-static void unregister_probe_event(struct trace_probe *tp);
+static int unregister_probe_event(struct trace_probe *tp);
 
 static DEFINE_MUTEX(probe_lock);
 static LIST_HEAD(probe_list);
@@ -351,9 +351,12 @@ static int unregister_trace_probe(struct trace_probe *tp)
 	if (trace_probe_is_enabled(tp))
 		return -EBUSY;
 
+	/* Will fail if probe is being used by ftrace or perf */
+	if (unregister_probe_event(tp))
+		return -EBUSY;
+
 	__unregister_trace_probe(tp);
 	list_del(&tp->list);
-	unregister_probe_event(tp);
 
 	return 0;
 }
@@ -632,7 +635,9 @@ static int release_all_trace_probes(void)
 	/* TODO: Use batch unregistration */
 	while (!list_empty(&probe_list)) {
 		tp = list_entry(probe_list.next, struct trace_probe, list);
-		unregister_trace_probe(tp);
+		ret = unregister_trace_probe(tp);
+		if (ret)
+			goto end;
 		free_trace_probe(tp);
 	}
 
@@ -1247,11 +1252,15 @@ static int register_probe_event(struct trace_probe *tp)
 	return ret;
 }
 
-static void unregister_probe_event(struct trace_probe *tp)
+static int unregister_probe_event(struct trace_probe *tp)
 {
+	int ret;
+
 	/* tp->event is unregistered in trace_remove_event_call() */
-	trace_remove_event_call(&tp->call);
-	kfree(tp->call.print_fmt);
+	ret = trace_remove_event_call(&tp->call);
+	if (!ret)
+		kfree(tp->call.print_fmt);
+	return ret;
 }
 
 /* Make a debugfs interface for controlling probe points */
-- 
1.7.10.4



^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [for-next-3.11][PATCH 5/5] tracing/uprobes: Fail to unregister if probe event files are in use
  2013-08-02  2:49 [for-next-3.11][PATCH 0/5] tracing: Final fixes for the race between open event file and deleting event Steven Rostedt
                   ` (3 preceding siblings ...)
  2013-08-02  2:49 ` [for-next-3.11][PATCH 4/5] tracing/kprobes: Fail to unregister if probe event files are in use Steven Rostedt
@ 2013-08-02  2:49 ` Steven Rostedt
  2013-08-02 13:45 ` [for-next-3.11][PATCH 0/5] tracing: Final fixes for the race between open event file and deleting event Oleg Nesterov
  5 siblings, 0 replies; 8+ messages in thread
From: Steven Rostedt @ 2013-08-02  2:49 UTC (permalink / raw)
  To: linux-kernel; +Cc: Oleg Nesterov, Masami Hiramatsu, Ingo Molnar, Andrew Morton

[-- Attachment #1: 0005-tracing-uprobes-Fail-to-unregister-if-probe-event-fi.patch --]
[-- Type: text/plain, Size: 4520 bytes --]

From: "Steven Rostedt (Red Hat)" <rostedt@goodmis.org>

Uprobes suffer the same problem that kprobes have. There's a race between
writing to the "enable" file and removing the probe. The probe checks for
it being in use and if it is not, goes about deleting the probe and the
event that represents it. But the problem with that is, after it checks
if it is in use it can be enabled, and the deletion of the event (access
to the probe) will fail, as it is in use. But the uprobe will still be
deleted. This is a problem as the event can reference the uprobe that
was deleted.

The fix is to remove the event first, and check to make sure the event
removal succeeds. Then it is safe to remove the probe.

When the event exists, either ftrace or perf can enable the probe and
prevent the event from being removed.

Link: http://lkml.kernel.org/r/20130704034038.991525256@goodmis.org

Acked-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
---
 kernel/trace/trace_uprobe.c |   51 ++++++++++++++++++++++++++++++++-----------
 1 file changed, 38 insertions(+), 13 deletions(-)

diff --git a/kernel/trace/trace_uprobe.c b/kernel/trace/trace_uprobe.c
index a23d2d7..272261b 100644
--- a/kernel/trace/trace_uprobe.c
+++ b/kernel/trace/trace_uprobe.c
@@ -70,7 +70,7 @@ struct trace_uprobe {
 	(sizeof(struct probe_arg) * (n)))
 
 static int register_uprobe_event(struct trace_uprobe *tu);
-static void unregister_uprobe_event(struct trace_uprobe *tu);
+static int unregister_uprobe_event(struct trace_uprobe *tu);
 
 static DEFINE_MUTEX(uprobe_lock);
 static LIST_HEAD(uprobe_list);
@@ -164,11 +164,17 @@ static struct trace_uprobe *find_probe_event(const char *event, const char *grou
 }
 
 /* Unregister a trace_uprobe and probe_event: call with locking uprobe_lock */
-static void unregister_trace_uprobe(struct trace_uprobe *tu)
+static int unregister_trace_uprobe(struct trace_uprobe *tu)
 {
+	int ret;
+
+	ret = unregister_uprobe_event(tu);
+	if (ret)
+		return ret;
+
 	list_del(&tu->list);
-	unregister_uprobe_event(tu);
 	free_trace_uprobe(tu);
+	return 0;
 }
 
 /* Register a trace_uprobe and probe_event */
@@ -181,9 +187,12 @@ static int register_trace_uprobe(struct trace_uprobe *tu)
 
 	/* register as an event */
 	old_tp = find_probe_event(tu->call.name, tu->call.class->system);
-	if (old_tp)
+	if (old_tp) {
 		/* delete old event */
-		unregister_trace_uprobe(old_tp);
+		ret = unregister_trace_uprobe(old_tp);
+		if (ret)
+			goto end;
+	}
 
 	ret = register_uprobe_event(tu);
 	if (ret) {
@@ -256,6 +265,8 @@ static int create_trace_uprobe(int argc, char **argv)
 		group = UPROBE_EVENT_SYSTEM;
 
 	if (is_delete) {
+		int ret;
+
 		if (!event) {
 			pr_info("Delete command needs an event name.\n");
 			return -EINVAL;
@@ -269,9 +280,9 @@ static int create_trace_uprobe(int argc, char **argv)
 			return -ENOENT;
 		}
 		/* delete an event */
-		unregister_trace_uprobe(tu);
+		ret = unregister_trace_uprobe(tu);
 		mutex_unlock(&uprobe_lock);
-		return 0;
+		return ret;
 	}
 
 	if (argc < 2) {
@@ -408,16 +419,20 @@ fail_address_parse:
 	return ret;
 }
 
-static void cleanup_all_probes(void)
+static int cleanup_all_probes(void)
 {
 	struct trace_uprobe *tu;
+	int ret = 0;
 
 	mutex_lock(&uprobe_lock);
 	while (!list_empty(&uprobe_list)) {
 		tu = list_entry(uprobe_list.next, struct trace_uprobe, list);
-		unregister_trace_uprobe(tu);
+		ret = unregister_trace_uprobe(tu);
+		if (ret)
+			break;
 	}
 	mutex_unlock(&uprobe_lock);
+	return ret;
 }
 
 /* Probes listing interfaces */
@@ -462,8 +477,13 @@ static const struct seq_operations probes_seq_op = {
 
 static int probes_open(struct inode *inode, struct file *file)
 {
-	if ((file->f_mode & FMODE_WRITE) && (file->f_flags & O_TRUNC))
-		cleanup_all_probes();
+	int ret;
+
+	if ((file->f_mode & FMODE_WRITE) && (file->f_flags & O_TRUNC)) {
+		ret = cleanup_all_probes();
+		if (ret)
+			return ret;
+	}
 
 	return seq_open(file, &probes_seq_op);
 }
@@ -968,12 +988,17 @@ static int register_uprobe_event(struct trace_uprobe *tu)
 	return ret;
 }
 
-static void unregister_uprobe_event(struct trace_uprobe *tu)
+static int unregister_uprobe_event(struct trace_uprobe *tu)
 {
+	int ret;
+
 	/* tu->event is unregistered in trace_remove_event_call() */
-	trace_remove_event_call(&tu->call);
+	ret = trace_remove_event_call(&tu->call);
+	if (ret)
+		return ret;
 	kfree(tu->call.print_fmt);
 	tu->call.print_fmt = NULL;
+	return 0;
 }
 
 /* Make a trace interface for controling probe points */
-- 
1.7.10.4



^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [for-next-3.11][PATCH 0/5] tracing: Final fixes for the race between open event file and deleting event
  2013-08-02  2:49 [for-next-3.11][PATCH 0/5] tracing: Final fixes for the race between open event file and deleting event Steven Rostedt
                   ` (4 preceding siblings ...)
  2013-08-02  2:49 ` [for-next-3.11][PATCH 5/5] tracing/uprobes: " Steven Rostedt
@ 2013-08-02 13:45 ` Oleg Nesterov
  2013-08-02 14:04   ` Steven Rostedt
  5 siblings, 1 reply; 8+ messages in thread
From: Oleg Nesterov @ 2013-08-02 13:45 UTC (permalink / raw)
  To: Steven Rostedt; +Cc: linux-kernel, Masami Hiramatsu, Ingo Molnar, Andrew Morton

On 08/01, Steven Rostedt wrote:
>
> Is all the races that we are aware of between accessing event files and
> deleting events covered by these patches?  I think I have them all.

Yes, I believe this covers all problems we discussed. I am not aware
of any other problem in this area.

Oleg.


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [for-next-3.11][PATCH 0/5] tracing: Final fixes for the race between open event file and deleting event
  2013-08-02 13:45 ` [for-next-3.11][PATCH 0/5] tracing: Final fixes for the race between open event file and deleting event Oleg Nesterov
@ 2013-08-02 14:04   ` Steven Rostedt
  0 siblings, 0 replies; 8+ messages in thread
From: Steven Rostedt @ 2013-08-02 14:04 UTC (permalink / raw)
  To: Oleg Nesterov; +Cc: linux-kernel, Masami Hiramatsu, Ingo Molnar, Andrew Morton

On Fri, 2013-08-02 at 15:45 +0200, Oleg Nesterov wrote:
> On 08/01, Steven Rostedt wrote:
> >
> > Is all the races that we are aware of between accessing event files and
> > deleting events covered by these patches?  I think I have them all.
> 
> Yes, I believe this covers all problems we discussed. I am not aware
> of any other problem in this area.

Great! I just pushed this to for-next. I'll let it sit there for over
the weekend, and if there's no more issues, I'll push this off to Linus.

Hopefully this will be the last of the changes needed for 3.11 :-)

I'll then need to look at all the changes made and put together a set of
changes to send to stable. I didn't mark any of these with stable as
there's lots of dependencies I need to figure out and some of these
fixes need to go back quite a ways.

Thanks for all your help (you and Masami :-)

-- Steve



^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2013-08-02 14:04 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2013-08-02  2:49 [for-next-3.11][PATCH 0/5] tracing: Final fixes for the race between open event file and deleting event Steven Rostedt
2013-08-02  2:49 ` [for-next-3.11][PATCH 1/5] debugfs: debugfs_remove_recursive() must not rely on list_empty(d_subdirs) Steven Rostedt
2013-08-02  2:49 ` [for-next-3.11][PATCH 2/5] tracing: trace_remove_event_call() should fail if call/file is in use Steven Rostedt
2013-08-02  2:49 ` [for-next-3.11][PATCH 3/5] tracing: Add comment to describe special break case in probe_remove_event_call() Steven Rostedt
2013-08-02  2:49 ` [for-next-3.11][PATCH 4/5] tracing/kprobes: Fail to unregister if probe event files are in use Steven Rostedt
2013-08-02  2:49 ` [for-next-3.11][PATCH 5/5] tracing/uprobes: " Steven Rostedt
2013-08-02 13:45 ` [for-next-3.11][PATCH 0/5] tracing: Final fixes for the race between open event file and deleting event Oleg Nesterov
2013-08-02 14:04   ` Steven Rostedt

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).