All of lore.kernel.org
 help / color / mirror / Atom feed
* [RFC][PATCH] debugfs: Fix corrupted loop in debugfs_remove_recursive
@ 2014-06-09 18:06 Steven Rostedt
  2014-06-10 13:33 ` Steven Rostedt
  2014-06-12 10:43 ` Al Viro
  0 siblings, 2 replies; 9+ messages in thread
From: Steven Rostedt @ 2014-06-09 18:06 UTC (permalink / raw)
  To: Al Viro, Greg Kroah-Hartman; +Cc: LKML, linux-fsdevel, Andrew Morton

[ I'm currently running my tests on it now, and so far, after a few
 hours it has yet to blow up. I'll run it for 24 hours which it never
 succeeded in the past. ]

The tracing code has a way to make directories within the debugfs file
system as well as deleting them using mkdir/rmdir in the instance
directory. This is very limited in functionality, such as there is
no renames, and the parent directory "instance" can not be modified.
The tracing code creates the instance directory from the debugfs code
and then replaces the dentry->d_inode->i_op with its own to allow
for mkdir/rmdir to work.

When these are called, the d_entry and inode locks need to be released
to call the instance creation and deletion code. That code has its own
accounting and locking to serialize everything to prevent multiple
users from causing harm. As the parent "instance" directory can not
be modified this simplifies things.

I created a stress test that creates several threads that randomly
creates and deletes directories thousands of times a second. The code
stood up to this test and I submitted it a while ago.

Recently I added a new test that adds readers to the mix. While the
instance directories were being added and deleted, readers would read
from these directories and even enable tracing within them. This test
was able to trigger a bug:


 general protection fault: 0000 [#1] PREEMPT SMP 
 Modules linked in: ...
 CPU: 3 PID: 17789 Comm: rmdir Tainted: G        W     3.15.0-rc2-test+ #41
 Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./To be filled by O.E.M., BIOS SDBLI944.86P 05/08/2007
 task: ffff88003786ca60 ti: ffff880077018000 task.ti: ffff880077018000
 RIP: 0010:[<ffffffff811ed5eb>]  [<ffffffff811ed5eb>] debugfs_remove_recursive+0x1bd/0x367
 RSP: 0018:ffff880077019df8  EFLAGS: 00010246
 RAX: 0000000000000002 RBX: ffff88006f0fe490 RCX: 0000000000000000
 RDX: dead000000100058 RSI: 0000000000000246 RDI: ffff88003786d454
 RBP: ffff88006f0fe640 R08: 0000000000000628 R09: 0000000000000000
 R10: 0000000000000628 R11: ffff8800795110a0 R12: ffff88006f0fe640
 R13: ffff88006f0fe640 R14: ffffffff81817d0b R15: ffffffff818188b7
 FS:  00007ff13ae24700(0000) GS:ffff88007d580000(0000) knlGS:0000000000000000
 CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
 CR2: 0000003054ec7be0 CR3: 0000000076d51000 CR4: 00000000000007e0
 Stack:
  ffff88007a41ebe0 dead000000100058 00000000fffffffe ffff88006f0fe640
  0000000000000000 ffff88006f0fe678 ffff88007a41ebe0 ffff88003793a000
  00000000fffffffe ffffffff810bde82 ffff88006f0fe640 ffff88007a41eb28
 Call Trace:
  [<ffffffff810bde82>] ? instance_rmdir+0x15b/0x1de
  [<ffffffff81132e2d>] ? vfs_rmdir+0x80/0xd3
  [<ffffffff81132f51>] ? do_rmdir+0xd1/0x139
  [<ffffffff8124ad9e>] ? trace_hardirqs_on_thunk+0x3a/0x3c
  [<ffffffff814fea62>] ? system_call_fastpath+0x16/0x1b
 Code: fe ff ff 48 8d 75 30 48 89 df e8 c9 fd ff ff 85 c0 75 13 48 c7 c6 b8 cc d2 81 48 c7 c7 b0 cc d2 81 e8 8c 7a f5 ff 48 8b 54 24 08 <48> 8b 82 a8 00 00 00 48 89 d3 48 2d a8 00 00 00 48 89 44 24 08 
 RIP  [<ffffffff811ed5eb>] debugfs_remove_recursive+0x1bd/0x367
  RSP <ffff880077019df8>

It took a while, but every time it triggered, it was always in the
same place:

	list_for_each_entry_safe(child, next, &parent->d_subdirs, d_u.d_child) {

Where the child->d_u.d_child seemed to be corrupted.  I added lots of
trace_printk()s to see what was wrong, and sure enough, it was always
the child's d_u.d_child field. I looked around to see what touches
it and noticed that in __dentry_kill() which calls dentry_free():

static void dentry_free(struct dentry *dentry)
{
	/* if dentry was never visible to RCU, immediate free is OK */
	if (!(dentry->d_flags & DCACHE_RCUACCESS))
		__d_free(&dentry->d_u.d_rcu);
	else
		call_rcu(&dentry->d_u.d_rcu, __d_free);
}

I also noticed that __dentry_kill() unlinks the child->d_u.child
under the parent->d_lock spin_lock.

Looking back at the loop in debugfs_remove_recursive() it never takes the
parent->d_lock to do the list walk. Adding more tracing, I was able to
prove this was the issue:

 ftrace-t-15385   1.... 246662024us : dentry_kill <ffffffff81138b91>: free ffff88006d573600
    rmdir-15409   2.... 246662024us : debugfs_remove_recursive <ffffffff811ec7e5>: child=ffff88006d573600 next=dead000000100058

The dentry_kill freed ffff88006d573600 just as the remove recursive was walking
it.

In order to fix this, the list walk needs to be modified a bit to take
the parent->d_lock. The safe version is no longer necessary, as every
time we remove a child, the parent->d_lock must be released and the
list walk must start over. Each time a child is removed, even though it
may still be on the list, it should be skipped by the first check
in the loop:

		if (!debugfs_positive(child))
			continue;

Cc: stable@vger.kernel.org
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>

---
 fs/debugfs/inode.c |   33 ++++++++++++++++++++++++++-------
 1 file changed, 26 insertions(+), 7 deletions(-)

Index: linux-trace.git/fs/debugfs/inode.c
===================================================================
--- linux-trace.git.orig/fs/debugfs/inode.c	2014-06-09 10:34:13.010051075 -0400
+++ linux-trace.git/fs/debugfs/inode.c	2014-06-09 10:34:36.108811592 -0400
@@ -534,7 +534,7 @@
  */
 void debugfs_remove_recursive(struct dentry *dentry)
 {
-	struct dentry *child, *next, *parent;
+	struct dentry *child, *parent;
 
 	if (IS_ERR_OR_NULL(dentry))
 		return;
@@ -546,30 +546,49 @@
 	parent = dentry;
  down:
 	mutex_lock(&parent->d_inode->i_mutex);
-	list_for_each_entry_safe(child, next, &parent->d_subdirs, d_u.d_child) {
+ loop:
+	/*
+	 * The parent->d_subdirs is protected by the d_lock. Outside that
+	 * lock, the child can be unlinked and set to be freed which can
+	 * use the d_u.d_child as the rcu head and corrupt this list.
+	 */
+	spin_lock(&parent->d_lock);
+	list_for_each_entry(child, &parent->d_subdirs, d_u.d_child) {
 		if (!debugfs_positive(child))
 			continue;
 
 		/* perhaps simple_empty(child) makes more sense */
 		if (!list_empty(&child->d_subdirs)) {
+			spin_unlock(&parent->d_lock);
 			mutex_unlock(&parent->d_inode->i_mutex);
 			parent = child;
 			goto down;
 		}
- up:
+
+		spin_unlock(&parent->d_lock);
+
 		if (!__debugfs_remove(child, parent))
 			simple_release_fs(&debugfs_mount, &debugfs_mount_count);
+
+		/*
+		 * The parent->d_lock protects agaist child from unlinking
+		 * from d_subdirs. When releasing the parent->d_lock we can
+		 * no longer trust that the next pointer is valid.
+		 * Restart the loop. We'll skip this one with the
+		 * debugfs_positive() check.
+		 */
+		goto loop;
 	}
+	spin_unlock(&parent->d_lock);
 
 	mutex_unlock(&parent->d_inode->i_mutex);
 	child = parent;
 	parent = parent->d_parent;
 	mutex_lock(&parent->d_inode->i_mutex);
 
-	if (child != dentry) {
-		next = list_next_entry(child, d_u.d_child);
-		goto up;
-	}
+	if (child != dentry)
+		/* go up */
+		goto loop;
 
 	if (!__debugfs_remove(child, parent))
 		simple_release_fs(&debugfs_mount, &debugfs_mount_count);

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [RFC][PATCH] debugfs: Fix corrupted loop in debugfs_remove_recursive
  2014-06-09 18:06 [RFC][PATCH] debugfs: Fix corrupted loop in debugfs_remove_recursive Steven Rostedt
@ 2014-06-10 13:33 ` Steven Rostedt
  2014-06-10 20:23   ` Greg Kroah-Hartman
  2014-06-12 10:43 ` Al Viro
  1 sibling, 1 reply; 9+ messages in thread
From: Steven Rostedt @ 2014-06-10 13:33 UTC (permalink / raw)
  To: Al Viro, Greg Kroah-Hartman; +Cc: LKML, linux-fsdevel, Andrew Morton

[-- Attachment #1: Type: text/plain, Size: 655 bytes --]

On Mon, 9 Jun 2014 14:06:07 -0400
Steven Rostedt <rostedt@goodmis.org> wrote:

> [ I'm currently running my tests on it now, and so far, after a few
>  hours it has yet to blow up. I'll run it for 24 hours which it never
>  succeeded in the past. ]

I ran with this patch on three boxes all night, one for 24 hours. No
problems. I stopped the tests, removed the patch, ran the test on the
same boxes and they all crashed within an hour.

This patch definitely fixes the bug.

The test I ran was:

while :; do
 ./ftrace-test-mkinstances || break
 ./ftrace-test-mkinstances-2 || break
done

Note, the tests expect debugfs to be mounted at /debug.

-- Steve

[-- Attachment #2: ftrace-test-mkinstances --]
[-- Type: application/octet-stream, Size: 1092 bytes --]

#!/bin/bash

if [ ! -d /debug/tracing/instances ]; then
	echo "No instances directory"
	exit 0
fi

cd /debug/tracing/instances

mkdir x
rmdir x
result=$?

if [ $result -ne 0 ]; then
	echo "instance rmdir not supported, skipping this test"
	exit 0
fi

instance_slam() {
	while :; do
		mkdir x
		mkdir y
		mkdir z
		rmdir x
		rmdir y
		rmdir z
	done 2>/dev/null
}

instance_slam &
x=`jobs -l`
p1=`echo $x | cut -d' ' -f2`
echo $p1

instance_slam &
x=`jobs -l | tail -1`
p2=`echo $x | cut -d' ' -f2`
echo $p2

instance_slam &
x=`jobs -l | tail -1`
p3=`echo $x | cut -d' ' -f2`
echo $p3

instance_slam &
x=`jobs -l | tail -1`
p4=`echo $x | cut -d' ' -f2`
echo $p4

instance_slam &
x=`jobs -l | tail -1`
p5=`echo $x | cut -d' ' -f2`
echo $p5

for i in `seq 10`; do
	ls
	sleep 1
done
kill -1 $p1 
kill -1 $p2
kill -1 $p3
kill -1 $p4
kill -1 $p5

echo "Wait for processes to finish"
wait $p1 $p2 $p3 $p4 $p5
echo "all processes finished, wait for cleanup"
sleep 2

mkdir x y z
ls x y z
rmdir x y z
for d in x y z; do
	if [ -d $d ]; then
		echo $d still exists
		exit -1
	fi
done
echo SUCCESS
exit 0

[-- Attachment #3: ftrace-test-mkinstances-2 --]
[-- Type: application/octet-stream, Size: 767 bytes --]

#!/bin/bash

if [ ! -d /debug/tracing/instances ]; then
	echo "No instances directory"
	exit 0
fi

cd /debug/tracing/instances

instance_slam() {
	while :; do
		mkdir foo &> /dev/null
		rmdir foo &> /dev/null
	done
}

instance_read() {
	while :; do
		cat foo/trace &> /dev/null
	done
}

instance_set() {
	while :; do
		echo 1 > foo/events/sched/sched_switch
	done 2> /dev/null
}

instance_slam &
x=`jobs -l`
p1=`echo $x | cut -d' ' -f2`
echo $p1

instance_set &
x=`jobs -l | tail -1`
p2=`echo $x | cut -d' ' -f2`
echo $p2

sleep 10

kill -1 $p1 
kill -1 $p2

echo "Wait for processes to finish"
wait $p1 $p2
echo "all processes finished, wait for cleanup"
sleep 2

mkdir foo
ls foo
rmdir foo
if [ -d foo ]; then
	echo foo still exists
	exit -1
fi
echo SUCCESS
exit 0

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [RFC][PATCH] debugfs: Fix corrupted loop in debugfs_remove_recursive
  2014-06-10 20:23   ` Greg Kroah-Hartman
@ 2014-06-10 20:22     ` Steven Rostedt
  2014-06-10 20:33       ` Greg Kroah-Hartman
  2014-06-30 15:53     ` Steven Rostedt
  1 sibling, 1 reply; 9+ messages in thread
From: Steven Rostedt @ 2014-06-10 20:22 UTC (permalink / raw)
  To: Greg Kroah-Hartman; +Cc: Al Viro, LKML, linux-fsdevel, Andrew Morton

On Tue, 10 Jun 2014 13:23:02 -0700
Greg Kroah-Hartman <gregkh@linuxfoundation.org> wrote:


> > Note, the tests expect debugfs to be mounted at /debug.
> 
> Well, those are some broken tests :)

And that's part of the reason why I haven't pushed them to be included
in mainline's tools/testing/ directory.

-- Steve

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [RFC][PATCH] debugfs: Fix corrupted loop in debugfs_remove_recursive
  2014-06-10 13:33 ` Steven Rostedt
@ 2014-06-10 20:23   ` Greg Kroah-Hartman
  2014-06-10 20:22     ` Steven Rostedt
  2014-06-30 15:53     ` Steven Rostedt
  0 siblings, 2 replies; 9+ messages in thread
From: Greg Kroah-Hartman @ 2014-06-10 20:23 UTC (permalink / raw)
  To: Steven Rostedt; +Cc: Al Viro, LKML, linux-fsdevel, Andrew Morton

On Tue, Jun 10, 2014 at 09:33:57AM -0400, Steven Rostedt wrote:
> On Mon, 9 Jun 2014 14:06:07 -0400
> Steven Rostedt <rostedt@goodmis.org> wrote:
> 
> > [ I'm currently running my tests on it now, and so far, after a few
> >  hours it has yet to blow up. I'll run it for 24 hours which it never
> >  succeeded in the past. ]
> 
> I ran with this patch on three boxes all night, one for 24 hours. No
> problems. I stopped the tests, removed the patch, ran the test on the
> same boxes and they all crashed within an hour.
> 
> This patch definitely fixes the bug.
> 
> The test I ran was:
> 
> while :; do
>  ./ftrace-test-mkinstances || break
>  ./ftrace-test-mkinstances-2 || break
> done
> 
> Note, the tests expect debugfs to be mounted at /debug.

Well, those are some broken tests :)

Anyway, thanks for the patch, I'll queue it up after 3.16-rc1 is out.

greg k-h

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [RFC][PATCH] debugfs: Fix corrupted loop in debugfs_remove_recursive
  2014-06-10 20:22     ` Steven Rostedt
@ 2014-06-10 20:33       ` Greg Kroah-Hartman
  0 siblings, 0 replies; 9+ messages in thread
From: Greg Kroah-Hartman @ 2014-06-10 20:33 UTC (permalink / raw)
  To: Steven Rostedt; +Cc: Al Viro, LKML, linux-fsdevel, Andrew Morton

On Tue, Jun 10, 2014 at 04:22:56PM -0400, Steven Rostedt wrote:
> On Tue, 10 Jun 2014 13:23:02 -0700
> Greg Kroah-Hartman <gregkh@linuxfoundation.org> wrote:
> 
> 
> > > Note, the tests expect debugfs to be mounted at /debug.
> > 
> > Well, those are some broken tests :)
> 
> And that's part of the reason why I haven't pushed them to be included
> in mainline's tools/testing/ directory.

It should be simple to change that...

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [RFC][PATCH] debugfs: Fix corrupted loop in debugfs_remove_recursive
  2014-06-09 18:06 [RFC][PATCH] debugfs: Fix corrupted loop in debugfs_remove_recursive Steven Rostedt
  2014-06-10 13:33 ` Steven Rostedt
@ 2014-06-12 10:43 ` Al Viro
  2014-06-12 16:08   ` Greg Kroah-Hartman
  1 sibling, 1 reply; 9+ messages in thread
From: Al Viro @ 2014-06-12 10:43 UTC (permalink / raw)
  To: Steven Rostedt; +Cc: Greg Kroah-Hartman, LKML, linux-fsdevel, Andrew Morton

On Mon, Jun 09, 2014 at 02:06:07PM -0400, Steven Rostedt wrote:

> When these are called, the d_entry and inode locks need to be released
> to call the instance creation and deletion code. That code has its own
> accounting and locking to serialize everything to prevent multiple
> users from causing harm. As the parent "instance" directory can not
> be modified this simplifies things.

Yecchhh...   Looking at debugfs:

static inline int debugfs_positive(struct dentry *dentry)
{
        return dentry->d_inode && !d_unhashed(dentry);
}

...
        if (debugfs_positive(dentry)) {
                if (dentry->d_inode) {
What the hell?

        parent = dentry->d_parent;
        if (!parent || !parent->d_inode)
                return;
Huh?  First of all, ->d_parent is *never* NULL.  Moreover, it can't be a
negative dentry.

What's more, if debugfs_rename() is ever used for cross-directory renames,
this tree-walker is buggered - it'll happily walk up "back" into a directory
it has never visited...

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [RFC][PATCH] debugfs: Fix corrupted loop in debugfs_remove_recursive
  2014-06-12 10:43 ` Al Viro
@ 2014-06-12 16:08   ` Greg Kroah-Hartman
  0 siblings, 0 replies; 9+ messages in thread
From: Greg Kroah-Hartman @ 2014-06-12 16:08 UTC (permalink / raw)
  To: Al Viro; +Cc: Steven Rostedt, LKML, linux-fsdevel, Andrew Morton

On Thu, Jun 12, 2014 at 11:43:14AM +0100, Al Viro wrote:
> On Mon, Jun 09, 2014 at 02:06:07PM -0400, Steven Rostedt wrote:
> 
> > When these are called, the d_entry and inode locks need to be released
> > to call the instance creation and deletion code. That code has its own
> > accounting and locking to serialize everything to prevent multiple
> > users from causing harm. As the parent "instance" directory can not
> > be modified this simplifies things.
> 
> Yecchhh...   Looking at debugfs:
> 
> static inline int debugfs_positive(struct dentry *dentry)
> {
>         return dentry->d_inode && !d_unhashed(dentry);
> }
> 
> ...
>         if (debugfs_positive(dentry)) {
>                 if (dentry->d_inode) {
> What the hell?
> 
>         parent = dentry->d_parent;
>         if (!parent || !parent->d_inode)
>                 return;
> Huh?  First of all, ->d_parent is *never* NULL.  Moreover, it can't be a
> negative dentry.
> 
> What's more, if debugfs_rename() is ever used for cross-directory renames,
> this tree-walker is buggered - it'll happily walk up "back" into a directory
> it has never visited...

All of that code has been there since before 2.11, I really don't
remember how I came up with it at all, sorry.

I'm working on converting debugfs to use kernfs, so all of the debugfs
mess and problems should go away soon.

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [RFC][PATCH] debugfs: Fix corrupted loop in debugfs_remove_recursive
  2014-06-10 20:23   ` Greg Kroah-Hartman
  2014-06-10 20:22     ` Steven Rostedt
@ 2014-06-30 15:53     ` Steven Rostedt
  2014-06-30 16:26       ` Greg Kroah-Hartman
  1 sibling, 1 reply; 9+ messages in thread
From: Steven Rostedt @ 2014-06-30 15:53 UTC (permalink / raw)
  To: Greg Kroah-Hartman; +Cc: Al Viro, LKML, linux-fsdevel, Andrew Morton

On Tue, 10 Jun 2014 13:23:02 -0700
Greg Kroah-Hartman <gregkh@linuxfoundation.org> wrote:


> Anyway, thanks for the patch, I'll queue it up after 3.16-rc1 is out.

Did this patch get dropped?

-- Steve


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [RFC][PATCH] debugfs: Fix corrupted loop in debugfs_remove_recursive
  2014-06-30 15:53     ` Steven Rostedt
@ 2014-06-30 16:26       ` Greg Kroah-Hartman
  0 siblings, 0 replies; 9+ messages in thread
From: Greg Kroah-Hartman @ 2014-06-30 16:26 UTC (permalink / raw)
  To: Steven Rostedt; +Cc: Al Viro, LKML, linux-fsdevel, Andrew Morton

On Mon, Jun 30, 2014 at 11:53:31AM -0400, Steven Rostedt wrote:
> On Tue, 10 Jun 2014 13:23:02 -0700
> Greg Kroah-Hartman <gregkh@linuxfoundation.org> wrote:
> 
> 
> > Anyway, thanks for the patch, I'll queue it up after 3.16-rc1 is out.
> 
> Did this patch get dropped?

No, it's still in my "to-apply" queue, just been on the road for a
while.  I'll get to it after the 5th...

greg k-h

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2014-06-30 16:26 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-06-09 18:06 [RFC][PATCH] debugfs: Fix corrupted loop in debugfs_remove_recursive Steven Rostedt
2014-06-10 13:33 ` Steven Rostedt
2014-06-10 20:23   ` Greg Kroah-Hartman
2014-06-10 20:22     ` Steven Rostedt
2014-06-10 20:33       ` Greg Kroah-Hartman
2014-06-30 15:53     ` Steven Rostedt
2014-06-30 16:26       ` Greg Kroah-Hartman
2014-06-12 10:43 ` Al Viro
2014-06-12 16:08   ` Greg Kroah-Hartman

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.