All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] audit: fix mark refcounting
@ 2011-11-07 14:59 Miklos Szeredi
  2011-11-15 14:12 ` Miklos Szeredi
  2011-12-14 14:35 ` [PATCH resend] " Miklos Szeredi
  0 siblings, 2 replies; 16+ messages in thread
From: Miklos Szeredi @ 2011-11-07 14:59 UTC (permalink / raw)
  To: Eric Paris, Al Viro; +Cc: linux-fsdevel, linux-kernel

From: Miklos Szeredi <mszeredi@suse.cz>

Removing the parent of a watched file results in "kernel BUG at
fs/notify/mark.c:139".

To reproduce

  add "-w /tmp/audit/dir/watched_file" to audit.rules
  rm -rf /tmp/audit/dir

This is caused by fsnotify_destroy_mark() being called without an
extra reference taken by the caller.

Reported by Francesco Cosoleto here:

  https://bugzilla.novell.com/show_bug.cgi?id=689860

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
CC: Al Viro <viro@zeniv.linux.org.uk>
CC: Eric Paris <eparis@redhat.com>
CC: stable@vger.kernel.org
---
 kernel/audit_watch.c |    2 ++
 1 file changed, 2 insertions(+)

Index: linux.git/kernel/audit_watch.c
===================================================================
--- linux.git.orig/kernel/audit_watch.c	2011-09-13 16:08:20.000000000 +0200
+++ linux.git/kernel/audit_watch.c	2011-11-07 15:19:07.000000000 +0100
@@ -349,7 +349,9 @@ static void audit_remove_parent_watches(
 	}
 	mutex_unlock(&audit_filter_mutex);
 
+	audit_get_parent(parent);
 	fsnotify_destroy_mark(&parent->mark);
+	audit_put_parent(parent);
 }
 
 /* Get path information necessary for adding watches. */

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] audit: fix mark refcounting
  2011-11-07 14:59 [PATCH] audit: fix mark refcounting Miklos Szeredi
@ 2011-11-15 14:12 ` Miklos Szeredi
  2011-11-15 14:31   ` Eric Paris
  2011-12-14 14:35 ` [PATCH resend] " Miklos Szeredi
  1 sibling, 1 reply; 16+ messages in thread
From: Miklos Szeredi @ 2011-11-15 14:12 UTC (permalink / raw)
  To: Eric Paris, Al Viro; +Cc: linux-fsdevel, linux-kernel

Ping?

On Mon, Nov 7, 2011 at 3:59 PM, Miklos Szeredi <miklos@szeredi.hu> wrote:
> From: Miklos Szeredi <mszeredi@suse.cz>
>
> Removing the parent of a watched file results in "kernel BUG at
> fs/notify/mark.c:139".
>
> To reproduce
>
>  add "-w /tmp/audit/dir/watched_file" to audit.rules
>  rm -rf /tmp/audit/dir
>
> This is caused by fsnotify_destroy_mark() being called without an
> extra reference taken by the caller.
>
> Reported by Francesco Cosoleto here:
>
>  https://bugzilla.novell.com/show_bug.cgi?id=689860
>
> Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
> CC: Al Viro <viro@zeniv.linux.org.uk>
> CC: Eric Paris <eparis@redhat.com>
> CC: stable@vger.kernel.org
> ---
>  kernel/audit_watch.c |    2 ++
>  1 file changed, 2 insertions(+)
>
> Index: linux.git/kernel/audit_watch.c
> ===================================================================
> --- linux.git.orig/kernel/audit_watch.c 2011-09-13 16:08:20.000000000 +0200
> +++ linux.git/kernel/audit_watch.c      2011-11-07 15:19:07.000000000 +0100
> @@ -349,7 +349,9 @@ static void audit_remove_parent_watches(
>        }
>        mutex_unlock(&audit_filter_mutex);
>
> +       audit_get_parent(parent);
>        fsnotify_destroy_mark(&parent->mark);
> +       audit_put_parent(parent);
>  }
>
>  /* Get path information necessary for adding watches. */
>

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] audit: fix mark refcounting
  2011-11-15 14:12 ` Miklos Szeredi
@ 2011-11-15 14:31   ` Eric Paris
  0 siblings, 0 replies; 16+ messages in thread
From: Eric Paris @ 2011-11-15 14:31 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: Al Viro, linux-fsdevel, linux-kernel

I picked it up in my audit tree for next go round, although I haven't
published the tree as I'm still doing devel...  Should be in next in a
week or two.

-Eric

On Tue, 2011-11-15 at 15:12 +0100, Miklos Szeredi wrote:
> Ping?
> 
> On Mon, Nov 7, 2011 at 3:59 PM, Miklos Szeredi <miklos@szeredi.hu> wrote:
> > From: Miklos Szeredi <mszeredi@suse.cz>
> >
> > Removing the parent of a watched file results in "kernel BUG at
> > fs/notify/mark.c:139".
> >
> > To reproduce
> >
> >  add "-w /tmp/audit/dir/watched_file" to audit.rules
> >  rm -rf /tmp/audit/dir
> >
> > This is caused by fsnotify_destroy_mark() being called without an
> > extra reference taken by the caller.
> >
> > Reported by Francesco Cosoleto here:
> >
> >  https://bugzilla.novell.com/show_bug.cgi?id=689860
> >
> > Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
> > CC: Al Viro <viro@zeniv.linux.org.uk>
> > CC: Eric Paris <eparis@redhat.com>
> > CC: stable@vger.kernel.org
> > ---
> >  kernel/audit_watch.c |    2 ++
> >  1 file changed, 2 insertions(+)
> >
> > Index: linux.git/kernel/audit_watch.c
> > ===================================================================
> > --- linux.git.orig/kernel/audit_watch.c 2011-09-13 16:08:20.000000000 +0200
> > +++ linux.git/kernel/audit_watch.c      2011-11-07 15:19:07.000000000 +0100
> > @@ -349,7 +349,9 @@ static void audit_remove_parent_watches(
> >        }
> >        mutex_unlock(&audit_filter_mutex);
> >
> > +       audit_get_parent(parent);
> >        fsnotify_destroy_mark(&parent->mark);
> > +       audit_put_parent(parent);
> >  }
> >
> >  /* Get path information necessary for adding watches. */
> >



^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCH resend] audit: fix mark refcounting
  2011-11-07 14:59 [PATCH] audit: fix mark refcounting Miklos Szeredi
  2011-11-15 14:12 ` Miklos Szeredi
@ 2011-12-14 14:35 ` Miklos Szeredi
  2011-12-15  2:15   ` Linus Torvalds
  1 sibling, 1 reply; 16+ messages in thread
From: Miklos Szeredi @ 2011-12-14 14:35 UTC (permalink / raw)
  To: Eric Paris, Al Viro; +Cc: akpm, torvalds, linux-fsdevel, linux-kernel

I think this should go into 3.2 (and stable).

Please apply.

Thanks,
Miklos
----

From: Miklos Szeredi <mszeredi@suse.cz>
Subject: audit: fix mark refcounting

Removing the parent of a watched file results in "kernel BUG at
fs/notify/mark.c:139".

To reproduce

  add "-w /tmp/audit/dir/watched_file" to audit.rules
  rm -rf /tmp/audit/dir

This is caused by fsnotify_destroy_mark() being called without an
extra reference taken by the caller.

Reported by Francesco Cosoleto here:

  https://bugzilla.novell.com/show_bug.cgi?id=689860

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
CC: Al Viro <viro@zeniv.linux.org.uk>
CC: Eric Paris <eparis@redhat.com>
CC: stable@vger.kernel.org
---
 kernel/audit_watch.c |    2 ++
 1 file changed, 2 insertions(+)

Index: linux.git/kernel/audit_watch.c
===================================================================
--- linux.git.orig/kernel/audit_watch.c	2011-12-06 16:39:48.000000000 +0100
+++ linux.git/kernel/audit_watch.c	2011-12-13 14:05:37.000000000 +0100
@@ -349,7 +349,9 @@ static void audit_remove_parent_watches(
 	}
 	mutex_unlock(&audit_filter_mutex);
 
+	audit_get_parent(parent);
 	fsnotify_destroy_mark(&parent->mark);
+	audit_put_parent(parent);
 }
 
 /* Get path information necessary for adding watches. */

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH resend] audit: fix mark refcounting
  2011-12-14 14:35 ` [PATCH resend] " Miklos Szeredi
@ 2011-12-15  2:15   ` Linus Torvalds
  2011-12-15  8:40     ` Al Viro
  2011-12-15  8:49     ` Miklos Szeredi
  0 siblings, 2 replies; 16+ messages in thread
From: Linus Torvalds @ 2011-12-15  2:15 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: Eric Paris, Al Viro, akpm, linux-fsdevel, linux-kernel

Looks reasonable, but why doesn't all callers have that "put_mark()" thing?

And if/when all callers *do* have that put_mark() thing, maybe we
should make destroy_mark() just do it?

In particular, a quick grep shows that there are destroy_mark users still in:

 - fs/notify/fanotify/fanotify_user.c

 - fs/notify/dnotify/dnotify.c (2 of them)

 - fs/notify/inotify/inotify_fsnotify.c


that don't do "put_mark()" after the destroy. Why is it ok there?

I don't know the code, it's probably fine, but I'd like to know why
the audit code needs it but not the other sites (but my grep didn't
look at context)

And I'd like Al to say something. Al?

                      Linus

On Wed, Dec 14, 2011 at 6:35 AM, Miklos Szeredi <miklos@szeredi.hu> wrote:
> I think this should go into 3.2 (and stable).
>
> Please apply.
>
> Thanks,
> Miklos
> ----
>
> From: Miklos Szeredi <mszeredi@suse.cz>
> Subject: audit: fix mark refcounting
>
> Removing the parent of a watched file results in "kernel BUG at
> fs/notify/mark.c:139".
>
> To reproduce
>
>  add "-w /tmp/audit/dir/watched_file" to audit.rules
>  rm -rf /tmp/audit/dir
>
> This is caused by fsnotify_destroy_mark() being called without an
> extra reference taken by the caller.
>
> Reported by Francesco Cosoleto here:
>
>  https://bugzilla.novell.com/show_bug.cgi?id=689860
>
> Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
> CC: Al Viro <viro@zeniv.linux.org.uk>
> CC: Eric Paris <eparis@redhat.com>
> CC: stable@vger.kernel.org
> ---
>  kernel/audit_watch.c |    2 ++
>  1 file changed, 2 insertions(+)
>
> Index: linux.git/kernel/audit_watch.c
> ===================================================================
> --- linux.git.orig/kernel/audit_watch.c 2011-12-06 16:39:48.000000000 +0100
> +++ linux.git/kernel/audit_watch.c      2011-12-13 14:05:37.000000000 +0100
> @@ -349,7 +349,9 @@ static void audit_remove_parent_watches(
>        }
>        mutex_unlock(&audit_filter_mutex);
>
> +       audit_get_parent(parent);
>        fsnotify_destroy_mark(&parent->mark);
> +       audit_put_parent(parent);
>  }
>
>  /* Get path information necessary for adding watches. */

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH resend] audit: fix mark refcounting
  2011-12-15  2:15   ` Linus Torvalds
@ 2011-12-15  8:40     ` Al Viro
  2011-12-15  8:56       ` Miklos Szeredi
  2011-12-15 16:48       ` Linus Torvalds
  2011-12-15  8:49     ` Miklos Szeredi
  1 sibling, 2 replies; 16+ messages in thread
From: Al Viro @ 2011-12-15  8:40 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Miklos Szeredi, Eric Paris, akpm, linux-fsdevel, linux-kernel

On Wed, Dec 14, 2011 at 06:15:11PM -0800, Linus Torvalds wrote:
> Looks reasonable, but why doesn't all callers have that "put_mark()" thing?
> 
> And if/when all callers *do* have that put_mark() thing, maybe we
> should make destroy_mark() just do it?
> 
> In particular, a quick grep shows that there are destroy_mark users still in:
> 
>  - fs/notify/fanotify/fanotify_user.c
> 
>  - fs/notify/dnotify/dnotify.c (2 of them)
> 
>  - fs/notify/inotify/inotify_fsnotify.c
> 
> 
> that don't do "put_mark()" after the destroy. Why is it ok there?

Um?  dnotify has fsnotify_put_mark() called in both cases...
 
> I don't know the code, it's probably fine, but I'd like to know why
> the audit code needs it but not the other sites (but my grep didn't
> look at context)
> 
> And I'd like Al to say something. Al?

I don't like it; it's called from ->handle_event() and parent->mark is
exactly the inode_mark argument of that method.  It ought to be pinned
by caller.  In other places we *do* need get/put around that destroy
and we generally do that.

AFAICS, we have the following picture:
	* that place in audit_watch - argument of ->handle_event()
	* audit_remove_watch_rule() - pinned explicitly
	* audit_tree - pinned explicitly
	* dnotify (both callrs) - pinned explicitly, and refcount is
dropped unconditionally while fsnotify_destroy_mark() is *not*; IOW,
that's a very strong argument against folding put_mark into destroy_mark.
	* inotify_fsnotify.c - argument of ->handle_event()
	* fanotify_user.c - pinned and dropped by caller; again, refcount
manipulations are unconditional while destroy_mark is not; it's even
worse than in dnotify case, since here we do put_mark is a place where
we don't *know* whether destroy_mark has happened.  We could move the
calls of fsnotify_put_mark() into the fanotify_mark_remove_from_mask()
(where destroy_mark is done), but then we'll get something like
	if (!(oldmask & ~mask))
                fsnotify_destroy_mark(fsn_mark);
	else
                fsnotify_put_mark(fsn_mark);
in there, which is IMO ugly.

Guys, does anybody have a real demonstration of the breakage cured by
pinning the mark down in audit_watch.c ->handle_event()?  Or is that
a pure theory?

Is ->handle_event() argument held by caller?  Eric?  If that's the case,
we don't need to do anything with audit_watch.c instance; otherwise,
both that one and inotify_handle_event() are in trouble...

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH resend] audit: fix mark refcounting
  2011-12-15  2:15   ` Linus Torvalds
  2011-12-15  8:40     ` Al Viro
@ 2011-12-15  8:49     ` Miklos Szeredi
  1 sibling, 0 replies; 16+ messages in thread
From: Miklos Szeredi @ 2011-12-15  8:49 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Eric Paris, Al Viro, akpm, linux-fsdevel, linux-kernel

Linus Torvalds <torvalds@linux-foundation.org> writes:

> Looks reasonable, but why doesn't all callers have that "put_mark()" thing?
>
> And if/when all callers *do* have that put_mark() thing, maybe we
> should make destroy_mark() just do it?
>
> In particular, a quick grep shows that there are destroy_mark users still in:
>
>  - fs/notify/fanotify/fanotify_user.c
>
>  - fs/notify/dnotify/dnotify.c (2 of them)

These do in fact do "put_mark()" after the "destroy_mark()".

>
>  - fs/notify/inotify/inotify_fsnotify.c

This one, I think, is broken.

Thanks,
Miklos

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH resend] audit: fix mark refcounting
  2011-12-15  8:40     ` Al Viro
@ 2011-12-15  8:56       ` Miklos Szeredi
  2011-12-15  9:01         ` Al Viro
  2011-12-15  9:03         ` Miklos Szeredi
  2011-12-15 16:48       ` Linus Torvalds
  1 sibling, 2 replies; 16+ messages in thread
From: Miklos Szeredi @ 2011-12-15  8:56 UTC (permalink / raw)
  To: Al Viro; +Cc: Linus Torvalds, Eric Paris, akpm, linux-fsdevel, linux-kernel

Al Viro <viro@ZenIV.linux.org.uk> writes:

> On Wed, Dec 14, 2011 at 06:15:11PM -0800, Linus Torvalds wrote:
>> Looks reasonable, but why doesn't all callers have that "put_mark()" thing?
>> 
>> And if/when all callers *do* have that put_mark() thing, maybe we
>> should make destroy_mark() just do it?
>> 
>> In particular, a quick grep shows that there are destroy_mark users still in:
>> 
>>  - fs/notify/fanotify/fanotify_user.c
>> 
>>  - fs/notify/dnotify/dnotify.c (2 of them)
>> 
>>  - fs/notify/inotify/inotify_fsnotify.c
>> 
>> 
>> that don't do "put_mark()" after the destroy. Why is it ok there?
>
> Um?  dnotify has fsnotify_put_mark() called in both cases...
>  
>> I don't know the code, it's probably fine, but I'd like to know why
>> the audit code needs it but not the other sites (but my grep didn't
>> look at context)
>> 
>> And I'd like Al to say something. Al?
>
> I don't like it; it's called from ->handle_event() and parent->mark is
> exactly the inode_mark argument of that method.  It ought to be pinned
> by caller.  In other places we *do* need get/put around that destroy
> and we generally do that.
>
> AFAICS, we have the following picture:
> 	* that place in audit_watch - argument of ->handle_event()
> 	* audit_remove_watch_rule() - pinned explicitly
> 	* audit_tree - pinned explicitly
> 	* dnotify (both callrs) - pinned explicitly, and refcount is
> dropped unconditionally while fsnotify_destroy_mark() is *not*; IOW,
> that's a very strong argument against folding put_mark into destroy_mark.
> 	* inotify_fsnotify.c - argument of ->handle_event()
> 	* fanotify_user.c - pinned and dropped by caller; again, refcount
> manipulations are unconditional while destroy_mark is not; it's even
> worse than in dnotify case, since here we do put_mark is a place where
> we don't *know* whether destroy_mark has happened.  We could move the
> calls of fsnotify_put_mark() into the fanotify_mark_remove_from_mask()
> (where destroy_mark is done), but then we'll get something like
> 	if (!(oldmask & ~mask))
>                 fsnotify_destroy_mark(fsn_mark);
> 	else
>                 fsnotify_put_mark(fsn_mark);
> in there, which is IMO ugly.
>
> Guys, does anybody have a real demonstration of the breakage cured by
> pinning the mark down in audit_watch.c ->handle_event()?  Or is that
> a pure theory?

Yes it does fix the BUG.  Test case in patch.

> Is ->handle_event() argument held by caller?

Well, obviously not, otherwise we wouldn't hit the bug.

>   Eric?  If that's the case,
> we don't need to do anything with audit_watch.c instance; otherwise,
> both that one and inotify_handle_event() are in trouble...

Yep.

Thanks,
Miklos

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH resend] audit: fix mark refcounting
  2011-12-15  8:56       ` Miklos Szeredi
@ 2011-12-15  9:01         ` Al Viro
  2011-12-15  9:03         ` Miklos Szeredi
  1 sibling, 0 replies; 16+ messages in thread
From: Al Viro @ 2011-12-15  9:01 UTC (permalink / raw)
  To: Miklos Szeredi
  Cc: Linus Torvalds, Eric Paris, akpm, linux-fsdevel, linux-kernel

On Thu, Dec 15, 2011 at 09:56:26AM +0100, Miklos Szeredi wrote:

> > Guys, does anybody have a real demonstration of the breakage cured by
> > pinning the mark down in audit_watch.c ->handle_event()?  Or is that
> > a pure theory?
> 
> Yes it does fix the BUG.  Test case in patch.
> 
> > Is ->handle_event() argument held by caller?
> 
> Well, obviously not, otherwise we wouldn't hit the bug.
> 
> >   Eric?  If that's the case,
> > we don't need to do anything with audit_watch.c instance; otherwise,
> > both that one and inotify_handle_event() are in trouble...
> 
> Yep.

I wonder if the right fix is to do it here and not in caller, though...
OTOH, usually we don't hit destroy at all, so it's probably better to
handle it in the individual instances...

OK, consider the audit_watch.c part ACKed; inotify counterpart needs a similar
patch, AFAICS.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH resend] audit: fix mark refcounting
  2011-12-15  8:56       ` Miklos Szeredi
  2011-12-15  9:01         ` Al Viro
@ 2011-12-15  9:03         ` Miklos Szeredi
  2011-12-15 20:06           ` Lino Sanfilippo
  1 sibling, 1 reply; 16+ messages in thread
From: Miklos Szeredi @ 2011-12-15  9:03 UTC (permalink / raw)
  To: Al Viro; +Cc: Linus Torvalds, Eric Paris, akpm, linux-fsdevel, linux-kernel

Updated patch:

----
From: Miklos Szeredi <mszeredi@suse.cz>
Subject: fsnotify: hold reference to mark around destroy

Removing the parent of a watched file results in "kernel BUG at
fs/notify/mark.c:139".

To reproduce

  add "-w /tmp/audit/dir/watched_file" to audit.rules
  rm -rf /tmp/audit/dir

This is caused by fsnotify_destroy_mark() being called without an
extra reference taken by the caller.

Reported by Francesco Cosoleto here:

  https://bugzilla.novell.com/show_bug.cgi?id=689860

Linus noticed that the call in

 fs/notify/inotify/inotify_fsnotify.c

has also missing reference to mark.

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
CC: Al Viro <viro@zeniv.linux.org.uk>
CC: Eric Paris <eparis@redhat.com>
CC: stable@vger.kernel.org
---
 fs/notify/inotify/inotify_fsnotify.c |    5 ++++-
 kernel/audit_watch.c                 |    2 ++
 2 files changed, 6 insertions(+), 1 deletion(-)

Index: linux.git/kernel/audit_watch.c
===================================================================
--- linux.git.orig/kernel/audit_watch.c	2011-12-14 13:59:47.000000000 +0100
+++ linux.git/kernel/audit_watch.c	2011-12-15 09:57:53.000000000 +0100
@@ -349,7 +349,9 @@ static void audit_remove_parent_watches(
 	}
 	mutex_unlock(&audit_filter_mutex);
 
+	audit_get_parent(parent);
 	fsnotify_destroy_mark(&parent->mark);
+	audit_put_parent(parent);
 }
 
 /* Get path information necessary for adding watches. */
Index: linux.git/fs/notify/inotify/inotify_fsnotify.c
===================================================================
--- linux.git.orig/fs/notify/inotify/inotify_fsnotify.c	2011-09-13 16:08:20.000000000 +0200
+++ linux.git/fs/notify/inotify/inotify_fsnotify.c	2011-12-15 09:59:15.000000000 +0100
@@ -130,8 +130,11 @@ static int inotify_handle_event(struct f
 			ret = PTR_ERR(added_event);
 	}
 
-	if (inode_mark->mask & IN_ONESHOT)
+	if (inode_mark->mask & IN_ONESHOT) {
+		fsnotify_get_mark(inode_mark);
 		fsnotify_destroy_mark(inode_mark);
+		fsnotify_put_mark(inode_mark);
+	}
 
 	return ret;
 }

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH resend] audit: fix mark refcounting
  2011-12-15  8:40     ` Al Viro
  2011-12-15  8:56       ` Miklos Szeredi
@ 2011-12-15 16:48       ` Linus Torvalds
  1 sibling, 0 replies; 16+ messages in thread
From: Linus Torvalds @ 2011-12-15 16:48 UTC (permalink / raw)
  To: Al Viro; +Cc: Miklos Szeredi, Eric Paris, akpm, linux-fsdevel, linux-kernel

On Thu, Dec 15, 2011 at 12:40 AM, Al Viro <viro@zeniv.linux.org.uk> wrote:
>>
>> In particular, a quick grep shows that there are destroy_mark users still in:
>>
>>  - fs/notify/fanotify/fanotify_user.c
>>
>>  - fs/notify/dnotify/dnotify.c (2 of them)
>>
>>  - fs/notify/inotify/inotify_fsnotify.c
>>
>> that don't do "put_mark()" after the destroy. Why is it ok there?
>
> Um?  dnotify has fsnotify_put_mark() called in both cases...

Ok, that didn't show up in my grep, the "put_mark()" was more than
three lines away in the other case. As mentioned, I simply grepped
without looking at much context at all.

> I don't like it; it's called from ->handle_event() and parent->mark is
> exactly the inode_mark argument of that method.  It ought to be pinned
> by caller.  In other places we *do* need get/put around that destroy
> and we generally do that.

Presumably *parent* is pinned by caller, but not ->mark. So when the
parent directory is deleted, the parent data structure stays around,
but mark is cleanred, and you get the oops that was reported. See the
simple two-liner example to trigger it. I didn't test it myself, but
it looks obvious.

                             Linus

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH resend] audit: fix mark refcounting
  2011-12-15  9:03         ` Miklos Szeredi
@ 2011-12-15 20:06           ` Lino Sanfilippo
  2011-12-15 22:28             ` Eric Paris
  2011-12-15 22:55             ` Al Viro
  0 siblings, 2 replies; 16+ messages in thread
From: Lino Sanfilippo @ 2011-12-15 20:06 UTC (permalink / raw)
  To: Miklos Szeredi
  Cc: Al Viro, Linus Torvalds, Eric Paris, akpm, linux-fsdevel, linux-kernel

On Thu, Dec 15, 2011 at 10:03:41AM +0100, Miklos Szeredi wrote:
>  
> +	audit_get_parent(parent);
>  	fsnotify_destroy_mark(&parent->mark);
> +	audit_put_parent(parent);

Hi,

What about taking an extra ref on an inode mark in send_to_group()
before we call handle_event()?
So we dont have to handle the cases in which a mark is destroyed
explicitly...

Lino


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH resend] audit: fix mark refcounting
  2011-12-15 20:06           ` Lino Sanfilippo
@ 2011-12-15 22:28             ` Eric Paris
  2011-12-15 22:34               ` Linus Torvalds
  2011-12-15 22:55             ` Al Viro
  1 sibling, 1 reply; 16+ messages in thread
From: Eric Paris @ 2011-12-15 22:28 UTC (permalink / raw)
  To: Lino Sanfilippo
  Cc: Miklos Szeredi, Al Viro, Linus Torvalds, akpm, linux-fsdevel,
	linux-kernel

On Thu, 2011-12-15 at 21:06 +0100, Lino Sanfilippo wrote:
> On Thu, Dec 15, 2011 at 10:03:41AM +0100, Miklos Szeredi wrote:
> >  
> > +	audit_get_parent(parent);
> >  	fsnotify_destroy_mark(&parent->mark);
> > +	audit_put_parent(parent);
> 
> Hi,
> 
> What about taking an extra ref on an inode mark in send_to_group()
> before we call handle_event()?
> So we dont have to handle the cases in which a mark is destroyed
> explicitly...

Yes, but it puts atomic operations on a much hotter path.  I don't
believe there is actually a bug in the code here, but clearly we're
tripping over a BUG() I put in the code.  These patches shut up the
BUG() with minimal overhead, but don't really solve any problem.

Lino is right, the 'right' place to do this would be to take a reference
in fs/notify/fsnotify.c::fsnotify() when we find a mark under the srcu
lock and drop that reference when we are finished using that mark.
Since that function only ever uses the mark to call the handler() if it
gets destroyed in the handler that is fine.  We'd have a real bug here
if fsnotify() used the mark after the handler call...

How expensive is an atomic_inc()/atomic_dec() combo?  If it's mostly
free we can do it in the right place.

-Eric


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH resend] audit: fix mark refcounting
  2011-12-15 22:28             ` Eric Paris
@ 2011-12-15 22:34               ` Linus Torvalds
  0 siblings, 0 replies; 16+ messages in thread
From: Linus Torvalds @ 2011-12-15 22:34 UTC (permalink / raw)
  To: Eric Paris
  Cc: Lino Sanfilippo, Miklos Szeredi, Al Viro, akpm, linux-fsdevel,
	linux-kernel

On Thu, Dec 15, 2011 at 2:28 PM, Eric Paris <eparis@redhat.com> wrote:
>
> How expensive is an atomic_inc()/atomic_dec() combo?  If it's mostly
> free we can do it in the right place.

It's pretty expensive, but it depends a lot on cache details etc. It
involves a memory barrier on x86 too, and depending on architecture it
might be anything from 150 cycles (P4 - but by now probably nobody
cares) to "a few tens" of cycles (roughly 10-35 on modern x86).

The cache miss itself - if it happens - is not counted in the above cost.

A totally uncontended spinlock is actually cheaper than a pair of
atomic ops. So if it's a hot path it probably should be moved
elsewhere if possible.

                       Linus

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH resend] audit: fix mark refcounting
  2011-12-15 20:06           ` Lino Sanfilippo
  2011-12-15 22:28             ` Eric Paris
@ 2011-12-15 22:55             ` Al Viro
  2012-01-12 16:59               ` Miklos Szeredi
  1 sibling, 1 reply; 16+ messages in thread
From: Al Viro @ 2011-12-15 22:55 UTC (permalink / raw)
  To: Lino Sanfilippo
  Cc: Miklos Szeredi, Linus Torvalds, Eric Paris, akpm, linux-fsdevel,
	linux-kernel

On Thu, Dec 15, 2011 at 09:06:31PM +0100, Lino Sanfilippo wrote:
> On Thu, Dec 15, 2011 at 10:03:41AM +0100, Miklos Szeredi wrote:
> >  
> > +	audit_get_parent(parent);
> >  	fsnotify_destroy_mark(&parent->mark);
> > +	audit_put_parent(parent);
> 
> Hi,
> 
> What about taking an extra ref on an inode mark in send_to_group()
> before we call handle_event()?
> So we dont have to handle the cases in which a mark is destroyed
> explicitly...

The thing is, on most of the method calls we won't need that at all.
And it costs quite a bit, so I'm afraid that this variant is the
way to go.  Yes, it would be nicer to do that in caller, but...

Dunno...  Neither instance actually touches the mark after that
destroy_mark and we have very few of those guys (fortunately).  So
removing this BUG_ON() instead might be the right thing to do.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH resend] audit: fix mark refcounting
  2011-12-15 22:55             ` Al Viro
@ 2012-01-12 16:59               ` Miklos Szeredi
  0 siblings, 0 replies; 16+ messages in thread
From: Miklos Szeredi @ 2012-01-12 16:59 UTC (permalink / raw)
  To: Al Viro
  Cc: Lino Sanfilippo, Linus Torvalds, Eric Paris, akpm, linux-fsdevel,
	linux-kernel

Al Viro <viro@ZenIV.linux.org.uk> writes:

> Dunno...  Neither instance actually touches the mark after that
> destroy_mark and we have very few of those guys (fortunately).  So
> removing this BUG_ON() instead might be the right thing to do.

So here's a patch doing that.  Tested okay.

Please apply.

Thanks,
Miklos


---
From: Miklos Szeredi <mszeredi@suse.cz>
Subject: fsnotify: don't BUG in fsnotify_destroy_mark()

Removing the parent of a watched file results in "kernel BUG at
fs/notify/mark.c:139".

To reproduce

  add "-w /tmp/audit/dir/watched_file" to audit.rules
  rm -rf /tmp/audit/dir

This is caused by fsnotify_destroy_mark() being called without an
extra reference taken by the caller.

Reported by Francesco Cosoleto here:

  https://bugzilla.novell.com/show_bug.cgi?id=689860

Fix by removing the BUG_ON and adding a comment about not accessing mark after
the iput.

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
CC: stable@vger.kernel.org
---
 fs/notify/mark.c |    8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

Index: linux-2.6/fs/notify/mark.c
===================================================================
--- linux-2.6.orig/fs/notify/mark.c	2011-10-26 14:26:23.000000000 +0200
+++ linux-2.6/fs/notify/mark.c	2012-01-12 14:25:40.000000000 +0100
@@ -135,9 +135,6 @@ void fsnotify_destroy_mark(struct fsnoti
 
 	mark->flags &= ~FSNOTIFY_MARK_FLAG_ALIVE;
 
-	/* 1 from caller and 1 for being on i_list/g_list */
-	BUG_ON(atomic_read(&mark->refcnt) < 2);
-
 	spin_lock(&group->mark_lock);
 
 	if (mark->flags & FSNOTIFY_MARK_FLAG_INODE) {
@@ -182,6 +179,11 @@ void fsnotify_destroy_mark(struct fsnoti
 		iput(inode);
 
 	/*
+	 * We don't necessarily have a ref on mark from caller so the above iput
+	 * may have already destroyed it.  Don't touch from now on.
+	 */
+
+	/*
 	 * it's possible that this group tried to destroy itself, but this
 	 * this mark was simultaneously being freed by inode.  If that's the
 	 * case, we finish freeing the group here.

^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2012-01-12 17:00 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2011-11-07 14:59 [PATCH] audit: fix mark refcounting Miklos Szeredi
2011-11-15 14:12 ` Miklos Szeredi
2011-11-15 14:31   ` Eric Paris
2011-12-14 14:35 ` [PATCH resend] " Miklos Szeredi
2011-12-15  2:15   ` Linus Torvalds
2011-12-15  8:40     ` Al Viro
2011-12-15  8:56       ` Miklos Szeredi
2011-12-15  9:01         ` Al Viro
2011-12-15  9:03         ` Miklos Szeredi
2011-12-15 20:06           ` Lino Sanfilippo
2011-12-15 22:28             ` Eric Paris
2011-12-15 22:34               ` Linus Torvalds
2011-12-15 22:55             ` Al Viro
2012-01-12 16:59               ` Miklos Szeredi
2011-12-15 16:48       ` Linus Torvalds
2011-12-15  8:49     ` Miklos Szeredi

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.