All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] vfs: Fix possible NULL pointer dereference in inode_permission()
@ 2014-01-09 21:27 Steven Rostedt
  2014-01-09 21:42 ` Matthew Wilcox
  0 siblings, 1 reply; 23+ messages in thread
From: Steven Rostedt @ 2014-01-09 21:27 UTC (permalink / raw)
  To: LKML, linux-fsdevel
  Cc: Alexander Viro, Stephen Smalley, Christoph Hellwig,
	Linus Torvalds, Eric Paris, Theodore Ts'o, Dave Chinner,
	James Morris, Paul Moore, Andrew Morton, Paul E. McKenney,
	stable


While running stress tests on adding and deleting ftrace instances I
hit this bug:

BUG: unable to handle kernel NULL pointer dereference at 0000000000000020
IP: [<ffffffff812d8bc5>] selinux_inode_permission+0x85/0x160
PGD 63681067 PUD 7ddbe067 PMD 0 
Oops: 0000 [#1] PREEMPT 
CPU: 0 PID: 5634 Comm: ftrace-test-mki Not tainted 3.13.0-rc4-test-00033-gd2a6dde-dirty #20
Hardware name:                  /DG965MQ, BIOS MQ96510J.86A.0372.2006.0605.1717 06/05/2006
task: ffff880078375800 ti: ffff88007ddb0000 task.ti: ffff88007ddb0000
RIP: 0010:[<ffffffff812d8bc5>]  [<ffffffff812d8bc5>] selinux_inode_permission+0x85/0x160
RSP: 0018:ffff88007ddb1c48  EFLAGS: 00010246
RAX: 0000000000000000 RBX: 0000000000800000 RCX: ffff88006dd43840
RDX: 0000000000000001 RSI: 0000000000000081 RDI: ffff88006ee46000
RBP: ffff88007ddb1c88 R08: 0000000000000000 R09: ffff88007ddb1c54
R10: 6e6576652f6f6f66 R11: 0000000000000003 R12: 0000000000000000
R13: 0000000000000081 R14: ffff88006ee46000 R15: 0000000000000000
FS:  00007f217b5b6700(0000) GS:ffffffff81e21000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033^M
CR2: 0000000000000020 CR3: 000000006a0fe000 CR4: 00000000000007f0
Stack:
 0000000000000081 ffff88006ee46000 0000000000000081 ffffffff812d8b45
 ffff88006ee46000 0000000000000081 ffff880078375800 ffff880078375800
 ffff88007ddb1c98 ffffffff812d358c ffff88007ddb1cb8 ffffffff811364f1
Call Trace:^M
 [<ffffffff812d8b45>] ? selinux_inode_permission+0x5/0x160
 [<ffffffff812d358c>] security_inode_permission+0x1c/0x30
 [<ffffffff811364f1>] __inode_permission+0x41/0xa0
 [<ffffffff81136568>] inode_permission+0x18/0x50
 [<ffffffff811378b6>] link_path_walk+0x66/0x920
 [<ffffffff810875a5>] ? __rcu_read_lock+0x5/0x20
 [<ffffffff8113a9e6>] path_openat+0xa6/0x6c0
 [<ffffffff8113b000>] ? path_openat+0x6c0/0x6c0
 [<ffffffff810c4e69>] ? __trace_graph_entry+0x49/0xc0
 [<ffffffff8112b196>] ? do_sys_open+0x146/0x240
 [<ffffffff8113b043>] do_filp_open+0x43/0xa0
 [<ffffffff8113b005>] ? do_filp_open+0x5/0xa0
 [<ffffffff8112b196>] do_sys_open+0x146/0x240
 [<ffffffff8112b2ae>] SyS_open+0x1e/0x20
 [<ffffffff81948cd0>] system_call_fastpath+0x16/0x1b
Code: 84 a1 00 00 00 81 e3 00 20 00 00 89 d8 83 c8 02 40 f6 c6 04 0f 45 d8 40 f6 c6 08 74 71 80 cf 02 49 8b 46 38 4c 8d 4d cc 45 31 c0 <0f> b7 50 20 8b 70 1c 48 8b 41 70 89 d9 8b 78 04 e8 36 cf ff ff 
RIP  [<ffffffff812d8bc5>] selinux_inode_permission+0x85/0x160
 RSP <ffff88007ddb1c48>
CR2: 0000000000000020
---[ end trace 9d800e5ac5059462 ]---


Investigating, I found that the inode->i_security was NULL, and the
dereference of it caused the oops.

in selinux_inode_permission():
----
	isec = inode->i_security;

	rc = avc_has_perm_noaudit(sid, isec->sid, isec->sclass, perms, 0, &avd);
----

Note, the crash came from stressing the deletion and reading of debugfs
files. I was not able to recreate this via normal files. But I'm not
sure they are safe. It may just be that the race window is much harder
to hit.

What seems to have happened (and what I have traced), is the file is
being opened at the same time the file or directory is being deleted.
As the dentry and inode locks are not held during the path walk, nor is
the inodes ref counts being incremented, there is nothing saving these
structures from being discarded except for an rcu_read_lock().

The rcu_read_lock() protects against freeing of the inode, but it does
not protect freeing of the inode_security_struct. Talking with Eric
Paris about this, it seems to be a generic issue with the
destroy_inode() calling security_inode_free() when the inode may still
be in use (protected by rcu). It seems that the true destruction of the
inode (done by __destroy_inode()) should also be protect by rcu.

Cc: stable@vger.kernel.org
Cc: Eric Paris <eparis@parisplace.org>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
---

This is based off of this thread:

  https://lkml.org/lkml/2014/1/9/349

And perhaps is the true fix for:

  http://oss.sgi.com/archives/xfs/2013-11/msg00709.html

diff --git a/fs/inode.c b/fs/inode.c
index 4bcdad3..a8f3b88 100644
--- a/fs/inode.c
+++ b/fs/inode.c
@@ -252,16 +252,17 @@ EXPORT_SYMBOL(__destroy_inode);
 static void i_callback(struct rcu_head *head)
 {
 	struct inode *inode = container_of(head, struct inode, i_rcu);
+	__destroy_inode(inode);
 	kmem_cache_free(inode_cachep, inode);
 }
 
 static void destroy_inode(struct inode *inode)
 {
 	BUG_ON(!list_empty(&inode->i_lru));
-	__destroy_inode(inode);
-	if (inode->i_sb->s_op->destroy_inode)
+	if (inode->i_sb->s_op->destroy_inode) {
+		__destroy_inode(inode);
 		inode->i_sb->s_op->destroy_inode(inode);
-	else
+	} else
 		call_rcu(&inode->i_rcu, i_callback);
 }
 

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* Re: [PATCH] vfs: Fix possible NULL pointer dereference in inode_permission()
  2014-01-09 21:27 [PATCH] vfs: Fix possible NULL pointer dereference in inode_permission() Steven Rostedt
@ 2014-01-09 21:42 ` Matthew Wilcox
  2014-01-09 21:50   ` Steven Rostedt
  0 siblings, 1 reply; 23+ messages in thread
From: Matthew Wilcox @ 2014-01-09 21:42 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: LKML, linux-fsdevel, Alexander Viro, Stephen Smalley,
	Christoph Hellwig, Linus Torvalds, Eric Paris, Theodore Ts'o,
	Dave Chinner, James Morris, Paul Moore, Andrew Morton,
	Paul E. McKenney, stable

On Thu, Jan 09, 2014 at 04:27:31PM -0500, Steven Rostedt wrote:
> Note, the crash came from stressing the deletion and reading of debugfs
> files. I was not able to recreate this via normal files. But I'm not
> sure they are safe. It may just be that the race window is much harder
> to hit.

But "normal" files have a 'destroy_inode' method.  So you've basically
only fixed it for debugfs (and maybe a few other unusual filesystems).
Why doesn't the code look like this:

static void i_callback(struct rcu_head *head)
{
	struct inode *inode = container_of(head, struct inode, i_rcu);
	__destroy_inode(inode);
	if (inode->i_sb->s_op->destroy_inode)
		inode->i_sb->s_op->destroy_inode(inode);
	else
		kmem_cache_free(inode_cachep, inode);
}

static void destroy_inode(struct inode *inode)
{
	BUG_ON(!list_empty(&inode->i_lru));
	call_rcu(&inode->i_rcu, i_callback);
}

We'd then have to get rid of all the call_rcu() invocations in individual
filesystems' destroy_inode methods, but that doesn't sound like a bad
thing to me.

-- 
Matthew Wilcox				Intel Open Source Technology Centre
"Bill, look, we understand that you're interested in selling us this
operating system, but compare it to ours.  We can't possibly take such
a retrograde step."

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH] vfs: Fix possible NULL pointer dereference in inode_permission()
  2014-01-09 21:42 ` Matthew Wilcox
@ 2014-01-09 21:50   ` Steven Rostedt
  2014-01-09 22:31     ` Al Viro
  0 siblings, 1 reply; 23+ messages in thread
From: Steven Rostedt @ 2014-01-09 21:50 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: LKML, linux-fsdevel, Alexander Viro, Stephen Smalley,
	Christoph Hellwig, Linus Torvalds, Eric Paris, Theodore Ts'o,
	Dave Chinner, James Morris, Paul Moore, Andrew Morton,
	Paul E. McKenney, stable

On Thu, 9 Jan 2014 14:42:39 -0700
Matthew Wilcox <matthew@wil.cx> wrote:

> On Thu, Jan 09, 2014 at 04:27:31PM -0500, Steven Rostedt wrote:
> > Note, the crash came from stressing the deletion and reading of debugfs
> > files. I was not able to recreate this via normal files. But I'm not
> > sure they are safe. It may just be that the race window is much harder
> > to hit.
> 
> But "normal" files have a 'destroy_inode' method.  So you've basically
> only fixed it for debugfs (and maybe a few other unusual filesystems).
> Why doesn't the code look like this:

Because I thought of that after I sent the email ;-)

Well, that's not really true. I don't know the semantics of the
destroy_inode() call. But I should have asked that in my change log.

> 
> static void i_callback(struct rcu_head *head)
> {
> 	struct inode *inode = container_of(head, struct inode, i_rcu);
> 	__destroy_inode(inode);
> 	if (inode->i_sb->s_op->destroy_inode)
> 		inode->i_sb->s_op->destroy_inode(inode);
> 	else
> 		kmem_cache_free(inode_cachep, inode);
> }
> 
> static void destroy_inode(struct inode *inode)
> {
> 	BUG_ON(!list_empty(&inode->i_lru));
> 	call_rcu(&inode->i_rcu, i_callback);
> }
> 
> We'd then have to get rid of all the call_rcu() invocations in individual
> filesystems' destroy_inode methods, but that doesn't sound like a bad
> thing to me.
> 

Which is another reason that I didn't do it, as I didn't know all the
happenings inside the ->destroy_inode() calls. But yeah, I agree with
this.

Also, can iput() sleep? If not then we are OK. Otherwise, we need to be
careful about any mutex being grabbed in those call backs, as the
rcu_callback can't sleep either.

-- Steve

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH] vfs: Fix possible NULL pointer dereference in inode_permission()
  2014-01-09 21:50   ` Steven Rostedt
@ 2014-01-09 22:31     ` Al Viro
       [not found]       ` <CA+55aFzCTPYEQCPnLBi1CwmMTocVqCFiCuJ391HkVx1CMw61ug@mail.gmail.com>
  0 siblings, 1 reply; 23+ messages in thread
From: Al Viro @ 2014-01-09 22:31 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Matthew Wilcox, LKML, linux-fsdevel, Stephen Smalley,
	Christoph Hellwig, Linus Torvalds, Eric Paris, Theodore Ts'o,
	Dave Chinner, James Morris, Paul Moore, Andrew Morton,
	Paul E. McKenney, stable

On Thu, Jan 09, 2014 at 04:50:12PM -0500, Steven Rostedt wrote:

> > We'd then have to get rid of all the call_rcu() invocations in individual
> > filesystems' destroy_inode methods, but that doesn't sound like a bad
> > thing to me.

Check what e.g. XFS is doing...

> Which is another reason that I didn't do it, as I didn't know all the
> happenings inside the ->destroy_inode() calls. But yeah, I agree with
> this.
> 
> Also, can iput() sleep? If not then we are OK. Otherwise, we need to be
> careful about any mutex being grabbed in those call backs, as the
> rcu_callback can't sleep either.

iput() definitely can sleep (that's when actual truncation and inode
freeing is done for opened-and-unlinked files - on the final iput() after
close()), but that' irrelevant here - fsnotify_delete_inode() grabs
a bunch of mutexes, which makes calling it from rcu callback no-go.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH] vfs: Fix possible NULL pointer dereference in inode_permission()
       [not found]       ` <CA+55aFzCTPYEQCPnLBi1CwmMTocVqCFiCuJ391HkVx1CMw61ug@mail.gmail.com>
@ 2014-01-09 23:10         ` Paul E. McKenney
  2014-01-09 23:25         ` Steven Rostedt
  1 sibling, 0 replies; 23+ messages in thread
From: Paul E. McKenney @ 2014-01-09 23:10 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Al Viro, Dave Chinner, linux-fsdevel, James Morris,
	Andrew Morton, Stephen Smalley, Theodore Ts'o,
	Steven Rostedt, Eric Paris, stable, Paul Moore, LKML,
	Matthew Wilcox, Christoph Hellwig

On Fri, Jan 10, 2014 at 06:41:03AM +0800, Linus Torvalds wrote:
> I think the sane short term fix is to make the kfree() of the i_security
> member be a rcu free, and not clear the member.

Interesting use case.  ;-)

							Thanx, Paul

> Not pretty, but should did this case..
> 
>      Linus
> 
> On Jan 10, 2014 6:31 AM, "Al Viro" <viro@zeniv.linux.org.uk> wrote:
> >
> > iput() definitely can sleep (that's when actual truncation and inode
> > freeing is done for opened-and-unlinked files - on the final iput() after
> > close()), but that' irrelevant here - fsnotify_delete_inode() grabs
> > a bunch of mutexes, which makes calling it from rcu callback no-go.


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH] vfs: Fix possible NULL pointer dereference in inode_permission()
       [not found]       ` <CA+55aFzCTPYEQCPnLBi1CwmMTocVqCFiCuJ391HkVx1CMw61ug@mail.gmail.com>
  2014-01-09 23:10         ` Paul E. McKenney
@ 2014-01-09 23:25         ` Steven Rostedt
  2014-01-09 23:27           ` Steven Rostedt
  1 sibling, 1 reply; 23+ messages in thread
From: Steven Rostedt @ 2014-01-09 23:25 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Al Viro, Paul McKenney, Dave Chinner, linux-fsdevel,
	James Morris, Andrew Morton, Stephen Smalley, Theodore Ts'o,
	Eric Paris, stable, Paul Moore, LKML, Matthew Wilcox,
	Christoph Hellwig

On Fri, 10 Jan 2014 06:41:03 +0800
Linus Torvalds <torvalds@linux-foundation.org> wrote:

> I think the sane short term fix is to make the kfree() of the i_security
> member be a rcu free, and not clear the member.

You mean my first patch?

https://lkml.org/lkml/2014/1/9/349

-- Steve

> 
> Not pretty, but should did this case..
> 

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH] vfs: Fix possible NULL pointer dereference in inode_permission()
  2014-01-09 23:25         ` Steven Rostedt
@ 2014-01-09 23:27           ` Steven Rostedt
  2014-01-09 23:37             ` Eric Paris
  2014-01-09 23:45             ` Paul E. McKenney
  0 siblings, 2 replies; 23+ messages in thread
From: Steven Rostedt @ 2014-01-09 23:27 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Al Viro, Paul McKenney, Dave Chinner, linux-fsdevel,
	James Morris, Andrew Morton, Stephen Smalley, Theodore Ts'o,
	Eric Paris, stable, Paul Moore, LKML, Matthew Wilcox,
	Christoph Hellwig

On Thu, 9 Jan 2014 18:25:23 -0500
Steven Rostedt <rostedt@goodmis.org> wrote:

> On Fri, 10 Jan 2014 06:41:03 +0800
> Linus Torvalds <torvalds@linux-foundation.org> wrote:
> 
> > I think the sane short term fix is to make the kfree() of the i_security
> > member be a rcu free, and not clear the member.
> 
> You mean my first patch?
> 
> https://lkml.org/lkml/2014/1/9/349
> 

Oh wait, you said not to clear the member. Thus, the patch would look
like this:

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>

Index: linux-trace.git/security/selinux/hooks.c
===================================================================
--- linux-trace.git.orig/security/selinux/hooks.c
+++ linux-trace.git/security/selinux/hooks.c
@@ -234,6 +234,14 @@ static int inode_alloc_security(struct i
 	return 0;
 }
 
+static void inode_free_rcu(struct rcu_head *head)
+{
+	struct inode_security_struct *isec;
+
+	isec = container_of(head, struct inode_security_struct, rcu);
+	kmem_cache_free(sel_inode_cache, isec);
+}
+
 static void inode_free_security(struct inode *inode)
 {
 	struct inode_security_struct *isec = inode->i_security;
@@ -244,8 +252,7 @@ static void inode_free_security(struct i
 		list_del_init(&isec->list);
 	spin_unlock(&sbsec->isec_lock);
 
-	inode->i_security = NULL;
-	kmem_cache_free(sel_inode_cache, isec);
+	call_rcu(&isec->rcu, inode_free_rcu);
 }
 
 static int file_alloc_security(struct file *file)
Index: linux-trace.git/security/selinux/include/objsec.h
===================================================================
--- linux-trace.git.orig/security/selinux/include/objsec.h
+++ linux-trace.git/security/selinux/include/objsec.h
@@ -38,7 +38,10 @@ struct task_security_struct {
 
 struct inode_security_struct {
 	struct inode *inode;	/* back pointer to inode object */
-	struct list_head list;	/* list of inode_security_struct */
+	union {
+		struct list_head list;	/* list of inode_security_struct */
+		struct rcu_head rcu;	/* for freeing the inode_security_struct */
+	};
 	u32 task_sid;		/* SID of creating task */
 	u32 sid;		/* SID of this object */
 	u16 sclass;		/* security class of this object */

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH] vfs: Fix possible NULL pointer dereference in inode_permission()
  2014-01-09 23:27           ` Steven Rostedt
@ 2014-01-09 23:37             ` Eric Paris
  2014-01-09 23:45               ` Steven Rostedt
  2014-01-09 23:53               ` Linus Torvalds
  2014-01-09 23:45             ` Paul E. McKenney
  1 sibling, 2 replies; 23+ messages in thread
From: Eric Paris @ 2014-01-09 23:37 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Linus Torvalds, Al Viro, Paul McKenney, Dave Chinner,
	linux-fsdevel, James Morris, Andrew Morton, Stephen Smalley,
	Theodore Ts'o, stable, Paul Moore, LKML, Matthew Wilcox,
	Christoph Hellwig

On Thu, 2014-01-09 at 18:27 -0500, Steven Rostedt wrote:
> On Thu, 9 Jan 2014 18:25:23 -0500
> Steven Rostedt <rostedt@goodmis.org> wrote:
> 
> > On Fri, 10 Jan 2014 06:41:03 +0800
> > Linus Torvalds <torvalds@linux-foundation.org> wrote:
> > 
> > > I think the sane short term fix is to make the kfree() of the i_security
> > > member be a rcu free, and not clear the member.
> > 
> > You mean my first patch?
> > 
> > https://lkml.org/lkml/2014/1/9/349
> > 
> 
> Oh wait, you said not to clear the member. Thus, the patch would look
> like this:
> 
> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>

SMACK also needs this change somehow in smack_inode_free_security()

but at least from an SELinux PoV, I think it's quick and easy, but wrong
for maintainability...


> Index: linux-trace.git/security/selinux/hooks.c
> ===================================================================
> --- linux-trace.git.orig/security/selinux/hooks.c
> +++ linux-trace.git/security/selinux/hooks.c
> @@ -234,6 +234,14 @@ static int inode_alloc_security(struct i
>  	return 0;
>  }
>  
> +static void inode_free_rcu(struct rcu_head *head)
> +{
> +	struct inode_security_struct *isec;
> +
> +	isec = container_of(head, struct inode_security_struct, rcu);
> +	kmem_cache_free(sel_inode_cache, isec);
> +}
> +
>  static void inode_free_security(struct inode *inode)
>  {
>  	struct inode_security_struct *isec = inode->i_security;
> @@ -244,8 +252,7 @@ static void inode_free_security(struct i
>  		list_del_init(&isec->list);
>  	spin_unlock(&sbsec->isec_lock);
>  
> -	inode->i_security = NULL;
> -	kmem_cache_free(sel_inode_cache, isec);
> +	call_rcu(&isec->rcu, inode_free_rcu);
>  }
>  
>  static int file_alloc_security(struct file *file)
> Index: linux-trace.git/security/selinux/include/objsec.h
> ===================================================================
> --- linux-trace.git.orig/security/selinux/include/objsec.h
> +++ linux-trace.git/security/selinux/include/objsec.h
> @@ -38,7 +38,10 @@ struct task_security_struct {
>  
>  struct inode_security_struct {
>  	struct inode *inode;	/* back pointer to inode object */
> -	struct list_head list;	/* list of inode_security_struct */
> +	union {
> +		struct list_head list;	/* list of inode_security_struct */
> +		struct rcu_head rcu;	/* for freeing the inode_security_struct */
> +	};
>  	u32 task_sid;		/* SID of creating task */
>  	u32 sid;		/* SID of this object */
>  	u16 sclass;		/* security class of this object */



^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH] vfs: Fix possible NULL pointer dereference in inode_permission()
  2014-01-09 23:27           ` Steven Rostedt
  2014-01-09 23:37             ` Eric Paris
@ 2014-01-09 23:45             ` Paul E. McKenney
  2014-01-09 23:59               ` Steven Rostedt
  1 sibling, 1 reply; 23+ messages in thread
From: Paul E. McKenney @ 2014-01-09 23:45 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Linus Torvalds, Al Viro, Dave Chinner, linux-fsdevel,
	James Morris, Andrew Morton, Stephen Smalley, Theodore Ts'o,
	Eric Paris, stable, Paul Moore, LKML, Matthew Wilcox,
	Christoph Hellwig

On Thu, Jan 09, 2014 at 06:27:56PM -0500, Steven Rostedt wrote:
> On Thu, 9 Jan 2014 18:25:23 -0500
> Steven Rostedt <rostedt@goodmis.org> wrote:
> 
> > On Fri, 10 Jan 2014 06:41:03 +0800
> > Linus Torvalds <torvalds@linux-foundation.org> wrote:
> > 
> > > I think the sane short term fix is to make the kfree() of the i_security
> > > member be a rcu free, and not clear the member.
> > 
> > You mean my first patch?
> > 
> > https://lkml.org/lkml/2014/1/9/349
> > 
> 
> Oh wait, you said not to clear the member. Thus, the patch would look
> like this:
> 
> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
> 
> Index: linux-trace.git/security/selinux/hooks.c
> ===================================================================
> --- linux-trace.git.orig/security/selinux/hooks.c
> +++ linux-trace.git/security/selinux/hooks.c
> @@ -234,6 +234,14 @@ static int inode_alloc_security(struct i
>  	return 0;
>  }
> 
> +static void inode_free_rcu(struct rcu_head *head)
> +{
> +	struct inode_security_struct *isec;
> +
> +	isec = container_of(head, struct inode_security_struct, rcu);
> +	kmem_cache_free(sel_inode_cache, isec);
> +}
> +
>  static void inode_free_security(struct inode *inode)
>  {
>  	struct inode_security_struct *isec = inode->i_security;
> @@ -244,8 +252,7 @@ static void inode_free_security(struct i
>  		list_del_init(&isec->list);
>  	spin_unlock(&sbsec->isec_lock);
> 
> -	inode->i_security = NULL;
> -	kmem_cache_free(sel_inode_cache, isec);
> +	call_rcu(&isec->rcu, inode_free_rcu);

Does not clearing ->i_security mean that RCU readers can traverse
this pointer after the invocation of call_rcu()?  If so, this is
problematic.  (If something else already prevents readers from getting
here, no problem.)

							Thanx, Paul

>  }
> 
>  static int file_alloc_security(struct file *file)
> Index: linux-trace.git/security/selinux/include/objsec.h
> ===================================================================
> --- linux-trace.git.orig/security/selinux/include/objsec.h
> +++ linux-trace.git/security/selinux/include/objsec.h
> @@ -38,7 +38,10 @@ struct task_security_struct {
> 
>  struct inode_security_struct {
>  	struct inode *inode;	/* back pointer to inode object */
> -	struct list_head list;	/* list of inode_security_struct */
> +	union {
> +		struct list_head list;	/* list of inode_security_struct */
> +		struct rcu_head rcu;	/* for freeing the inode_security_struct */
> +	};
>  	u32 task_sid;		/* SID of creating task */
>  	u32 sid;		/* SID of this object */
>  	u16 sclass;		/* security class of this object */
> 


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH] vfs: Fix possible NULL pointer dereference in inode_permission()
  2014-01-09 23:37             ` Eric Paris
@ 2014-01-09 23:45               ` Steven Rostedt
  2014-01-09 23:53               ` Linus Torvalds
  1 sibling, 0 replies; 23+ messages in thread
From: Steven Rostedt @ 2014-01-09 23:45 UTC (permalink / raw)
  To: Eric Paris
  Cc: Linus Torvalds, Al Viro, Paul McKenney, Dave Chinner,
	linux-fsdevel, James Morris, Andrew Morton, Stephen Smalley,
	Theodore Ts'o, stable, Paul Moore, LKML, Matthew Wilcox,
	Christoph Hellwig

On Thu, 09 Jan 2014 18:37:06 -0500
Eric Paris <eparis@redhat.com> wrote:


> > Oh wait, you said not to clear the member. Thus, the patch would look
> > like this:
> > 
> > Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
> 
> SMACK also needs this change somehow in smack_inode_free_security()
> 
> but at least from an SELinux PoV, I think it's quick and easy, but wrong
> for maintainability...

I agree, but for this late in the -rc window, this may be the best
thing for now. We can think of a better solution for the future.

-- Steve

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH] vfs: Fix possible NULL pointer dereference in inode_permission()
  2014-01-09 23:37             ` Eric Paris
  2014-01-09 23:45               ` Steven Rostedt
@ 2014-01-09 23:53               ` Linus Torvalds
  2014-01-10  0:06                 ` Al Viro
  1 sibling, 1 reply; 23+ messages in thread
From: Linus Torvalds @ 2014-01-09 23:53 UTC (permalink / raw)
  To: Eric Paris
  Cc: Steven Rostedt, Al Viro, Paul McKenney, Dave Chinner,
	linux-fsdevel, James Morris, Andrew Morton, Stephen Smalley,
	Theodore Ts'o, stable, Paul Moore, LKML, Matthew Wilcox,
	Christoph Hellwig

On Fri, Jan 10, 2014 at 7:37 AM, Eric Paris <eparis@redhat.com> wrote:
>
> but at least from an SELinux PoV, I think it's quick and easy, but wrong
> for maintainability...

Yeah, it's a hack, and it's wrong, and we should figure out how to do
it right. Likely we should just tie the lifetime of the i_security
member directly to the lifetime of the inode itself, and just make the
rule be that security_inode_free() gets called from whatever frees the
inode itself, and *not* have an extra rcu callback etc. But that
sounds like a bigger change than I'm comfy with right now, so the
hacky one might be the band-aid to do for stable..

The problem, of course, is that all the different filesystems have
their own inode allocations/freeing. Of course, they all tend to share
the same pattern ("call_rcu xyz_i_callback"), so maybe we could try to
make that a more generic thing? Like have a "free_inode" vfs callback,
and do the call_rcu delaying at the VFS level..

And maybe, just maybe, we could just say that that is what
"destroy_inode()" is, and that we will just call it from rcu context.
All the IO has hopefully been done earlier  Yes/no?

                 Linus

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH] vfs: Fix possible NULL pointer dereference in inode_permission()
  2014-01-09 23:45             ` Paul E. McKenney
@ 2014-01-09 23:59               ` Steven Rostedt
  2014-01-10  0:44                 ` Paul E. McKenney
  0 siblings, 1 reply; 23+ messages in thread
From: Steven Rostedt @ 2014-01-09 23:59 UTC (permalink / raw)
  To: paulmck
  Cc: Linus Torvalds, Al Viro, Dave Chinner, linux-fsdevel,
	James Morris, Andrew Morton, Stephen Smalley, Theodore Ts'o,
	Eric Paris, stable, Paul Moore, LKML, Matthew Wilcox,
	Christoph Hellwig

On Thu, 9 Jan 2014 15:45:37 -0800
"Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:

> >  static void inode_free_security(struct inode *inode)
> >  {
> >  	struct inode_security_struct *isec = inode->i_security;
> > @@ -244,8 +252,7 @@ static void inode_free_security(struct i
> >  		list_del_init(&isec->list);
> >  	spin_unlock(&sbsec->isec_lock);
> > 
> > -	inode->i_security = NULL;
> > -	kmem_cache_free(sel_inode_cache, isec);
> > +	call_rcu(&isec->rcu, inode_free_rcu);
> 
> Does not clearing ->i_security mean that RCU readers can traverse
> this pointer after the invocation of call_rcu()?  If so, this is
> problematic.  (If something else already prevents readers from getting
> here, no problem.)

This is called when we are about to free the inode. Look at
destroy_inode(). Basically, this is the same as doing:

	call_rcu(&isec->rcu, inode_free_rcu);
	call_rcu(&inode->i_rcu, i_callback);

Where i_callback() does the free of the inode.

If you can access inode->i_security, after a call_rcu, then you can
also access the inode itself that has just been freed.

Yes, technically, having two separate call_rcu(), the first grace
period can end before the second, but everything to remove the inode
from sight has already been set up before that first call_rcu() is
made. That means when the first call_rcu() is executed, the inode
should already be invisible to the readers.


- Steve

> 
> 							Thanx, Paul
> 
> >  }
> > 

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH] vfs: Fix possible NULL pointer dereference in inode_permission()
  2014-01-09 23:53               ` Linus Torvalds
@ 2014-01-10  0:06                 ` Al Viro
  2014-01-10  0:09                   ` Al Viro
                                     ` (2 more replies)
  0 siblings, 3 replies; 23+ messages in thread
From: Al Viro @ 2014-01-10  0:06 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Eric Paris, Steven Rostedt, Paul McKenney, Dave Chinner,
	linux-fsdevel, James Morris, Andrew Morton, Stephen Smalley,
	Theodore Ts'o, stable, Paul Moore, LKML, Matthew Wilcox,
	Christoph Hellwig

On Fri, Jan 10, 2014 at 07:53:41AM +0800, Linus Torvalds wrote:
> On Fri, Jan 10, 2014 at 7:37 AM, Eric Paris <eparis@redhat.com> wrote:
> >
> > but at least from an SELinux PoV, I think it's quick and easy, but wrong
> > for maintainability...
> 
> Yeah, it's a hack, and it's wrong, and we should figure out how to do
> it right. Likely we should just tie the lifetime of the i_security
> member directly to the lifetime of the inode itself, and just make the
> rule be that security_inode_free() gets called from whatever frees the
> inode itself, and *not* have an extra rcu callback etc. But that
> sounds like a bigger change than I'm comfy with right now, so the
> hacky one might be the band-aid to do for stable..
> 
> The problem, of course, is that all the different filesystems have
> their own inode allocations/freeing. Of course, they all tend to share
> the same pattern ("call_rcu xyz_i_callback"), so maybe we could try to
> make that a more generic thing? Like have a "free_inode" vfs callback,
> and do the call_rcu delaying at the VFS level..
> 
> And maybe, just maybe, we could just say that that is what
> "destroy_inode()" is, and that we will just call it from rcu context.
> All the IO has hopefully been done earlier  Yes/no?

Check what XFS is doing ;-/  That's where those call_rcu() have come from.
Sure, we can separate the simple "just do call_rcu(...->free_inode)" case
and hit it whenever full ->free_inode is there and ->destroy_inode isn't.
Not too pretty, but removal of tons of boilerplate might be worth doing
that anyway.  But ->destroy_inode() is still needed for cases where fs
has its own idea of inode lifetime rules.  Again, check what XFS is doing
in that area...

There's an extra source of headache, BTW - what about the "LSM stacking"
crowd and their plans?

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH] vfs: Fix possible NULL pointer dereference in inode_permission()
  2014-01-10  0:06                 ` Al Viro
@ 2014-01-10  0:09                   ` Al Viro
  2014-01-10  0:18                   ` Linus Torvalds
  2014-01-10  9:31                     ` Christoph Hellwig
  2 siblings, 0 replies; 23+ messages in thread
From: Al Viro @ 2014-01-10  0:09 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Eric Paris, Steven Rostedt, Paul McKenney, Dave Chinner,
	linux-fsdevel, James Morris, Andrew Morton, Stephen Smalley,
	Theodore Ts'o, stable, Paul Moore, LKML, Matthew Wilcox,
	Christoph Hellwig

On Fri, Jan 10, 2014 at 12:06:42AM +0000, Al Viro wrote:

> and hit it whenever full ->free_inode is there and ->destroy_inode isn't.

Grr... "whenever ->free_inode is there and full ->destroy_inode isn't".
Apologies for sloppy editing...

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH] vfs: Fix possible NULL pointer dereference in inode_permission()
  2014-01-10  0:06                 ` Al Viro
  2014-01-10  0:09                   ` Al Viro
@ 2014-01-10  0:18                   ` Linus Torvalds
  2014-01-10  2:36                     ` James Morris
  2014-01-10  9:31                     ` Christoph Hellwig
  2 siblings, 1 reply; 23+ messages in thread
From: Linus Torvalds @ 2014-01-10  0:18 UTC (permalink / raw)
  To: Al Viro
  Cc: Eric Paris, Steven Rostedt, Paul McKenney, Dave Chinner,
	linux-fsdevel, James Morris, Andrew Morton, Stephen Smalley,
	Theodore Ts'o, stable, Paul Moore, LKML, Matthew Wilcox,
	Christoph Hellwig

On Fri, Jan 10, 2014 at 8:06 AM, Al Viro <viro@zeniv.linux.org.uk> wrote:
>
> Sure, we can separate the simple "just do call_rcu(...->free_inode)" case
> and hit it whenever full ->free_inode is there and ->destroy_inode isn't.
> Not too pretty, but removal of tons of boilerplate might be worth doing
> that anyway.

Yeah.

> But ->destroy_inode() is still needed for cases where fs
> has its own idea of inode lifetime rules.  Again, check what XFS is doing
> in that area...

Ok, so we can't change destroy_inode, and we'd need to add a new op
for just freeing it.

Painful mainly because there are so many filesystems, but it shouldn't
be *complicated*.

> There's an extra source of headache, BTW - what about the "LSM stacking"
> crowd and their plans?

LSM stacking is a pipedream right  now anyway, isn't it? It's been
talked about for years and years, I've never seen a patch-set that is
even remotely something we'd seriously consider.

            Linus

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH] vfs: Fix possible NULL pointer dereference in inode_permission()
  2014-01-09 23:59               ` Steven Rostedt
@ 2014-01-10  0:44                 ` Paul E. McKenney
  0 siblings, 0 replies; 23+ messages in thread
From: Paul E. McKenney @ 2014-01-10  0:44 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Linus Torvalds, Al Viro, Dave Chinner, linux-fsdevel,
	James Morris, Andrew Morton, Stephen Smalley, Theodore Ts'o,
	Eric Paris, stable, Paul Moore, LKML, Matthew Wilcox,
	Christoph Hellwig

On Thu, Jan 09, 2014 at 06:59:07PM -0500, Steven Rostedt wrote:
> On Thu, 9 Jan 2014 15:45:37 -0800
> "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:
> 
> > >  static void inode_free_security(struct inode *inode)
> > >  {
> > >  	struct inode_security_struct *isec = inode->i_security;
> > > @@ -244,8 +252,7 @@ static void inode_free_security(struct i
> > >  		list_del_init(&isec->list);
> > >  	spin_unlock(&sbsec->isec_lock);
> > > 
> > > -	inode->i_security = NULL;
> > > -	kmem_cache_free(sel_inode_cache, isec);
> > > +	call_rcu(&isec->rcu, inode_free_rcu);
> > 
> > Does not clearing ->i_security mean that RCU readers can traverse
> > this pointer after the invocation of call_rcu()?  If so, this is
> > problematic.  (If something else already prevents readers from getting
> > here, no problem.)
> 
> This is called when we are about to free the inode. Look at
> destroy_inode(). Basically, this is the same as doing:
> 
> 	call_rcu(&isec->rcu, inode_free_rcu);
> 	call_rcu(&inode->i_rcu, i_callback);
> 
> Where i_callback() does the free of the inode.
> 
> If you can access inode->i_security, after a call_rcu, then you can
> also access the inode itself that has just been freed.
> 
> Yes, technically, having two separate call_rcu(), the first grace
> period can end before the second, but everything to remove the inode
> from sight has already been set up before that first call_rcu() is
> made. That means when the first call_rcu() is executed, the inode
> should already be invisible to the readers.

Got it, should be fine then, sorry for the noise.

							Thanx, Paul


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH] vfs: Fix possible NULL pointer dereference in inode_permission()
  2014-01-10  0:18                   ` Linus Torvalds
@ 2014-01-10  2:36                     ` James Morris
  0 siblings, 0 replies; 23+ messages in thread
From: James Morris @ 2014-01-10  2:36 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Al Viro, Eric Paris, Steven Rostedt, Paul McKenney, Dave Chinner,
	linux-fsdevel, James Morris, Andrew Morton, Stephen Smalley,
	Theodore Ts'o, stable, Paul Moore, LKML, Matthew Wilcox,
	Christoph Hellwig

On Fri, 10 Jan 2014, Linus Torvalds wrote:

> > There's an extra source of headache, BTW - what about the "LSM stacking"
> > crowd and their plans?
> 
> LSM stacking is a pipedream right  now anyway, isn't it? It's been
> talked about for years and years, I've never seen a patch-set that is
> even remotely something we'd seriously consider.
> 

And there's no good justification for it.


-- 
James Morris
<jmorris@namei.org>

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH] vfs: Fix possible NULL pointer dereference in inode_permission()
  2014-01-10  0:06                 ` Al Viro
@ 2014-01-10  9:31                     ` Christoph Hellwig
  2014-01-10  0:18                   ` Linus Torvalds
  2014-01-10  9:31                     ` Christoph Hellwig
  2 siblings, 0 replies; 23+ messages in thread
From: Christoph Hellwig @ 2014-01-10  9:31 UTC (permalink / raw)
  To: Al Viro
  Cc: Linus Torvalds, Eric Paris, Steven Rostedt, Paul McKenney,
	Dave Chinner, linux-fsdevel, James Morris, Andrew Morton,
	Stephen Smalley, Theodore Ts'o, stable, Paul Moore, LKML,
	Matthew Wilcox, xfs

On Fri, Jan 10, 2014 at 12:06:42AM +0000, Al Viro wrote:
> Check what XFS is doing ;-/  That's where those call_rcu() have come from.
> Sure, we can separate the simple "just do call_rcu(...->free_inode)" case
> and hit it whenever full ->free_inode is there and ->destroy_inode isn't.
> Not too pretty, but removal of tons of boilerplate might be worth doing
> that anyway.  But ->destroy_inode() is still needed for cases where fs
> has its own idea of inode lifetime rules.  Again, check what XFS is doing
> in that area...

Btw, I'd really love to get rid of the XFS ->destroy_inode abuse, it's
been a long time thorn in the flesh.

What's really needed there to make XFS behave more similar to everyone
else is a way for the filesystem to say: "I can't actually free this
inode right now, but I'll come back to you later".  That's what we
actually do right now, except we pretend that the VFS inode gets freed,
while its memory lives on (punt intended).


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH] vfs: Fix possible NULL pointer dereference in inode_permission()
@ 2014-01-10  9:31                     ` Christoph Hellwig
  0 siblings, 0 replies; 23+ messages in thread
From: Christoph Hellwig @ 2014-01-10  9:31 UTC (permalink / raw)
  To: Al Viro
  Cc: Theodore Ts'o, Paul Moore, Matthew Wilcox, Stephen Smalley,
	LKML, Steven Rostedt, xfs, Eric Paris, James Morris,
	linux-fsdevel, stable, Andrew Morton, Paul McKenney,
	Linus Torvalds

On Fri, Jan 10, 2014 at 12:06:42AM +0000, Al Viro wrote:
> Check what XFS is doing ;-/  That's where those call_rcu() have come from.
> Sure, we can separate the simple "just do call_rcu(...->free_inode)" case
> and hit it whenever full ->free_inode is there and ->destroy_inode isn't.
> Not too pretty, but removal of tons of boilerplate might be worth doing
> that anyway.  But ->destroy_inode() is still needed for cases where fs
> has its own idea of inode lifetime rules.  Again, check what XFS is doing
> in that area...

Btw, I'd really love to get rid of the XFS ->destroy_inode abuse, it's
been a long time thorn in the flesh.

What's really needed there to make XFS behave more similar to everyone
else is a way for the filesystem to say: "I can't actually free this
inode right now, but I'll come back to you later".  That's what we
actually do right now, except we pretend that the VFS inode gets freed,
while its memory lives on (punt intended).

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH] vfs: Fix possible NULL pointer dereference in inode_permission()
  2014-01-10  9:31                     ` Christoph Hellwig
@ 2014-01-10 18:14                       ` Ben Myers
  -1 siblings, 0 replies; 23+ messages in thread
From: Ben Myers @ 2014-01-10 18:14 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Al Viro, Theodore Ts'o, Paul Moore, Matthew Wilcox,
	Stephen Smalley, LKML, Steven Rostedt, xfs, Eric Paris,
	James Morris, linux-fsdevel, stable, Andrew Morton,
	Paul McKenney, Linus Torvalds

Christoph,

On Fri, Jan 10, 2014 at 01:31:48AM -0800, Christoph Hellwig wrote:
> On Fri, Jan 10, 2014 at 12:06:42AM +0000, Al Viro wrote:
> > Check what XFS is doing ;-/  That's where those call_rcu() have come from.
> > Sure, we can separate the simple "just do call_rcu(...->free_inode)" case
> > and hit it whenever full ->free_inode is there and ->destroy_inode isn't.
> > Not too pretty, but removal of tons of boilerplate might be worth doing
> > that anyway.  But ->destroy_inode() is still needed for cases where fs
> > has its own idea of inode lifetime rules.  Again, check what XFS is doing
> > in that area...
> 
> Btw, I'd really love to get rid of the XFS ->destroy_inode abuse, it's
> been a long time thorn in the flesh.

I believe this behavior is related to freeing of an inode cluster.

> What's really needed there to make XFS behave more similar to everyone
> else is a way for the filesystem to say: "I can't actually free this
> inode right now, but I'll come back to you later".

This test might read something like:  "If my link count has gone to zero, and I
am the last inode in my cluster to be freed, and there are other inodes from my
cluster incore, I cannot be freed."

Should be doable.  Maybe there are other reasons.

-Ben

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH] vfs: Fix possible NULL pointer dereference in inode_permission()
@ 2014-01-10 18:14                       ` Ben Myers
  0 siblings, 0 replies; 23+ messages in thread
From: Ben Myers @ 2014-01-10 18:14 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Theodore Ts'o, Paul Moore, Matthew Wilcox, Linus Torvalds,
	Eric Paris, LKML, Steven Rostedt, xfs, Al Viro, James Morris,
	linux-fsdevel, stable, Andrew Morton, Paul McKenney,
	Stephen Smalley

Christoph,

On Fri, Jan 10, 2014 at 01:31:48AM -0800, Christoph Hellwig wrote:
> On Fri, Jan 10, 2014 at 12:06:42AM +0000, Al Viro wrote:
> > Check what XFS is doing ;-/  That's where those call_rcu() have come from.
> > Sure, we can separate the simple "just do call_rcu(...->free_inode)" case
> > and hit it whenever full ->free_inode is there and ->destroy_inode isn't.
> > Not too pretty, but removal of tons of boilerplate might be worth doing
> > that anyway.  But ->destroy_inode() is still needed for cases where fs
> > has its own idea of inode lifetime rules.  Again, check what XFS is doing
> > in that area...
> 
> Btw, I'd really love to get rid of the XFS ->destroy_inode abuse, it's
> been a long time thorn in the flesh.

I believe this behavior is related to freeing of an inode cluster.

> What's really needed there to make XFS behave more similar to everyone
> else is a way for the filesystem to say: "I can't actually free this
> inode right now, but I'll come back to you later".

This test might read something like:  "If my link count has gone to zero, and I
am the last inode in my cluster to be freed, and there are other inodes from my
cluster incore, I cannot be freed."

Should be doable.  Maybe there are other reasons.

-Ben

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH] vfs: Fix possible NULL pointer dereference in inode_permission()
  2014-01-10 18:14                       ` Ben Myers
@ 2014-01-11 10:32                         ` Christoph Hellwig
  -1 siblings, 0 replies; 23+ messages in thread
From: Christoph Hellwig @ 2014-01-11 10:32 UTC (permalink / raw)
  To: Ben Myers
  Cc: Christoph Hellwig, Al Viro, Theodore Ts'o, Paul Moore,
	Matthew Wilcox, Stephen Smalley, LKML, Steven Rostedt, xfs,
	Eric Paris, James Morris, linux-fsdevel, stable, Andrew Morton,
	Paul McKenney, Linus Torvalds

On Fri, Jan 10, 2014 at 12:14:34PM -0600, Ben Myers wrote:
> > What's really needed there to make XFS behave more similar to everyone
> > else is a way for the filesystem to say: "I can't actually free this
> > inode right now, but I'll come back to you later".
> 
> This test might read something like:  "If my link count has gone to zero, and I
> am the last inode in my cluster to be freed, and there are other inodes from my
> cluster incore, I cannot be freed."

It's more complicated than that.  In theory we would free the inode
easily as soon as the VFS wants it, but performance would be horrible
as we would have to synchronously write back the inode.  Note that it
really matters for the interface, that just needs to be an:  I won't
free this right now, but I'll call you back later when I can.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH] vfs: Fix possible NULL pointer dereference in inode_permission()
@ 2014-01-11 10:32                         ` Christoph Hellwig
  0 siblings, 0 replies; 23+ messages in thread
From: Christoph Hellwig @ 2014-01-11 10:32 UTC (permalink / raw)
  To: Ben Myers
  Cc: Theodore Ts'o, Paul Moore, Matthew Wilcox, Linus Torvalds,
	Eric Paris, LKML, Steven Rostedt, xfs, Christoph Hellwig,
	Al Viro, James Morris, linux-fsdevel, stable, Andrew Morton,
	Paul McKenney, Stephen Smalley

On Fri, Jan 10, 2014 at 12:14:34PM -0600, Ben Myers wrote:
> > What's really needed there to make XFS behave more similar to everyone
> > else is a way for the filesystem to say: "I can't actually free this
> > inode right now, but I'll come back to you later".
> 
> This test might read something like:  "If my link count has gone to zero, and I
> am the last inode in my cluster to be freed, and there are other inodes from my
> cluster incore, I cannot be freed."

It's more complicated than that.  In theory we would free the inode
easily as soon as the VFS wants it, but performance would be horrible
as we would have to synchronously write back the inode.  Note that it
really matters for the interface, that just needs to be an:  I won't
free this right now, but I'll call you back later when I can.

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 23+ messages in thread

end of thread, other threads:[~2014-01-11 10:32 UTC | newest]

Thread overview: 23+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-01-09 21:27 [PATCH] vfs: Fix possible NULL pointer dereference in inode_permission() Steven Rostedt
2014-01-09 21:42 ` Matthew Wilcox
2014-01-09 21:50   ` Steven Rostedt
2014-01-09 22:31     ` Al Viro
     [not found]       ` <CA+55aFzCTPYEQCPnLBi1CwmMTocVqCFiCuJ391HkVx1CMw61ug@mail.gmail.com>
2014-01-09 23:10         ` Paul E. McKenney
2014-01-09 23:25         ` Steven Rostedt
2014-01-09 23:27           ` Steven Rostedt
2014-01-09 23:37             ` Eric Paris
2014-01-09 23:45               ` Steven Rostedt
2014-01-09 23:53               ` Linus Torvalds
2014-01-10  0:06                 ` Al Viro
2014-01-10  0:09                   ` Al Viro
2014-01-10  0:18                   ` Linus Torvalds
2014-01-10  2:36                     ` James Morris
2014-01-10  9:31                   ` Christoph Hellwig
2014-01-10  9:31                     ` Christoph Hellwig
2014-01-10 18:14                     ` Ben Myers
2014-01-10 18:14                       ` Ben Myers
2014-01-11 10:32                       ` Christoph Hellwig
2014-01-11 10:32                         ` Christoph Hellwig
2014-01-09 23:45             ` Paul E. McKenney
2014-01-09 23:59               ` Steven Rostedt
2014-01-10  0:44                 ` Paul E. McKenney

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.