linux-kernel.vger.kernel.org archive mirror
* [PATCH] ceph: Fix use-after-free in __ceph_remove_cap
@ 2019-10-17 14:46 Luis Henriques
  2019-10-21 12:38 ` Jeff Layton
From: Luis Henriques @ 2019-10-17 14:46 UTC
  To: Jeff Layton, Sage Weil, Ilya Dryomov
  Cc: ceph-devel, linux-kernel, Luis Henriques

KASAN reports a use-after-free when running xfstest generic/531, with the
following trace:

[  293.903362]  kasan_report+0xe/0x20
[  293.903365]  rb_erase+0x1f/0x790
[  293.903370]  __ceph_remove_cap+0x201/0x370
[  293.903375]  __ceph_remove_caps+0x4b/0x70
[  293.903380]  ceph_evict_inode+0x4e/0x360
[  293.903386]  evict+0x169/0x290
[  293.903390]  __dentry_kill+0x16f/0x250
[  293.903394]  dput+0x1c6/0x440
[  293.903398]  __fput+0x184/0x330
[  293.903404]  task_work_run+0xb9/0xe0
[  293.903410]  exit_to_usermode_loop+0xd3/0xe0
[  293.903413]  do_syscall_64+0x1a0/0x1c0
[  293.903417]  entry_SYSCALL_64_after_hwframe+0x44/0xa9

This happens because __ceph_remove_cap() may queue a cap release
(__ceph_queue_cap_release) that can be processed, freeing the cap,
before that cap is removed from the inode's rbtree with

	rb_erase(&cap->ci_node, &ci->i_caps);

When rb_erase() finally runs on the freed cap, the use-after-free
occurs.

This can be fixed by protecting the rb_erase with the s_cap_lock
spinlock, which ceph_send_cap_releases() also takes, so that the cap
cannot be freed until it has been unlinked from the inode.

Signed-off-by: Luis Henriques <lhenriques@suse.com>
---
 fs/ceph/caps.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/fs/ceph/caps.c b/fs/ceph/caps.c
index d3b9c9d5c1bd..21ee38cabe98 100644
--- a/fs/ceph/caps.c
+++ b/fs/ceph/caps.c
@@ -1089,13 +1089,13 @@ void __ceph_remove_cap(struct ceph_cap *cap, bool queue_release)
 	}
 	cap->cap_ino = ci->i_vino.ino;
 
-	spin_unlock(&session->s_cap_lock);
-
 	/* remove from inode list */
 	rb_erase(&cap->ci_node, &ci->i_caps);
 	if (ci->i_auth_cap == cap)
 		ci->i_auth_cap = NULL;
 
+	spin_unlock(&session->s_cap_lock);
+
 	if (removed)
 		ceph_put_cap(mdsc, cap);
 
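To make the race easier to see, here is a rough sketch of the two paths
involved, with the window marked.  This is simplified pseudo-C rather
than the literal code in fs/ceph/caps.c and fs/ceph/mds_client.c, and
the function bodies are heavily abridged:

    /* Path A: __ceph_remove_cap(), e.g. via ceph_evict_inode() */
    spin_lock(&session->s_cap_lock);
    /* detach the cap from the session; if queue_release, make it
     * reachable from session->s_cap_releases */
    __ceph_queue_cap_release(session, cap);
    spin_unlock(&session->s_cap_lock);
    /* <-- race window: path B can run here and free the cap */
    rb_erase(&cap->ci_node, &ci->i_caps);   /* use-after-free */

    /* Path B: ceph_send_cap_releases(), run from the MDS workqueue */
    spin_lock(&session->s_cap_lock);
    /* pull the cap off session->s_cap_releases ... */
    spin_unlock(&session->s_cap_lock);
    /* ... encode the release message ... */
    ceph_put_cap(mdsc, cap);                /* may free the cap */

With the patch, rb_erase() runs inside the s_cap_lock critical section,
so path B cannot free the cap while it is still linked into the inode's
rbtree.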


* Re: [PATCH] ceph: Fix use-after-free in __ceph_remove_cap
  2019-10-17 14:46 [PATCH] ceph: Fix use-after-free in __ceph_remove_cap Luis Henriques
@ 2019-10-21 12:38 ` Jeff Layton
  2019-10-21 14:51   ` Luis Henriques
From: Jeff Layton @ 2019-10-21 12:38 UTC
  To: Luis Henriques, Sage Weil, Ilya Dryomov; +Cc: ceph-devel, linux-kernel

On Thu, 2019-10-17 at 15:46 +0100, Luis Henriques wrote:
> KASAN reports a use-after-free when running xfstest generic/531, with the
> following trace:
> 
> [... KASAN trace snipped ...]
> 
> This happens because __ceph_remove_cap() may queue a cap release
> (__ceph_queue_cap_release) that can be processed, freeing the cap,
> before that cap is removed from the inode's rbtree with
> 
> 	rb_erase(&cap->ci_node, &ci->i_caps);
> 
> When rb_erase() finally runs on the freed cap, the use-after-free
> occurs.
> 
> This can be fixed by protecting the rb_erase with the s_cap_lock
> spinlock, which ceph_send_cap_releases() also takes, so that the cap
> cannot be freed until it has been unlinked from the inode.
> 
> Signed-off-by: Luis Henriques <lhenriques@suse.com>
> ---
>  fs/ceph/caps.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/fs/ceph/caps.c b/fs/ceph/caps.c
> index d3b9c9d5c1bd..21ee38cabe98 100644
> --- a/fs/ceph/caps.c
> +++ b/fs/ceph/caps.c
> @@ -1089,13 +1089,13 @@ void __ceph_remove_cap(struct ceph_cap *cap, bool queue_release)
>  	}
>  	cap->cap_ino = ci->i_vino.ino;
>  
> -	spin_unlock(&session->s_cap_lock);
> -
>  	/* remove from inode list */
>  	rb_erase(&cap->ci_node, &ci->i_caps);
>  	if (ci->i_auth_cap == cap)
>  		ci->i_auth_cap = NULL;
>  
> +	spin_unlock(&session->s_cap_lock);
> +
>  	if (removed)
>  		ceph_put_cap(mdsc, cap);
>  

Is there any reason we need to wait until this point to remove it from
the rbtree? ISTM that we ought to just do that at the beginning of the
function, before we take the s_cap_lock.
-- 
Jeff Layton <jlayton@kernel.org>



* Re: [PATCH] ceph: Fix use-after-free in __ceph_remove_cap
  2019-10-21 12:38 ` Jeff Layton
@ 2019-10-21 14:51   ` Luis Henriques
       [not found]     ` <CAAM7YA=dg8ufUWqrD_V8pSdvxrnU+knOW4uW4io_b=Lwjhpg5Q@mail.gmail.com>
  2019-10-23 18:47     ` Jeff Layton
From: Luis Henriques @ 2019-10-21 14:51 UTC
  To: Jeff Layton; +Cc: Sage Weil, Ilya Dryomov, ceph-devel, linux-kernel


Jeff Layton <jlayton@kernel.org> writes:

> On Thu, 2019-10-17 at 15:46 +0100, Luis Henriques wrote:
>> [... patch snipped ...]
>
> Is there any reason we need to wait until this point to remove it from
> the rbtree? ISTM that we ought to just do that at the beginning of the
> function, before we take the s_cap_lock.

That sounds good to me, at least at first glance.  I spent some time
looking for any possible issues in the code, and even ran a few tests.

However, looking at the git log I found commit f818a73674c5 ("ceph: fix cap
removal races"), which moved that rb_erase from the beginning of the
function to its current position.  So, unless the race mentioned in
that commit has disappeared in the meantime (which is possible, the
commit is from 2010!), this rbtree operation shouldn't be changed.

And I now wonder if my patch isn't introducing a race too...
__ceph_remove_cap() is supposed to always be called with the session
mutex held, except for the ceph_evict_inode() path, which is where I'm
seeing the UAF.  So, maybe what's missing here is the s_mutex.  Hmm...

Cheers,
--
Luis


* Re: [PATCH] ceph: Fix use-after-free in __ceph_remove_cap
       [not found]     ` <CAAM7YA=dg8ufUWqrD_V8pSdvxrnU+knOW4uW4io_b=Lwjhpg5Q@mail.gmail.com>
@ 2019-10-22 13:47       ` Luis Henriques
  2019-10-23 10:29         ` Luis Henriques
From: Luis Henriques @ 2019-10-22 13:47 UTC
  To: Yan, Zheng
  Cc: Jeff Layton, Sage Weil, Ilya Dryomov, ceph-devel,
	Linux Kernel Mailing List

On Tue, Oct 22, 2019 at 08:48:56PM +0800, Yan, Zheng wrote:
> On Mon, Oct 21, 2019 at 10:55 PM Luis Henriques <lhenriques@suse.com> wrote:
> >
> > [... quoted patch and discussion snipped ...]
> >
> > And I now wonder if my patch isn't introducing a race too...
> > __ceph_remove_cap() is supposed to always be called with the session
> > mutex held, except for the ceph_evict_inode() path, which is where I'm
> > seeing the UAF.  So, maybe what's missing here is the s_mutex.  Hmm...
> >
> >
> we can't lock s_mutex here, because i_ceph_lock is locked

Well, my idea wasn't to get s_mutex here but earlier in the stack,
maybe in ceph_evict_inode(), protecting the call to __ceph_remove_caps().
But I haven't really looked into that yet, so I'm not sure whether
that's feasible (or even whether it would fix this UAF).  I suspect it's
not possible anyway, due to the comment above __ceph_remove_cap:

  caller will not hold session s_mutex if called from destroy_inode.

Cheers,
--
Luís
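For readers following the locking argument: i_ceph_lock is a spinlock
that is already held when __ceph_remove_cap() runs, while s_mutex is a
sleeping mutex, so taking s_mutex at this point would mean sleeping
under a spinlock as well as inverting the established order.  A sketch
of the assumed hierarchy, inferred from this thread rather than from an
authoritative lockdep annotation:

    /*
     * Assumed lock ordering, outermost first:
     *
     *   session->s_mutex      sleeping mutex, held by most callers
     *                         of __ceph_remove_cap()
     *   ci->i_ceph_lock       per-inode spinlock, held when
     *                         __ceph_remove_cap() runs
     *   session->s_cap_lock   spinlock protecting the session's
     *                         cap lists
     *
     * Acquiring s_mutex inside __ceph_remove_cap() would both sleep
     * while holding a spinlock and invert s_mutex -> i_ceph_lock.
     */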


* Re: [PATCH] ceph: Fix use-after-free in __ceph_remove_cap
  2019-10-22 13:47       ` Luis Henriques
@ 2019-10-23 10:29         ` Luis Henriques
From: Luis Henriques @ 2019-10-23 10:29 UTC
  To: Yan, Zheng
  Cc: Jeff Layton, Sage Weil, Ilya Dryomov, ceph-devel,
	Linux Kernel Mailing List


Luis Henriques <lhenriques@suse.com> writes:

> On Tue, Oct 22, 2019 at 08:48:56PM +0800, Yan, Zheng wrote:
>> [... quoted patch and discussion snipped ...]
>> we can't lock s_mutex here, because i_ceph_lock is locked
>
> Well, my idea wasn't to get s_mutex here but earlier in the stack,
> maybe in ceph_evict_inode(), protecting the call to __ceph_remove_caps().
> But I haven't really looked into that yet, so I'm not sure whether
> that's feasible (or even whether it would fix this UAF).  I suspect it's
> not possible anyway, due to the comment above __ceph_remove_cap:
>
>   caller will not hold session s_mutex if called from destroy_inode.

Ok, I looked into that now and obviously that's not possible.  So, I
guess my original patch is still the best option.

Cheers,
-- 
Luis




* Re: [PATCH] ceph: Fix use-after-free in __ceph_remove_cap
  2019-10-21 14:51   ` Luis Henriques
       [not found]     ` <CAAM7YA=dg8ufUWqrD_V8pSdvxrnU+knOW4uW4io_b=Lwjhpg5Q@mail.gmail.com>
@ 2019-10-23 18:47     ` Jeff Layton
  2019-10-25 13:05       ` [PATCH v2] " Luis Henriques
  1 sibling, 1 reply; 9+ messages in thread
From: Jeff Layton @ 2019-10-23 18:47 UTC (permalink / raw)
  To: Luis Henriques; +Cc: Sage Weil, Ilya Dryomov, ceph-devel, linux-kernel

On Mon, 2019-10-21 at 15:51 +0100, Luis Henriques wrote:
> Jeff Layton <jlayton@kernel.org> writes:
> 
> > [... patch snipped ...]
> >
> > Is there any reason we need to wait until this point to remove it from
> > the rbtree? ISTM that we ought to just do that at the beginning of the
> > function, before we take the s_cap_lock.
> 
> That sounds good to me, at least at first glance.  I spent some time
> looking for any possible issues in the code, and even ran a few tests.
> 
> However, looking at the git log I found commit f818a73674c5 ("ceph: fix cap
> removal races"), which moved that rb_erase from the beginning of the
> function to its current position.  So, unless the race mentioned in
> that commit has disappeared in the meantime (which is possible, the
> commit is from 2010!), this rbtree operation shouldn't be changed.
> 
> And I now wonder if my patch isn't introducing a race too...
> __ceph_remove_cap() is supposed to always be called with the session
> mutex held, except for the ceph_evict_inode() path, which is where I'm
> seeing the UAF.  So, maybe what's missing here is the s_mutex.  Hmm...
> 

I don't get it. That commit log talks about needing to ensure that the
backpointer is cleared under the lock, which is fine, but I don't see why
we need to keep the cap in the inode's rbtree until that point.

Unhashing an object before you potentially free it is just good
practice, IMO. If we need to do something different here, then I think
it'd be good to add a comment explaining why.
-- 
Jeff Layton <jlayton@kernel.org>
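The "unhash before you potentially free" rule generalizes well beyond
this bug.  A minimal, self-contained userspace illustration of the
pattern (hypothetical code, not taken from the kernel; the single-bucket
"table" stands in for any shared lookup structure such as the caps
rbtree):

    #include <pthread.h>
    #include <stdlib.h>

    struct obj {
            int key;
            struct obj *next;               /* hash-chain linkage */
    };

    static pthread_mutex_t table_lock = PTHREAD_MUTEX_INITIALIZER;
    static struct obj *table_head;          /* single-bucket "table" */

    /*
     * Correct teardown order: unlink the object from every shared
     * structure while the lock is held, so no other thread can reach
     * it afterwards; only then free it.  Freeing first, or unlinking
     * only after the object may already have been freed, reopens
     * exactly the window the ceph patch closes.
     */
    static void remove_obj(struct obj *o)
    {
            struct obj **p;

            pthread_mutex_lock(&table_lock);
            for (p = &table_head; *p; p = &(*p)->next) {
                    if (*p == o) {
                            *p = o->next;   /* unhash */
                            break;
                    }
            }
            pthread_mutex_unlock(&table_lock);

            free(o);    /* safe: o is no longer reachable */
    }

    int main(void)
    {
            struct obj *o = calloc(1, sizeof(*o));

            o->next = table_head;           /* hash it */
            table_head = o;
            remove_obj(o);                  /* unhash, then free */
            return 0;
    }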



* [PATCH v2] ceph: Fix use-after-free in __ceph_remove_cap
  2019-10-23 18:47     ` Jeff Layton
@ 2019-10-25 13:05       ` Luis Henriques
  2019-10-27 12:31         ` Jeff Layton
From: Luis Henriques @ 2019-10-25 13:05 UTC
  To: Jeff Layton, Sage Weil, Ilya Dryomov, Yan, Zheng
  Cc: ceph-devel, linux-kernel, Luis Henriques

KASAN reports a use-after-free when running xfstest generic/531, with the
following trace:

[  293.903362]  kasan_report+0xe/0x20
[  293.903365]  rb_erase+0x1f/0x790
[  293.903370]  __ceph_remove_cap+0x201/0x370
[  293.903375]  __ceph_remove_caps+0x4b/0x70
[  293.903380]  ceph_evict_inode+0x4e/0x360
[  293.903386]  evict+0x169/0x290
[  293.903390]  __dentry_kill+0x16f/0x250
[  293.903394]  dput+0x1c6/0x440
[  293.903398]  __fput+0x184/0x330
[  293.903404]  task_work_run+0xb9/0xe0
[  293.903410]  exit_to_usermode_loop+0xd3/0xe0
[  293.903413]  do_syscall_64+0x1a0/0x1c0
[  293.903417]  entry_SYSCALL_64_after_hwframe+0x44/0xa9

This happens because __ceph_remove_cap() may queue a cap release
(__ceph_queue_cap_release) that can be processed, freeing the cap,
before that cap is removed from the inode's rbtree with

	rb_erase(&cap->ci_node, &ci->i_caps);

When rb_erase() finally runs on the freed cap, the use-after-free
occurs.

This can be fixed by removing the cap from the inode's rbtree before
removing it from the session list, thus eliminating the risk of a UAF.

Signed-off-by: Luis Henriques <lhenriques@suse.com>
---
Hi!

So, after spending some time trying to find possible races through code
review and testing, I modified the fix according to Jeff's suggestion.

Cheers,
Luis

 fs/ceph/caps.c | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/fs/ceph/caps.c b/fs/ceph/caps.c
index d3b9c9d5c1bd..a9ce858c37d0 100644
--- a/fs/ceph/caps.c
+++ b/fs/ceph/caps.c
@@ -1058,6 +1058,11 @@ void __ceph_remove_cap(struct ceph_cap *cap, bool queue_release)
 
 	dout("__ceph_remove_cap %p from %p\n", cap, &ci->vfs_inode);
 
+	/* remove from inode list */
+	rb_erase(&cap->ci_node, &ci->i_caps);
+	if (ci->i_auth_cap == cap)
+		ci->i_auth_cap = NULL;
+
 	/* remove from session list */
 	spin_lock(&session->s_cap_lock);
 	if (session->s_cap_iterator == cap) {
@@ -1091,11 +1096,6 @@ void __ceph_remove_cap(struct ceph_cap *cap, bool queue_release)
 
 	spin_unlock(&session->s_cap_lock);
 
-	/* remove from inode list */
-	rb_erase(&cap->ci_node, &ci->i_caps);
-	if (ci->i_auth_cap == cap)
-		ci->i_auth_cap = NULL;
-
 	if (removed)
 		ceph_put_cap(mdsc, cap);
 
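With both hunks applied, the teardown order in __ceph_remove_cap()
becomes the following (an abridged reconstruction from the diff above;
the session-list bookkeeping between the two spin calls is elided):

    /* 1. Unlink from the inode while the cap is guaranteed alive
     *    (the caller holds ci->i_ceph_lock). */
    rb_erase(&cap->ci_node, &ci->i_caps);
    if (ci->i_auth_cap == cap)
            ci->i_auth_cap = NULL;

    /* 2. Detach from the session and, if requested, queue the
     *    release, all under session->s_cap_lock. */
    spin_lock(&session->s_cap_lock);
    /* ... remove from the session list, possibly
     *     __ceph_queue_cap_release(session, cap) ... */
    spin_unlock(&session->s_cap_lock);

    /* 3. Drop the reference last; even if the queued release is
     *    processed immediately, no inode-side pointers remain. */
    if (removed)
            ceph_put_cap(mdsc, cap);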


* Re: [PATCH v2] ceph: Fix use-after-free in __ceph_remove_cap
  2019-10-25 13:05       ` [PATCH v2] " Luis Henriques
@ 2019-10-27 12:31         ` Jeff Layton
  2019-10-28  9:18           ` Luis Henriques
From: Jeff Layton @ 2019-10-27 12:31 UTC
  To: Luis Henriques, Sage Weil, Ilya Dryomov, Yan, Zheng
  Cc: ceph-devel, linux-kernel

On Fri, 2019-10-25 at 14:05 +0100, Luis Henriques wrote:
> KASAN reports a use-after-free when running xfstest generic/531, with the
> following trace:
> 
> [... KASAN trace snipped ...]
> 
> This happens because __ceph_remove_cap() may queue a cap release
> (__ceph_queue_cap_release) that can be processed, freeing the cap,
> before that cap is removed from the inode's rbtree with
> 
> 	rb_erase(&cap->ci_node, &ci->i_caps);
> 
> When rb_erase() finally runs on the freed cap, the use-after-free
> occurs.
> 
> This can be fixed by removing the cap from the inode's rbtree before
> removing it from the session list, thus eliminating the risk of a UAF.
> 
> Signed-off-by: Luis Henriques <lhenriques@suse.com>
> ---
> Hi!
> 
> So, after spending some time trying to find possible races through code
> review and testing, I modified the fix according to Jeff's suggestion.
> 
> Cheers,
> Luis
> 
>  fs/ceph/caps.c | 10 +++++-----
>  1 file changed, 5 insertions(+), 5 deletions(-)
> 
> diff --git a/fs/ceph/caps.c b/fs/ceph/caps.c
> index d3b9c9d5c1bd..a9ce858c37d0 100644
> --- a/fs/ceph/caps.c
> +++ b/fs/ceph/caps.c
> @@ -1058,6 +1058,11 @@ void __ceph_remove_cap(struct ceph_cap *cap, bool queue_release)
>  
>  	dout("__ceph_remove_cap %p from %p\n", cap, &ci->vfs_inode);
>  
> +	/* remove from inode list */
> +	rb_erase(&cap->ci_node, &ci->i_caps);
> +	if (ci->i_auth_cap == cap)
> +		ci->i_auth_cap = NULL;
> +
>  	/* remove from session list */
>  	spin_lock(&session->s_cap_lock);
>  	if (session->s_cap_iterator == cap) {
> @@ -1091,11 +1096,6 @@ void __ceph_remove_cap(struct ceph_cap *cap, bool queue_release)
>  
>  	spin_unlock(&session->s_cap_lock);
>  
> -	/* remove from inode list */
> -	rb_erase(&cap->ci_node, &ci->i_caps);
> -	if (ci->i_auth_cap == cap)
> -		ci->i_auth_cap = NULL;
> -
>  	if (removed)
>  		ceph_put_cap(mdsc, cap);
>  

Looks good. Merged with a slight modification to the comment:

+       /* remove from inode's cap rbtree, and clear auth cap */

Thanks!
-- 
Jeff Layton <jlayton@kernel.org>



* Re: [PATCH v2] ceph: Fix use-after-free in __ceph_remove_cap
  2019-10-27 12:31         ` Jeff Layton
@ 2019-10-28  9:18           ` Luis Henriques
From: Luis Henriques @ 2019-10-28  9:18 UTC
  To: Jeff Layton; +Cc: Sage Weil, Ilya Dryomov, Yan, Zheng, ceph-devel, linux-kernel

On Sun, Oct 27, 2019 at 08:31:49AM -0400, Jeff Layton wrote:
> On Fri, 2019-10-25 at 14:05 +0100, Luis Henriques wrote:
> > [... v2 patch snipped ...]
> 
> Looks good. Merged with a slight modification to the comment:
> 
> +       /* remove from inode's cap rbtree, and clear auth cap */
> 

Awesome, thanks.  Regarding CC: stable@, feel free to add that too.  I
guess it makes sense, even though this issue has apparently been there
for a while without any reports.  (Although UAFs can have non-obvious
side effects, I guess...)

Cheers,
--
Luís


