linux-arch.vger.kernel.org archive mirror
* [PATCH] aio: Add support for the POLLFREE
@ 2021-09-13 18:37 Ramji Jiyani
  2021-09-29 18:20 ` Jeff Moyer
From: Ramji Jiyani @ 2021-09-13 18:37 UTC
  To: Alexander Viro, Benjamin LaHaise, Arnd Bergmann
  Cc: Ramji Jiyani, kernel-team, linux-fsdevel, linux-aio,
	linux-kernel, linux-arch

Commit f5cb779ba163 ("ANDROID: binder: remove waitqueue when thread
exits.") fixed the use-after-free in eventpoll but aio still has the
same issue because it doesn't honor the POLLFREE flag.

Add support for the POLLFREE flag to force inline completion of the
iocb in aio_poll_wake(). A thread may use it to signal its exit and/or
to request cleanup while a poll request is pending. In that case,
aio_poll_wake() must make sure it keeps no reference to the queue entry
before returning from the wakeup, to avoid a possible use-after-free
via the poll_cancel() path.

The POLLFREE flag is no longer exclusive to epoll and is now shared
with aio, so remove the stale comment from poll.h to avoid confusion.
Also enclose the POLLFREE macro definition in parentheses to fix a
checkpatch error.

Signed-off-by: Ramji Jiyani <ramjiyani@google.com>
---
 fs/aio.c                        | 45 ++++++++++++++++++---------------
 include/uapi/asm-generic/poll.h |  2 +-
 2 files changed, 26 insertions(+), 21 deletions(-)

diff --git a/fs/aio.c b/fs/aio.c
index 51b08ab01dff..5d539c05df42 100644
--- a/fs/aio.c
+++ b/fs/aio.c
@@ -1674,6 +1674,7 @@ static int aio_poll_wake(struct wait_queue_entry *wait, unsigned mode, int sync,
 {
 	struct poll_iocb *req = container_of(wait, struct poll_iocb, wait);
 	struct aio_kiocb *iocb = container_of(req, struct aio_kiocb, poll);
+	struct kioctx *ctx = iocb->ki_ctx;
 	__poll_t mask = key_to_poll(key);
 	unsigned long flags;
 
@@ -1683,29 +1684,33 @@ static int aio_poll_wake(struct wait_queue_entry *wait, unsigned mode, int sync,
 
 	list_del_init(&req->wait.entry);
 
-	if (mask && spin_trylock_irqsave(&iocb->ki_ctx->ctx_lock, flags)) {
-		struct kioctx *ctx = iocb->ki_ctx;
+	/*
+	 * Use irqsave/irqrestore because not all filesystems (e.g. fuse)
+	 * call this function with IRQs disabled and because IRQs have to
+	 * be disabled before ctx_lock is obtained.
+	 */
+	if (mask & POLLFREE) {
+		/* Force complete iocb inline to remove refs to deleted entry */
+		spin_lock_irqsave(&ctx->ctx_lock, flags);
+	} else if (!(mask && spin_trylock_irqsave(&ctx->ctx_lock, flags))) {
+		/* Can't complete iocb inline; schedule for later */
+		schedule_work(&req->work);
+		return 1;
+	}
 
-		/*
-		 * Try to complete the iocb inline if we can. Use
-		 * irqsave/irqrestore because not all filesystems (e.g. fuse)
-		 * call this function with IRQs disabled and because IRQs
-		 * have to be disabled before ctx_lock is obtained.
-		 */
-		list_del(&iocb->ki_list);
-		iocb->ki_res.res = mangle_poll(mask);
-		req->done = true;
-		if (iocb->ki_eventfd && eventfd_signal_allowed()) {
-			iocb = NULL;
-			INIT_WORK(&req->work, aio_poll_put_work);
-			schedule_work(&req->work);
-		}
-		spin_unlock_irqrestore(&ctx->ctx_lock, flags);
-		if (iocb)
-			iocb_put(iocb);
-	} else {
+	/* complete iocb inline */
+	list_del(&iocb->ki_list);
+	iocb->ki_res.res = mangle_poll(mask);
+	req->done = true;
+	if (iocb->ki_eventfd && eventfd_signal_allowed()) {
+		iocb = NULL;
+		INIT_WORK(&req->work, aio_poll_put_work);
 		schedule_work(&req->work);
 	}
+	spin_unlock_irqrestore(&ctx->ctx_lock, flags);
+	if (iocb)
+		iocb_put(iocb);
+
 	return 1;
 }
 
diff --git a/include/uapi/asm-generic/poll.h b/include/uapi/asm-generic/poll.h
index 41b509f410bf..35b1b69af729 100644
--- a/include/uapi/asm-generic/poll.h
+++ b/include/uapi/asm-generic/poll.h
@@ -29,7 +29,7 @@
 #define POLLRDHUP       0x2000
 #endif
 
-#define POLLFREE	(__force __poll_t)0x4000	/* currently only for epoll */
+#define POLLFREE	((__force __poll_t)0x4000)
 
 #define POLL_BUSY_LOOP	(__force __poll_t)0x8000
 
-- 
2.33.0.309.g3052b89438-goog



* Re: [PATCH] aio: Add support for the POLLFREE
  2021-09-13 18:37 [PATCH] aio: Add support for the POLLFREE Ramji Jiyani
@ 2021-09-29 18:20 ` Jeff Moyer
  2021-09-29 19:50   ` Ramji Jiyani
From: Jeff Moyer @ 2021-09-29 18:20 UTC
  To: Ramji Jiyani
  Cc: Alexander Viro, Benjamin LaHaise, Arnd Bergmann, kernel-team,
	linux-fsdevel, linux-aio, linux-kernel, linux-arch, oleg, hch

Adding Oleg and Christoph.

Ramji Jiyani <ramjiyani@google.com> writes:

> Commit f5cb779ba163 ("ANDROID: binder: remove waitqueue when thread
> exits.") fixed the use-after-free in eventpoll but aio still has the
> same issue because it doesn't honor the POLLFREE flag.
>
> Add support for the POLLFREE flag to force inline completion of the
> iocb in aio_poll_wake(). A thread may use it to signal its exit and/or
> to request cleanup while a poll request is pending. In that case,
> aio_poll_wake() must make sure it keeps no reference to the queue entry
> before returning from the wakeup, to avoid a possible use-after-free
> via the poll_cancel() path.

Is this an in-kernel user?  Can you explain more about how or when this
happens?  Do you have a stack trace that shows the problem?  I'm not
sure this use of POLLFREE exactly follows the initial intention of the
flag, but hopefully Oleg can comment on that.

Thanks,
Jeff


* Re: [PATCH] aio: Add support for the POLLFREE
  2021-09-29 18:20 ` Jeff Moyer
@ 2021-09-29 19:50   ` Ramji Jiyani
From: Ramji Jiyani @ 2021-09-29 19:50 UTC
  To: Jeff Moyer
  Cc: Alexander Viro, Benjamin LaHaise, Arnd Bergmann, kernel-team,
	linux-fsdevel, linux-aio, linux-kernel, linux-arch, oleg, hch

Hi Jeff,

Thanks for the response.

On Wed, Sep 29, 2021 at 11:18 AM Jeff Moyer <jmoyer@redhat.com> wrote:
>
> Adding Oleg and Christoph.
>
> Ramji Jiyani <ramjiyani@google.com> writes:
>
> > Commit f5cb779ba163 ("ANDROID: binder: remove waitqueue when thread
> > exits.") fixed the use-after-free in eventpoll but aio still has the
> > same issue because it doesn't honor the POLLFREE flag.
> >
> > Add support for the POLLFREE flag to force inline completion of the
> > iocb in aio_poll_wake(). A thread may use it to signal its exit and/or
> > to request cleanup while a poll request is pending. In that case,
> > aio_poll_wake() must make sure it keeps no reference to the queue entry
> > before returning from the wakeup, to avoid a possible use-after-free
> > via the poll_cancel() path.
>
> Is this an in-kernel user?

Yes, the in-kernel user here is Android binder.

> Can you explain more about how or when this
> happens?  Do you have a stack trace that shows the problem?

This fixes a use-after-free that arises from the interaction between binder
threads and aio.

Suppose we poll a binder file through aio. The poll inserts a wait_queue_entry
into the wait_queue_head list associated with the file, and it takes a
reference on the binder file so the file itself can't go away. The poll then
returns to userspace while the poll operation remains pending (async I/O).

After starting the poll, userspace can issue BINDER_THREAD_EXIT to free the
wait queue head. The aio poll is still queued on it, though, so aio will
later touch the freed queue head: a use-after-free.
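
A minimal userspace sketch of that sequence (illustrative only: raw aio
syscalls, no error handling, and the details such as the /dev/binder path
are hypothetical, not a tested reproducer):

#include <fcntl.h>
#include <poll.h>
#include <unistd.h>
#include <sys/ioctl.h>
#include <sys/syscall.h>
#include <linux/aio_abi.h>
#include <linux/android/binder.h>

int main(void)
{
        aio_context_t ctx = 0;
        struct iocb cb = { 0 };
        struct iocb *cbs[1] = { &cb };
        int fd = open("/dev/binder", O_RDWR);

        syscall(__NR_io_setup, 1, &ctx);

        /* Queue an async poll: this links a wait_queue_entry into the
         * binder thread's waitqueue and returns to userspace with the
         * request still pending. */
        cb.aio_fildes = fd;
        cb.aio_lio_opcode = IOCB_CMD_POLL;
        cb.aio_buf = POLLIN;
        syscall(__NR_io_submit, ctx, 1, cbs);

        /* Free the binder thread state -- and with it the waitqueue
         * head that the pending aio poll is still linked to. */
        ioctl(fd, BINDER_THREAD_EXIT, 0);

        /* From here, aio_poll_cancel() (via io_cancel()) or a wakeup
         * dereferences the freed queue head: the UAF. */
        return 0;
}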

Here is the simplified kernel code for the sequence of events through which
the UAF occurs.

static int aio_poll_wake(struct wait_queue_entry *wait, unsigned mode, int sync,
                void *key)
{
        if (mask && !(mask & req->events))    // (1)
                return 0;

        list_del_init(&req->wait.entry);      // (2)

        if (mask && spin_trylock_irqsave(&iocb->ki_ctx->ctx_lock, flags)) {
                // complete request now
                list_del(&iocb->ki_list);     // (3)
                ...
        } else {
                // complete later
                schedule_work(&req->work);    // (4)
        }
        return 1;
}

At check (1), mask contains EPOLLHUP from binder, and req->events always
contains EPOLLHUP as well (aio_poll() sets it unconditionally), so we
proceed to (2), which unlinks the wait queue entry. Next, we complete the
request either right away (3) or later via scheduled work (4), depending
on whether the trylock of ctx_lock succeeds.

It turns out that UAF is possible in case (4), where the request remains linked
through ki_list. This allows io_cancel() to find and cancel the request:

static int aio_poll_cancel(struct kiocb *iocb)
{
        ...

        spin_lock(&req->head->lock);                // (5)
        WRITE_ONCE(req->cancelled, true);
        if (!list_empty(&req->wait.entry)) {
                list_del_init(&req->wait.entry);    // (6)
                schedule_work(&aiocb->poll.work);
        }
        spin_unlock(&req->head->lock);

        return 0;
}

The list_del_init() at (6) would be a use-after-free, but the wait queue
entry was already unlinked at (2), so that path is unreachable. However,
req->head at (5) also points to the freed queue head, so the spin_lock()
itself still triggers the UAF.

There was a similar issue with eventpoll. Unlike aio's aio_poll_wake(),
eventpoll's ep_poll_callback() honors the POLLFREE flag: it makes sure that
both the queue entry is unlinked and the cached queue head pointer is
cleared when POLLFREE is set.
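
Roughly, the end of ep_poll_callback() in fs/eventpoll.c does the following
(abridged sketch from memory of the then-current source; see the file
itself for the exact code and full comments):

        __poll_t pollflags = key_to_poll(key);

        ...

        if (pollflags & POLLFREE) {
                /* The waker is about to free the queue head, so unlink
                 * the entry ourselves... */
                list_del_init(&wait->entry);
                /* ...and clear the cached head pointer so that
                 * ep_remove_wait_queue() never touches the freed
                 * wait_queue_head. */
                smp_store_release(&ep_pwq_from_wait(wait)->whead, NULL);
        }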

This patch introduces the same POLLFREE handling in aio: when the flag is
set, the request must be completed inline, because the queue head reference
may no longer be valid by the time a worker thread would get to it.
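
For reference, the waker side in binder (from commit f5cb779ba163, roughly;
comments are mine) looks like:

        /* In binder_thread_release(), before the thread is freed: */
        if ((thread->looper & BINDER_LOOPER_STATE_POLL) &&
            waitqueue_active(&thread->wait)) {
                /* Tell pollers to drop every reference to this
                 * waitqueue before it goes away. */
                wake_up_poll(&thread->wait, EPOLLHUP | POLLFREE);
        }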

> I'm not sure this use of POLLFREE exactly follows the initial intention
> of the flag, but hopefully Oleg can comment on that.
>
> Thanks,
> Jeff

Thanks,
Ram

