linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Andrea Arcangeli <aarcange@redhat.com>
To: Christoph Hellwig <hch@infradead.org>
Cc: syzbot <syzbot+5b1df0420c523b45a953@syzkaller.appspotmail.com>,
	bcrl@kvack.org, linux-aio@kvack.org,
	linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
	syzkaller-bugs@googlegroups.com, viro@zeniv.linux.org.uk,
	akpm@linux-foundation.org
Subject: Re: possible deadlock in aio_poll
Date: Wed, 17 Oct 2018 19:55:32 -0400	[thread overview]
Message-ID: <20181017235532.GC4568@redhat.com> (raw)
In-Reply-To: <20180910165317.GA3237@infradead.org>

On Mon, Sep 10, 2018 at 09:53:17AM -0700, Christoph Hellwig wrote:
> On Mon, Sep 10, 2018 at 12:41:05AM -0700, syzbot wrote:
> > =====================================================
> > WARNING: SOFTIRQ-safe -> SOFTIRQ-unsafe lock order detected
> > 4.19.0-rc2+ #229 Not tainted
> > -----------------------------------------------------
> > syz-executor2/9399 [HC0[0]:SC0[0]:HE0:SE1] is trying to acquire:
> > 00000000126506e0 (&ctx->fd_wqh){+.+.}, at: spin_lock
> > include/linux/spinlock.h:329 [inline]
> > 00000000126506e0 (&ctx->fd_wqh){+.+.}, at: aio_poll+0x760/0x1420
> > fs/aio.c:1747
> > 
> > and this task is already holding:
> > 000000002bed6bf6 (&(&ctx->ctx_lock)->rlock){..-.}, at: spin_lock_irq
> > include/linux/spinlock.h:354 [inline]
> > 000000002bed6bf6 (&(&ctx->ctx_lock)->rlock){..-.}, at: aio_poll+0x738/0x1420
> > fs/aio.c:1746
> > which would create a new lock dependency:
> >  (&(&ctx->ctx_lock)->rlock){..-.} -> (&ctx->fd_wqh){+.+.}
> 
> ctx->fd_wqh seems to only exist in userfaultfd, which indeed seems
> to do strange open coded waitqueue locking, and seems to fail to disable
> irqs.  Something like this should fix it:
> 
> diff --git a/fs/userfaultfd.c b/fs/userfaultfd.c
> index bfa0ec69f924..356d2b8568c1 100644
> --- a/fs/userfaultfd.c
> +++ b/fs/userfaultfd.c
> @@ -1026,7 +1026,7 @@ static ssize_t userfaultfd_ctx_read(struct userfaultfd_ctx *ctx, int no_wait,
>  	struct userfaultfd_ctx *fork_nctx = NULL;
>  
>  	/* always take the fd_wqh lock before the fault_pending_wqh lock */
> -	spin_lock(&ctx->fd_wqh.lock);
> +	spin_lock_irq(&ctx->fd_wqh.lock);
>  	__add_wait_queue(&ctx->fd_wqh, &wait);
>  	for (;;) {
>  		set_current_state(TASK_INTERRUPTIBLE);
> @@ -1112,13 +1112,13 @@ static ssize_t userfaultfd_ctx_read(struct userfaultfd_ctx *ctx, int no_wait,
>  			ret = -EAGAIN;
>  			break;
>  		}
> -		spin_unlock(&ctx->fd_wqh.lock);
> +		spin_unlock_irq(&ctx->fd_wqh.lock);
>  		schedule();
> -		spin_lock(&ctx->fd_wqh.lock);
> +		spin_lock_irq(&ctx->fd_wqh.lock);
>  	}
>  	__remove_wait_queue(&ctx->fd_wqh, &wait);
>  	__set_current_state(TASK_RUNNING);
> -	spin_unlock(&ctx->fd_wqh.lock);
> +	spin_unlock_irq(&ctx->fd_wqh.lock);
>  
>  	if (!ret && msg->event == UFFD_EVENT_FORK) {
>  		ret = resolve_userfault_fork(ctx, fork_nctx, msg);
> 

Reviewed-by: Andrea Arcangeli <aarcange@redhat.com>

This is lock inversion with userfaultfd_poll that takes the fq_wqh
after the irqsafe aio lock. And the aio lock can be taken from softirq
(so potentially for interrupts) leading to a potential lock inversion
deadlock.

I suggest to add a comment about the above in the code before the
first spin_lock_irq to explain why it needs to be _irq or it's not
obvious.

c430d1e848ff1240d126e79780f3c26208b8aed9 was just a false positive
instead.

I didn't comment on c430d1e848ff1240d126e79780f3c26208b8aed9 because I
was too busy with the speculative execution issues at the time and it
was just fine to drop the microoptimization, but while at it can we
look in how to add a spin_acquire or find a way to teach lockdep in
another way, so it's happy even if we restore the microoptimization?

If we do that, in addition we should also initialize the
ctx->fault_wqh spinlock to locked in the same patch (a spin_lock
during uffd ctx creation will do) to be sure nobody takes it as
further robustness feature against future modification (it gets more
self documenting too that it's not supposed to be taken and the
fault_pending_wq.lock has to be taken instead).

Thanks,
Andrea

  parent reply	other threads:[~2018-10-18  7:53 UTC|newest]

Thread overview: 7+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-09-10  7:41 possible deadlock in aio_poll syzbot
2018-09-10 16:53 ` Christoph Hellwig
2018-09-10 18:14   ` Miklos Szeredi
2018-09-11  6:33     ` Christoph Hellwig
2018-09-11  7:20       ` Miklos Szeredi
2018-10-17 23:55   ` Andrea Arcangeli [this message]
2018-10-27  6:16 ` syzbot

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20181017235532.GC4568@redhat.com \
    --to=aarcange@redhat.com \
    --cc=akpm@linux-foundation.org \
    --cc=bcrl@kvack.org \
    --cc=hch@infradead.org \
    --cc=linux-aio@kvack.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=syzbot+5b1df0420c523b45a953@syzkaller.appspotmail.com \
    --cc=syzkaller-bugs@googlegroups.com \
    --cc=viro@zeniv.linux.org.uk \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).