Linux-Fsdevel Archive on lore.kernel.org
 help / color / Atom feed
From: Miklos Szeredi <miklos@szeredi.hu>
To: Bart Van Assche <bvanassche@acm.org>
Cc: Christoph Hellwig <hch@lst.de>,
	Alexander Viro <viro@zeniv.linux.org.uk>,
	linux-fsdevel@vger.kernel.org, linux-aio <linux-aio@kvack.org>,
	syzbot <syzkaller@googlegroups.com>,
	Avi Kivity <avi@scylladb.com>, Eric Dumazet <edumazet@google.com>,
	stable <stable@vger.kernel.org>
Subject: Re: [PATCH] aio: Fix locking in aio_poll()
Date: Wed, 6 Feb 2019 09:36:26 +0100
Message-ID: <20190206083626.GA19821@veci.piliscsaba.redhat.com> (raw)
In-Reply-To: <1549414384.34241.77.camel@acm.org>

On Tue, Feb 05, 2019 at 04:53:04PM -0800, Bart Van Assche wrote:
> On Tue, 2019-02-05 at 09:12 +0100, Miklos Szeredi wrote:
> > Not all variants of the waitqueue interface require irqs to be
> > disabled, and fuse has nothing whatsoever to do with irqs, so there's
> > no sane reason to disable them.
> > 
> > Also, AFAICS, the fuse device does not support asynchronous IO.  I
> > just don't get what this is about...
> 
> Hi Miklos,
> 
> Could this be what happens?
> 
> aio_poll() calls vfs_poll()
>   vfs_poll() calls fuse_dev_poll()
>     fuse_dev_poll() calls poll_wait(file, &fiq->waitq, wait)
>       poll_wait() calls aio_poll_queue_proc(file, &fiq->waitq, wait)
>         aio_poll_queue_proc() stores &fiq->waitq in pt->iocb->poll.head
> aio_poll() calls spin_lock_irq(&ctx->ctx_lock)
> aio_poll() calls spin_lock(&req->head->lock) (req == &pt->iocb->poll).
> 
> I think the lockdep complaint is about the FUSE fiq->waitq lock not being
> IRQ-safe and about aio_poll() creating a dependency between an IRQ-safe lock
> (ctx->ctx_lock) and a lock that is not IRQ-safe (fiq->waitq).

Yep.  And is AIO doing anything in irq context?  If so, why?

Would it make sense to just disable AIO on file descriptors where it makes zero
sense?

E.g. this patch:

Thanks,
Miklos

---
diff --git a/fs/aio.c b/fs/aio.c
index b906ff70c90f..6f5edd28a83c 100644
--- a/fs/aio.c
+++ b/fs/aio.c
@@ -1435,6 +1435,9 @@ static int aio_prep_rw(struct kiocb *req, const struct iocb *iocb)
 	req->ki_filp = fget(iocb->aio_fildes);
 	if (unlikely(!req->ki_filp))
 		return -EBADF;
+	ret = -EINVAL;
+	if (req->ki_filp->f_mode & FMODE_NOAIO)
+		goto out_fput;
 	req->ki_complete = aio_complete_rw;
 	req->ki_pos = iocb->aio_offset;
 	req->ki_flags = iocb_flags(req->ki_filp);
@@ -1607,7 +1610,8 @@ static int aio_fsync(struct fsync_iocb *req, const struct iocb *iocb,
 	req->file = fget(iocb->aio_fildes);
 	if (unlikely(!req->file))
 		return -EBADF;
-	if (unlikely(!req->file->f_op->fsync)) {
+	if (unlikely(req->file->f_mode & FMODE_NOAIO) ||
+	    unlikely(!req->file->f_op->fsync)) {
 		fput(req->file);
 		return -EINVAL;
 	}
@@ -1755,6 +1759,9 @@ static ssize_t aio_poll(struct aio_kiocb *aiocb, const struct iocb *iocb)
 	apt.iocb = aiocb;
 	apt.error = -EINVAL; /* same as no support for IOCB_CMD_POLL */
 
+	if (req->file->f_mode & FMODE_NOAIO)
+		goto out;
+
 	/* initialized the list so that we can do list_empty checks */
 	INIT_LIST_HEAD(&req->wait.entry);
 	init_waitqueue_func_entry(&req->wait, aio_poll_wake);
diff --git a/fs/fuse/dev.c b/fs/fuse/dev.c
index a5e516a40e7a..c00aa39ec7f0 100644
--- a/fs/fuse/dev.c
+++ b/fs/fuse/dev.c
@@ -1397,6 +1397,7 @@ static int fuse_dev_open(struct inode *inode, struct file *file)
 	 * keep track of whether the file has been mounted already.
 	 */
 	file->private_data = NULL;
+	file->f_mode |= FMODE_NOAIO;
 	return 0;
 }
 
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 29d8e2cfed0e..7cda153baad3 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -162,6 +162,9 @@ typedef int (dio_iodone_t)(struct kiocb *iocb, loff_t offset,
 /* File does not contribute to nr_files count */
 #define FMODE_NOACCOUNT	((__force fmode_t)0x20000000)
 
+/* File does not support AIO ops */
+#define FMODE_NOAIO	((__force fmode_t)0x40000000)
+
 /*
  * Flag for rw_copy_check_uvector and compat_rw_copy_check_uvector
  * that indicates that they should check the contents of the iovec are

  reply index

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-02-04 17:45 Bart Van Assche
2019-02-04 17:49 ` Christoph Hellwig
2019-02-05  8:12   ` Miklos Szeredi
2019-02-06  0:53     ` Bart Van Assche
2019-02-06  8:36       ` Miklos Szeredi [this message]
2019-02-06 13:47         ` Christoph Hellwig
2019-02-06 14:31           ` Miklos Szeredi
2019-02-06 13:39       ` Christoph Hellwig
2019-02-09  0:59 Bart Van Assche
2019-02-12  7:56 ` Christoph Hellwig
2019-02-21 22:28   ` Bart Van Assche
2019-02-22  3:17     ` Al Viro

Reply instructions:

You may reply publically to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20190206083626.GA19821@veci.piliscsaba.redhat.com \
    --to=miklos@szeredi.hu \
    --cc=avi@scylladb.com \
    --cc=bvanassche@acm.org \
    --cc=edumazet@google.com \
    --cc=hch@lst.de \
    --cc=linux-aio@kvack.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=stable@vger.kernel.org \
    --cc=syzkaller@googlegroups.com \
    --cc=viro@zeniv.linux.org.uk \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Linux-Fsdevel Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/linux-fsdevel/0 linux-fsdevel/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 linux-fsdevel linux-fsdevel/ https://lore.kernel.org/linux-fsdevel \
		linux-fsdevel@vger.kernel.org linux-fsdevel@archiver.kernel.org
	public-inbox-index linux-fsdevel


Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.kernel.vger.linux-fsdevel


AGPL code for this site: git clone https://public-inbox.org/ public-inbox