linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Linus Torvalds <torvalds@linux-foundation.org>
To: Al Viro <viro@zeniv.linux.org.uk>, Eric Dumazet <eric.dumazet@gmail.com>
Cc: David Miller <davem@davemloft.net>,
	Jason Baron <jbaron@akamai.com>,
	kgraul@linux.ibm.com, ktkhai@virtuozzo.com,
	kyeongdon.kim@lge.com,
	Linux List Kernel Mailing <linux-kernel@vger.kernel.org>,
	Netdev <netdev@vger.kernel.org>,
	pabeni@redhat.com, syzkaller-bugs@googlegroups.com,
	xiyou.wangcong@gmail.com, Christoph Hellwig <hch@lst.de>
Subject: Re: [PATCH] aio: prevent the final fput() in the middle of vfs_poll() (Re: KASAN: use-after-free Read in unix_dgram_poll)
Date: Sun, 3 Mar 2019 11:44:33 -0800	[thread overview]
Message-ID: <CAHk-=wiQ2YNB+2FT+jbdcC6qfYqvS2kLLAZbmPE_7sLfUPwFJg@mail.gmail.com> (raw)
In-Reply-To: <20190303151846.GQ2217@ZenIV.linux.org.uk>

[-- Attachment #1: Type: text/plain, Size: 1912 bytes --]

On Sun, Mar 3, 2019 at 7:18 AM Al Viro <viro@zeniv.linux.org.uk> wrote:
>
> > Maybe unrelated to this bug, but...  What's to prevent a wakeup
> > that happens just after we'd been added to a waitqueue by ->poll()
> > triggering aio_poll_wake(), which gets to aio_poll_complete()
> > with its fput() *before* we'd reached the end of ->poll() instance?

I'm assuming you're talking about the second vfs_poll() in
aio_poll_complete_work()? The one we call before we check for
"rew->cancelled" properly under the spinlock?

> 1) io_submit(2) allocates aio_kiocb instance and passes it to aio_poll()
> 2) aio_poll() resolves the descriptor to struct file by
>         req->file = fget(iocb->aio_fildes)
[...]

So honestly, the whole filp handling in aio looks overly complicated to me.

All the different operations do that silly fget/fput() dance, although
aio_read/write at least tried to make a common helper function for
handling it.

But as far as I can tell, they *all* could do:

 - look up file in aio_submit() when allocating and creating the aio_kiocb

 - drop the filp in 'iocb_put()' (which happens whether things
complete successfully or not).

and we'd have avoided a lot of complexity, and we'd have avoided this bug.

Your patch fixes the poll() case, but it does so by just letting the
existing complexity remain, and adding a second fget/fput pair in the
poll logic.

It would seem like it would be much better to rip all the complexity
out entirely, and replace it with sane, simple and obviously correct
code.

Hmm?

In other words, why wouldn't something like the attached work instead?

TOTALLY UNTESTED! It builds, and it looks sane, but maybe I'm
overlooking some obvious problem with it. But doesn't it look nice to
see

 2 files changed, 41 insertions(+), 50 deletions(-)

with actual code reduction, and a fundamental simplification in
handling of the file pointer?

                 Linus

[-- Attachment #2: patch.diff --]
[-- Type: text/x-patch, Size: 7944 bytes --]

 fs/aio.c           | 83 ++++++++++++++++++++++--------------------------------
 include/linux/fs.h |  8 +++++-
 2 files changed, 41 insertions(+), 50 deletions(-)

diff --git a/fs/aio.c b/fs/aio.c
index aaaaf4d12c73..9eccd2ea6ca9 100644
--- a/fs/aio.c
+++ b/fs/aio.c
@@ -167,14 +167,18 @@ struct kioctx {
 	unsigned		id;
 };
 
+/*
+ * First field must be 'ki_filp' in all the
+ * iocb unions!
+ */
 struct fsync_iocb {
+	struct file		*ki_filp;
 	struct work_struct	work;
-	struct file		*file;
 	bool			datasync;
 };
 
 struct poll_iocb {
-	struct file		*file;
+	struct file		*ki_filp;
 	struct wait_queue_head	*head;
 	__poll_t		events;
 	bool			woken;
@@ -183,8 +187,15 @@ struct poll_iocb {
 	struct work_struct	work;
 };
 
+/*
+ * NOTE! Each of the iocb union members has "ki_filp" as
+ * the first entry in their struct definition. So you can
+ * access the file pointer either directly through this
+ * anonymous union, or through any of the sub-structs.
+ */
 struct aio_kiocb {
 	union {
+		struct file		*ki_filp;
 		struct kiocb		rw;
 		struct fsync_iocb	fsync;
 		struct poll_iocb	poll;
@@ -1060,6 +1071,7 @@ static inline void iocb_put(struct aio_kiocb *iocb)
 {
 	if (refcount_read(&iocb->ki_refcnt) == 0 ||
 	    refcount_dec_and_test(&iocb->ki_refcnt)) {
+		fput(iocb->ki_filp);
 		percpu_ref_put(&iocb->ki_ctx->reqs);
 		kmem_cache_free(kiocb_cachep, iocb);
 	}
@@ -1413,7 +1425,7 @@ static void aio_complete_rw(struct kiocb *kiocb, long res, long res2)
 		aio_remove_iocb(iocb);
 
 	if (kiocb->ki_flags & IOCB_WRITE) {
-		struct inode *inode = file_inode(kiocb->ki_filp);
+		struct inode *inode = file_inode(iocb->ki_filp);
 
 		/*
 		 * Tell lockdep we inherited freeze protection from submission
@@ -1421,10 +1433,9 @@ static void aio_complete_rw(struct kiocb *kiocb, long res, long res2)
 		 */
 		if (S_ISREG(inode->i_mode))
 			__sb_writers_acquired(inode->i_sb, SB_FREEZE_WRITE);
-		file_end_write(kiocb->ki_filp);
+		file_end_write(iocb->ki_filp);
 	}
 
-	fput(kiocb->ki_filp);
 	aio_complete(iocb, res, res2);
 }
 
@@ -1432,9 +1443,6 @@ static int aio_prep_rw(struct kiocb *req, const struct iocb *iocb)
 {
 	int ret;
 
-	req->ki_filp = fget(iocb->aio_fildes);
-	if (unlikely(!req->ki_filp))
-		return -EBADF;
 	req->ki_complete = aio_complete_rw;
 	req->private = NULL;
 	req->ki_pos = iocb->aio_offset;
@@ -1451,7 +1459,7 @@ static int aio_prep_rw(struct kiocb *req, const struct iocb *iocb)
 		ret = ioprio_check_cap(iocb->aio_reqprio);
 		if (ret) {
 			pr_debug("aio ioprio check cap error: %d\n", ret);
-			goto out_fput;
+			return ret;
 		}
 
 		req->ki_ioprio = iocb->aio_reqprio;
@@ -1460,14 +1468,10 @@ static int aio_prep_rw(struct kiocb *req, const struct iocb *iocb)
 
 	ret = kiocb_set_rw_flags(req, iocb->aio_rw_flags);
 	if (unlikely(ret))
-		goto out_fput;
+		return ret;
 
 	req->ki_flags &= ~IOCB_HIPRI; /* no one is going to poll for this I/O */
 	return 0;
-
-out_fput:
-	fput(req->ki_filp);
-	return ret;
 }
 
 static int aio_setup_rw(int rw, const struct iocb *iocb, struct iovec **iovec,
@@ -1521,24 +1525,19 @@ static ssize_t aio_read(struct kiocb *req, const struct iocb *iocb,
 	if (ret)
 		return ret;
 	file = req->ki_filp;
-
-	ret = -EBADF;
 	if (unlikely(!(file->f_mode & FMODE_READ)))
-		goto out_fput;
+		return -EBADF;
 	ret = -EINVAL;
 	if (unlikely(!file->f_op->read_iter))
-		goto out_fput;
+		return -EINVAL;
 
 	ret = aio_setup_rw(READ, iocb, &iovec, vectored, compat, &iter);
 	if (ret)
-		goto out_fput;
+		return ret;
 	ret = rw_verify_area(READ, file, &req->ki_pos, iov_iter_count(&iter));
 	if (!ret)
 		aio_rw_done(req, call_read_iter(file, req, &iter));
 	kfree(iovec);
-out_fput:
-	if (unlikely(ret))
-		fput(file);
 	return ret;
 }
 
@@ -1555,16 +1554,14 @@ static ssize_t aio_write(struct kiocb *req, const struct iocb *iocb,
 		return ret;
 	file = req->ki_filp;
 
-	ret = -EBADF;
 	if (unlikely(!(file->f_mode & FMODE_WRITE)))
-		goto out_fput;
-	ret = -EINVAL;
+		return -EBADF;
 	if (unlikely(!file->f_op->write_iter))
-		goto out_fput;
+		return -EINVAL;
 
 	ret = aio_setup_rw(WRITE, iocb, &iovec, vectored, compat, &iter);
 	if (ret)
-		goto out_fput;
+		return ret;
 	ret = rw_verify_area(WRITE, file, &req->ki_pos, iov_iter_count(&iter));
 	if (!ret) {
 		/*
@@ -1582,9 +1579,6 @@ static ssize_t aio_write(struct kiocb *req, const struct iocb *iocb,
 		aio_rw_done(req, call_write_iter(file, req, &iter));
 	}
 	kfree(iovec);
-out_fput:
-	if (unlikely(ret))
-		fput(file);
 	return ret;
 }
 
@@ -1593,8 +1587,7 @@ static void aio_fsync_work(struct work_struct *work)
 	struct fsync_iocb *req = container_of(work, struct fsync_iocb, work);
 	int ret;
 
-	ret = vfs_fsync(req->file, req->datasync);
-	fput(req->file);
+	ret = vfs_fsync(req->ki_filp, req->datasync);
 	aio_complete(container_of(req, struct aio_kiocb, fsync), ret, 0);
 }
 
@@ -1605,13 +1598,8 @@ static int aio_fsync(struct fsync_iocb *req, const struct iocb *iocb,
 			iocb->aio_rw_flags))
 		return -EINVAL;
 
-	req->file = fget(iocb->aio_fildes);
-	if (unlikely(!req->file))
-		return -EBADF;
-	if (unlikely(!req->file->f_op->fsync)) {
-		fput(req->file);
+	if (unlikely(!req->ki_filp->f_op->fsync))
 		return -EINVAL;
-	}
 
 	req->datasync = datasync;
 	INIT_WORK(&req->work, aio_fsync_work);
@@ -1621,10 +1609,7 @@ static int aio_fsync(struct fsync_iocb *req, const struct iocb *iocb,
 
 static inline void aio_poll_complete(struct aio_kiocb *iocb, __poll_t mask)
 {
-	struct file *file = iocb->poll.file;
-
 	aio_complete(iocb, mangle_poll(mask), 0);
-	fput(file);
 }
 
 static void aio_poll_complete_work(struct work_struct *work)
@@ -1636,7 +1621,7 @@ static void aio_poll_complete_work(struct work_struct *work)
 	__poll_t mask = 0;
 
 	if (!READ_ONCE(req->cancelled))
-		mask = vfs_poll(req->file, &pt) & req->events;
+		mask = vfs_poll(iocb->ki_filp, &pt) & req->events;
 
 	/*
 	 * Note that ->ki_cancel callers also delete iocb from active_reqs after
@@ -1743,9 +1728,6 @@ static ssize_t aio_poll(struct aio_kiocb *aiocb, const struct iocb *iocb)
 
 	INIT_WORK(&req->work, aio_poll_complete_work);
 	req->events = demangle_poll(iocb->aio_buf) | EPOLLERR | EPOLLHUP;
-	req->file = fget(iocb->aio_fildes);
-	if (unlikely(!req->file))
-		return -EBADF;
 
 	req->head = NULL;
 	req->woken = false;
@@ -1763,7 +1745,7 @@ static ssize_t aio_poll(struct aio_kiocb *aiocb, const struct iocb *iocb)
 	/* one for removal from waitqueue, one for this function */
 	refcount_set(&aiocb->ki_refcnt, 2);
 
-	mask = vfs_poll(req->file, &apt.pt) & req->events;
+	mask = vfs_poll(aiocb->ki_filp, &apt.pt) & req->events;
 	if (unlikely(!req->head)) {
 		/* we did not manage to set up a waitqueue, done */
 		goto out;
@@ -1788,10 +1770,8 @@ static ssize_t aio_poll(struct aio_kiocb *aiocb, const struct iocb *iocb)
 	spin_unlock_irq(&ctx->ctx_lock);
 
 out:
-	if (unlikely(apt.error)) {
-		fput(req->file);
+	if (unlikely(apt.error))
 		return apt.error;
-	}
 
 	if (mask)
 		aio_poll_complete(aiocb, mask);
@@ -1829,6 +1809,11 @@ static int __io_submit_one(struct kioctx *ctx, const struct iocb *iocb,
 	if (unlikely(!req))
 		goto out_put_reqs_available;
 
+	req->ki_filp = fget(iocb->aio_fildes);
+	ret = -EBADF;
+	if (unlikely(!req->ki_filp))
+		goto out_put_req;
+
 	if (iocb->aio_flags & IOCB_FLAG_RESFD) {
 		/*
 		 * If the IOCB_FLAG_RESFD flag of aio_flags is set, get an
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 29d8e2cfed0e..fd423fec8d83 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -304,13 +304,19 @@ enum rw_hint {
 
 struct kiocb {
 	struct file		*ki_filp;
+
+	/* The 'ki_filp' pointer is shared in a union for aio */
+	randomized_struct_fields_start
+
 	loff_t			ki_pos;
 	void (*ki_complete)(struct kiocb *iocb, long ret, long ret2);
 	void			*private;
 	int			ki_flags;
 	u16			ki_hint;
 	u16			ki_ioprio; /* See linux/ioprio.h */
-} __randomize_layout;
+
+	randomized_struct_fields_end
+};
 
 static inline bool is_sync_kiocb(struct kiocb *kiocb)
 {

  parent reply	other threads:[~2019-03-03 19:45 UTC|newest]

Thread overview: 53+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-03-03 10:22 KASAN: use-after-free Read in unix_dgram_poll syzbot
2019-03-03 13:55 ` Al Viro
2019-03-03 15:18   ` [PATCH] aio: prevent the final fput() in the middle of vfs_poll() (Re: KASAN: use-after-free Read in unix_dgram_poll) Al Viro
2019-03-03 18:37     ` Eric Dumazet
2019-03-03 19:44     ` Linus Torvalds [this message]
2019-03-03 20:13       ` Linus Torvalds
2019-03-03 20:30       ` Al Viro
2019-03-03 22:23         ` Linus Torvalds
2019-03-04  2:36           ` Al Viro
2019-03-04 21:22             ` Linus Torvalds
2019-03-07  0:03               ` [PATCH 1/8] aio: make sure file is pinned Al Viro
2019-03-07  0:03                 ` [PATCH 2/8] aio_poll_wake(): don't set ->woken if we ignore the wakeup Al Viro
2019-03-07  2:18                   ` Al Viro
2019-03-08 11:16                     ` zhengbin (A)
2019-03-07  0:03                 ` [PATCH 3/8] aio_poll(): sanitize the logics after vfs_poll(), get rid of leak on error Al Viro
2019-03-07  2:11                   ` zhengbin (A)
2019-03-07  0:03                 ` [PATCH 4/8] aio_poll(): get rid of weird refcounting Al Viro
2019-03-07  0:03                 ` [PATCH 5/8] make aio_read()/aio_write() return int Al Viro
2019-03-07  0:03                 ` [PATCH 6/8] move dropping ->ki_eventfd into iocb_put() Al Viro
2019-03-07  0:03                 ` [PATCH 7/8] deal with get_reqs_available() in aio_get_req() itself Al Viro
2019-03-07  0:03                 ` [PATCH 8/8] aio: move sanity checks and request allocation to io_submit_one() Al Viro
2019-03-07  0:23                 ` [PATCH 1/8] aio: make sure file is pinned Linus Torvalds
2019-03-07  0:41                   ` Al Viro
2019-03-07  0:48                     ` Al Viro
2019-03-07  1:20                       ` Al Viro
2019-03-07  1:30                         ` Linus Torvalds
2019-03-08  3:36                           ` Al Viro
2019-03-08 15:50                             ` Christoph Hellwig
2019-03-10  7:06                             ` Al Viro
2019-03-10  7:08                               ` [PATCH 1/8] pin iocb through aio Al Viro
2019-03-10  7:08                                 ` [PATCH 2/8] keep io_event in aio_kiocb Al Viro
2019-03-11 19:43                                   ` Christoph Hellwig
2019-03-11 21:17                                     ` Al Viro
2019-03-10  7:08                                 ` [PATCH 3/8] aio: store event at final iocb_put() Al Viro
2019-03-11 19:44                                   ` Christoph Hellwig
2019-03-11 21:13                                     ` Al Viro
2019-03-11 22:52                                       ` Al Viro
2019-03-10  7:08                                 ` [PATCH 4/8] Fix aio_poll() races Al Viro
2019-03-11 19:58                                   ` Christoph Hellwig
2019-03-11 21:06                                     ` Al Viro
2019-03-12 19:18                                       ` Christoph Hellwig
2019-03-10  7:08                                 ` [PATCH 5/8] make aio_read()/aio_write() return int Al Viro
2019-03-11 19:44                                   ` Christoph Hellwig
2019-03-10  7:08                                 ` [PATCH 6/8] move dropping ->ki_eventfd into iocb_destroy() Al Viro
2019-03-11 19:46                                   ` Christoph Hellwig
2019-03-10  7:08                                 ` [PATCH 7/8] deal with get_reqs_available() in aio_get_req() itself Al Viro
2019-03-11 19:46                                   ` Christoph Hellwig
2019-03-10  7:08                                 ` [PATCH 8/8] aio: move sanity checks and request allocation to io_submit_one() Al Viro
2019-03-11 19:48                                   ` Christoph Hellwig
2019-03-11 21:12                                     ` Al Viro
2019-03-11 19:41                                 ` [PATCH 1/8] pin iocb through aio Christoph Hellwig
2019-03-11 19:41                               ` [PATCH 1/8] aio: make sure file is pinned Christoph Hellwig
2019-03-04  7:53     ` [PATCH] aio: prevent the final fput() in the middle of vfs_poll() (Re: KASAN: use-after-free Read in unix_dgram_poll) Dmitry Vyukov

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAHk-=wiQ2YNB+2FT+jbdcC6qfYqvS2kLLAZbmPE_7sLfUPwFJg@mail.gmail.com' \
    --to=torvalds@linux-foundation.org \
    --cc=davem@davemloft.net \
    --cc=eric.dumazet@gmail.com \
    --cc=hch@lst.de \
    --cc=jbaron@akamai.com \
    --cc=kgraul@linux.ibm.com \
    --cc=ktkhai@virtuozzo.com \
    --cc=kyeongdon.kim@lge.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=netdev@vger.kernel.org \
    --cc=pabeni@redhat.com \
    --cc=syzkaller-bugs@googlegroups.com \
    --cc=viro@zeniv.linux.org.uk \
    --cc=xiyou.wangcong@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).