From: Jan Kara <jack@suse.cz>
To: Lukas Czerner <lczerner@redhat.com>
Cc: linux-fsdevel@vger.kernel.org, jack@suse.cz, eguan@redhat.com
Subject: Re: [PATCH] fs: Avoid invalidation in interrupt context in dio_complete()
Date: Tue, 17 Oct 2017 11:53:56 +0200 [thread overview]
Message-ID: <20171017095356.GQ9762@quack2.suse.cz> (raw)
In-Reply-To: <1508231484-30966-1-git-send-email-lczerner@redhat.com>
On Tue 17-10-17 11:11:24, Lukas Czerner wrote:
> Currently we try to defer completion of async DIO to the process context
> in case there are any mapped pages associated with the inode so that we
> can invalidate the pages when the IO completes. However the check is racy
> and the pages can be mapped afterwards. If this happens we might end up
> calling invalidate_inode_pages2_range() in dio_complete() in interrupt
> context which could sleep. This can be reproduced by generic/451.
>
> Fix this by passing the information whether we can or can't invalidate
> to the dio_complete(). Thanks Eryu Guan for reporting this and Jan Kara
> for suggesting a fix.
>
> Fixes: 332391a9935d ("fs: Fix page cache inconsistency when mixing buffered and AIO DIO")
> Signed-off-by: Lukas Czerner <lczerner@redhat.com>
> Reported-by: Eryu Guan <eguan@redhat.com>
Looks good to me. You can add:
Reviewed-by: Jan Kara <jack@suse.cz>
BTW, I guess you should CC someone who can merge the patch...
Honza
> ---
> fs/direct-io.c | 19 +++++++++++++------
> 1 file changed, 13 insertions(+), 6 deletions(-)
>
> diff --git a/fs/direct-io.c b/fs/direct-io.c
> index 96415c6..5632548 100644
> --- a/fs/direct-io.c
> +++ b/fs/direct-io.c
> @@ -45,6 +45,12 @@
> #define DIO_PAGES 64
>
> /*
> + * Flags for dio_complete()
> + */
> +#define DIO_COMPLETE_ASYNC 0x01 /* This is async IO */
> +#define DIO_COMPLETE_INVALIDATE 0x02 /* Can invalidate pages */
> +
> +/*
> * This code generally works in units of "dio_blocks". A dio_block is
> * somewhere between the hard sector size and the filesystem block size. it
> * is determined on a per-invocation basis. When talking to the filesystem
> @@ -225,7 +231,7 @@ static inline struct page *dio_get_page(struct dio *dio,
> * filesystems can use it to hold additional state between get_block calls and
> * dio_complete.
> */
> -static ssize_t dio_complete(struct dio *dio, ssize_t ret, bool is_async)
> +static ssize_t dio_complete(struct dio *dio, ssize_t ret, unsigned int flags)
> {
> loff_t offset = dio->iocb->ki_pos;
> ssize_t transferred = 0;
> @@ -266,7 +272,8 @@ static ssize_t dio_complete(struct dio *dio, ssize_t ret, bool is_async)
> * one is a pretty crazy thing to do, so we don't support it 100%. If
> * this invalidation fails, tough, the write still worked...
> */
> - if (ret > 0 && dio->op == REQ_OP_WRITE &&
> + if (flags & DIO_COMPLETE_INVALIDATE &&
> + ret > 0 && dio->op == REQ_OP_WRITE &&
> dio->inode->i_mapping->nrpages) {
> err = invalidate_inode_pages2_range(dio->inode->i_mapping,
> offset >> PAGE_SHIFT,
> @@ -285,7 +292,7 @@ static ssize_t dio_complete(struct dio *dio, ssize_t ret, bool is_async)
> if (!(dio->flags & DIO_SKIP_DIO_COUNT))
> inode_dio_end(dio->inode);
>
> - if (is_async) {
> + if (flags & DIO_COMPLETE_ASYNC) {
> /*
> * generic_write_sync expects ki_pos to have been updated
> * already, but the submission path only does this for
> @@ -306,7 +313,7 @@ static void dio_aio_complete_work(struct work_struct *work)
> {
> struct dio *dio = container_of(work, struct dio, complete_work);
>
> - dio_complete(dio, 0, true);
> + dio_complete(dio, 0, DIO_COMPLETE_ASYNC | DIO_COMPLETE_INVALIDATE);
> }
>
> static blk_status_t dio_bio_complete(struct dio *dio, struct bio *bio);
> @@ -348,7 +355,7 @@ static void dio_bio_end_aio(struct bio *bio)
> queue_work(dio->inode->i_sb->s_dio_done_wq,
> &dio->complete_work);
> } else {
> - dio_complete(dio, 0, true);
> + dio_complete(dio, 0, DIO_COMPLETE_ASYNC);
> }
> }
> }
> @@ -1360,7 +1367,7 @@ do_blockdev_direct_IO(struct kiocb *iocb, struct inode *inode,
> dio_await_completion(dio);
>
> if (drop_refcount(dio) == 0) {
> - retval = dio_complete(dio, retval, false);
> + retval = dio_complete(dio, retval, DIO_COMPLETE_INVALIDATE);
> } else
> BUG_ON(retval != -EIOCBQUEUED);
>
> --
> 2.7.5
>
--
Jan Kara <jack@suse.com>
SUSE Labs, CR
next prev parent reply other threads:[~2017-10-17 9:53 UTC|newest]
Thread overview: 5+ messages / expand[flat|nested] mbox.gz Atom feed top
2017-10-17 9:11 [PATCH] fs: Avoid invalidation in interrupt context in dio_complete() Lukas Czerner
2017-10-17 9:53 ` Jan Kara [this message]
2017-10-17 10:41 ` Eryu Guan
2017-10-17 10:44 ` Lukas Czerner
2017-10-17 14:43 ` Jens Axboe
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20171017095356.GQ9762@quack2.suse.cz \
--to=jack@suse.cz \
--cc=eguan@redhat.com \
--cc=lczerner@redhat.com \
--cc=linux-fsdevel@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).