From: Jan Kara <jack@suse.cz>
To: Lukas Czerner <lczerner@redhat.com>
Cc: Jeff Moyer <jmoyer@redhat.com>, Jan Kara <jack@suse.cz>,
	linux-fsdevel@vger.kernel.org, viro@zeniv.linux.org.uk,
	david@fromorbit.com
Subject: Re: [PATCH v6] fs: Fix page cache inconsistency when mixing buffered and AIO DIO
Date: Mon, 14 Aug 2017 11:43:31 +0200
Message-ID: <20170814094331.GD16353@quack2.suse.cz>
In-Reply-To: <20170811090301.kr5f4tgbaw2jaj7j@rh_laptop>

Hello,

On Fri 11-08-17 11:03:01, Lukas Czerner wrote:
> On Thu, Aug 10, 2017 at 10:22:47AM -0400, Jeff Moyer wrote:
> > Jan Kara <jack@suse.cz> writes:
> > 
> > > On Thu 10-08-17 14:59:57, Lukas Czerner wrote:
> > >> Currently when mixing buffered reads and asynchronous direct writes it
> > >> is possible to end up with the situation where we have stale data in the
> > >> page cache while the new data is already written to disk. This is
> > >> permanent until the affected pages are flushed away. Despite the fact
> > > >> that mixing buffered and direct IO is ill-advised, it does pose a threat
> > > >> to data integrity, is unexpected and should be fixed.
> > >> 
> > >> Fix this by deferring completion of asynchronous direct writes to a
> > >> process context in the case that there are mapped pages to be found in
> > >> the inode. Later before the completion in dio_complete() invalidate
> > >> the pages in question. This ensures that after the completion the pages
> > >> in the written area are either unmapped, or populated with up-to-date
> > >> data. Also do the same for the iomap case which uses
> > >> iomap_dio_complete() instead.
> > >> 
> > > >> This has a side effect of deferring the completion to a process context
> > > >> for every AIO DIO that happens on an inode that has pages mapped. However,
> > > >> since the consensus is that this is an ill-advised practice, the performance
> > > >> implication should not be a problem.
> > > >> 
> > > >> This was based on a proposal from Jeff Moyer, thanks!
> > >
> > > > It seems the invalidation can also be removed from
> > > > generic_file_direct_write(), can't it? It is redundant there, the same way
> > > > it was in the iomap code...
> > 
> > Yep, sure looks that way.
> 
> Hrm, ok. Technically speaking, generic_file_direct_write() does not have to
> eventually end up with dio_complete() being called. This will change the
> behaviour for those that implement dio differently. Looking at the users
> now, the vast majority will end up with dio_complete(), so maybe this is not
> a problem.

OK, so this seems to be a problem with 9p, fuse, nfs, and lustre.

> This is in contrast with iomap_dio_rw() which will end up calling
> iomap_dio_complete() so the situation is different there.
> 
> Maybe adding mapping->nrpages check would be better than outright
> removing it ?

OK, I agree we cannot just remove the invalidation. But shouldn't we rather
fix the above-mentioned filesystems? Otherwise they will keep having the
issues you are trying to fix. But for now I could live with keeping the
invalidation behind an nrpages check and adding a comment explaining why we
kept it there...
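
For illustration only, here is a minimal userspace sketch of the "keep the
invalidation behind an nrpages check" idea. Everything prefixed mock_ or MOCK_
is a hypothetical stand-in and not a kernel API; in the kernel the check would
be on mapping->nrpages, guarding the invalidate_inode_pages2_range() call at
the end of generic_file_direct_write(). The 4 KiB page size is an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MOCK_PAGE_SHIFT 12	/* assumption: 4 KiB pages */

/* Hypothetical stand-in for the kernel's struct address_space. */
struct mock_mapping {
	unsigned long nrpages;		/* pages currently in the page cache */
	unsigned long invalidations;	/* counter kept only for this sketch */
};

/*
 * Stand-in for invalidate_inode_pages2_range(). For simplicity it drops
 * every cached page instead of only those in [start, end].
 */
static int mock_invalidate_range(struct mock_mapping *m,
				 unsigned long start, unsigned long end)
{
	(void)start;
	(void)end;
	m->invalidations++;
	m->nrpages = 0;
	return 0;
}

/*
 * Tail of a direct write. Most filesystems would get the invalidation
 * from dio_complete(); some never reach it, so the invalidation stays
 * here but is skipped when nothing is cached.
 * Returns true if an invalidation was performed.
 */
static bool mock_direct_write_tail(struct mock_mapping *m,
				   unsigned long long pos, size_t written)
{
	assert(m != NULL);

	if (!m->nrpages || written == 0)
		return false;	/* nothing cached or nothing written */

	unsigned long start = (unsigned long)(pos >> MOCK_PAGE_SHIFT);
	unsigned long end =
		(unsigned long)((pos + written - 1) >> MOCK_PAGE_SHIFT);

	mock_invalidate_range(m, start, end);
	return true;
}
```

With no cached pages the function returns without touching the mapping, so
the common direct-IO-only case pays just the cost of the check.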

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR
