Date: Tue, 15 Aug 2017 14:47:20 +0200
From: Lukas Czerner
To: Jan Kara
Cc: Jeff Moyer, linux-fsdevel@vger.kernel.org, viro@zeniv.linux.org.uk,
	david@fromorbit.com
Subject: Re: [PATCH v6] fs: Fix page cache inconsistency when mixing buffered and AIO DIO
Message-ID: <20170815124720.ueee4mbhfuffsknp@rh_laptop>
References: <1500463692-4982-1-git-send-email-lczerner@redhat.com>
	<1502369997-15665-1-git-send-email-lczerner@redhat.com>
	<20170810135640.GB14925@quack2.suse.cz>
	<20170811090301.kr5f4tgbaw2jaj7j@rh_laptop>
	<20170814094331.GD16353@quack2.suse.cz>
In-Reply-To: <20170814094331.GD16353@quack2.suse.cz>

On Mon, Aug 14, 2017 at 11:43:31AM +0200, Jan Kara wrote:
> Hello,
>
> On Fri 11-08-17 11:03:01, Lukas Czerner wrote:
> > On Thu, Aug 10, 2017 at 10:22:47AM -0400, Jeff Moyer wrote:
> > > Jan Kara writes:
> > >
> > > > On Thu 10-08-17 14:59:57, Lukas Czerner wrote:
> > > >> Currently when mixing buffered reads and asynchronous direct writes it
> > > >> is possible to end up with the situation where we have stale data in the
> > > >> page cache while the new data is already written to disk. This is
> > > >> permanent until the affected pages are flushed away. Despite the fact
> > > >> that mixing buffered and direct IO is ill-advised it does pose a threat
> > > >> to data integrity, is unexpected and should be fixed.
> > > >>
> > > >> Fix this by deferring completion of asynchronous direct writes to a
> > > >> process context in the case that there are mapped pages to be found in
> > > >> the inode. Later before the completion in dio_complete() invalidate
> > > >> the pages in question. This ensures that after the completion the pages
> > > >> in the written area are either unmapped, or populated with up-to-date
> > > >> data. Also do the same for the iomap case which uses
> > > >> iomap_dio_complete() instead.
> > > >>
> > > >> This has a side effect of deferring the completion to a process context
> > > >> for every AIO DIO that happens on inode that has pages mapped. However
> > > >> since the consensus is that this is ill-advised practice the performance
> > > >> implication should not be a problem.
> > > >>
> > > >> This was based on proposal from Jeff Moyer, thanks!
> > > >
> > > > It seems the invalidation can be also removed from
> > > > generic_file_direct_write(), can't it? It is duplicit there the same way as
> > > > it was in the iomap code...
> > >
> > > Yep, sure looks that way.
> >
> > Hrm, ok. Technically speaking generic_file_direct_write() does not have to
> > eventually end up with dio_complete() being called. This will change the
> > behaviour for those that implement dio differently. Looking at the users
> > now, vast majority will end up with dio_complete() so maybe this is not
> > a problem.
>
> OK, so this seems to be the problem with 9p, fuse, nfs, lustre.
>
> > This is in contrast with iomap_dio_rw() which will end up calling
> > iomap_dio_complete() so the situation is different there.
> >
> > Maybe adding mapping->nrpages check would be better than outright
> > removing it ?
>
> OK, I agree we cannot just remove the invalidation. But shouldn't we rather
> fix the above mentioned filesystems? Otherwise they will keep having issues
> you are trying to fix? But for now I could live with keeping the
> invalidation behind nrpages check and adding a comment why we kept it
> there...

Right, I'd rather have closure on this never-ending patch. The rest of the
filesystems can be fixed later.

-Lukas

>
> 								Honza
> --
> Jan Kara
> SUSE Labs, CR
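
For reference, the "invalidation behind nrpages check" agreed on above can be
modelled roughly as below. This is only a user-space sketch, not the patch
itself: struct mapping_model, model_invalidate_range() and
model_post_direct_write() are made-up names for illustration. nrpages stands
in for mapping->nrpages, and the invalidate helper stands in for the real page
cache invalidation (invalidate_inode_pages2_range() in the kernel), which is
not reproduced here.

```c
/*
 * Hypothetical stand-in for struct address_space. Only the field
 * discussed in this thread is modelled.
 */
struct mapping_model {
	unsigned long nrpages;          /* pages currently in the page cache */
	unsigned long invalidate_calls; /* bookkeeping for this sketch only */
};

/*
 * Hypothetical stand-in for invalidating the written range. The model
 * simply drops every cached page and counts the call.
 */
static void model_invalidate_range(struct mapping_model *m)
{
	m->invalidate_calls++;
	m->nrpages = 0;
}

/*
 * Called after a direct write has finished. Filesystems whose DIO path
 * never reaches dio_complete() still get their stale pages dropped here,
 * but the work is skipped when nothing is cached.
 *
 * Returns 1 if an invalidation was performed, 0 if it was skipped.
 */
static int model_post_direct_write(struct mapping_model *m)
{
	if (m->nrpages == 0)
		return 0;

	model_invalidate_range(m);
	return 1;
}
```

The point of the check is that the common case (no cached pages) costs a
single comparison, while the filesystems listed above keep the invalidation
they rely on until they are fixed properly.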