linux-ext4.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Jan Kara <jack@suse.cz>
To: Matthew Wilcox <willy@infradead.org>
Cc: Jan Kara <jack@suse.cz>,
	linux-fsdevel@vger.kernel.org, linux-ext4@vger.kernel.org
Subject: Re: [PATCH 0/3 RFC] fs: Hole punch vs page cache filling races
Date: Fri, 22 Jan 2021 15:32:32 +0100	[thread overview]
Message-ID: <20210122143232.GA1175@quack2.suse.cz> (raw)
In-Reply-To: <20210121192755.GC4127393@casper.infradead.org>

On Thu 21-01-21 19:27:55, Matthew Wilcox wrote:
> On Wed, Jan 20, 2021 at 05:06:08PM +0100, Jan Kara wrote:
> > Hello,
> > 
> > Amir has reported [1] a that ext4 has a potential issues when reads can race
> > with hole punching possibly exposing stale data from freed blocks or even
> > corrupting filesystem when stale mapping data gets used for writeout. The
> > problem is that during hole punching, new page cache pages can get instantiated
> > in a punched range after truncate_inode_pages() has run but before the
> > filesystem removes blocks from the file.  In principle any filesystem
> > implementing hole punching thus needs to implement a mechanism to block
> > instantiating page cache pages during hole punching to avoid this race. This is
> > further complicated by the fact that there are multiple places that can
> > instantiate pages in page cache.  We can have regular read(2) or page fault
> > doing this but fadvise(2) or madvise(2) can also result in reading in page
> > cache pages through force_page_cache_readahead().
> 
> Doesn't this indicate that we're doing truncates in the wrong order?
> ie first we should deallocate the blocks, then we should free the page
> cache that was caching the contents of those blocks.  We'd need to
> make sure those pages in the page cache don't get written back to disc
> (either by taking pages in the page cache off the lru list or having
> the filesystem handle writeback of pages to a freed extent as a no-op).

Well, it depends on how much you wish to complicate the matters :).
Filesystems have metadata information attached to pages (e.g. buffer
heads), once you are removing blocks from a file, this information is
becoming stale. So it makes perfect sense to first evict page cache to
remove this metadata caching information and then remove blocks from
on-disk structures.

You can obviously try to do it the other way around - i.e., first remove
blocks from on-disk structures and then remove the cached information from
the page cache. But then you have to make sure stale cached information
isn't used until everything is in sync. So whichever way you slice it, you
have information in two places, you need to keep it in sync and you need
some synchronization between different updaters of this information
in both places so that they cannot get those two places out of sync...

TLDR: I don't see much benefit in switching the ordering.

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

  reply	other threads:[~2021-01-22 14:43 UTC|newest]

Thread overview: 17+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-01-20 16:06 [PATCH 0/3 RFC] fs: Hole punch vs page cache filling races Jan Kara
2021-01-20 16:06 ` [PATCH 1/3] mm: Do not pass iter into generic_file_buffered_read_get_pages() Jan Kara
2021-01-20 16:18   ` Christoph Hellwig
2021-01-20 16:06 ` [PATCH 2/3] mm: Provide address_space operation for filling pages for read Jan Kara
2021-01-20 16:20   ` Christoph Hellwig
2021-01-20 17:27     ` Jan Kara
2021-01-20 17:28       ` Christoph Hellwig
2021-01-20 17:56         ` Matthew Wilcox
2021-04-02 21:17     ` Kent Overstreet
2021-04-06 12:21       ` Jan Kara
2021-01-20 16:06 ` [PATCH 3/3] ext4: Fix stale data exposure when read races with hole punch Jan Kara
2021-01-21 19:27 ` [PATCH 0/3 RFC] fs: Hole punch vs page cache filling races Matthew Wilcox
2021-01-22 14:32   ` Jan Kara [this message]
2021-04-02 19:34 ` Theodore Ts'o
2021-04-06 12:17   ` Jan Kara
2021-04-06 16:45     ` Theodore Ts'o
2021-04-06 16:50       ` Theodore Ts'o

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20210122143232.GA1175@quack2.suse.cz \
    --to=jack@suse.cz \
    --cc=linux-ext4@vger.kernel.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=willy@infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).