From: Matthew Wilcox <willy@infradead.org>
To: Jan Kara <jack@suse.cz>
Cc: linux-fsdevel@vger.kernel.org, Christoph Hellwig <hch@infradead.org>,
	Dave Chinner <david@fromorbit.com>, ceph-devel@vger.kernel.org,
	Chao Yu <yuchao0@huawei.com>, Damien Le Moal <damien.lemoal@wdc.com>,
	"Darrick J. Wong" <darrick.wong@oracle.com>,
	Jaegeuk Kim <jaegeuk@kernel.org>, Jeff Layton <jlayton@kernel.org>,
	Johannes Thumshirn <jth@kernel.org>, linux-cifs@vger.kernel.org,
	linux-ext4@vger.kernel.org, linux-f2fs-devel@lists.sourceforge.net,
	linux-mm@kvack.org, linux-xfs@vger.kernel.org,
	Miklos Szeredi <miklos@szeredi.hu>, Steve French <sfrench@samba.org>,
	Ted Tso <tytso@mit.edu>
Subject: Re: [PATCH 03/11] mm: Protect operations adding pages to page cache with invalidate_lock
Date: Wed, 12 May 2021 15:40:21 +0100
Message-ID: <YJvo1bGG1tG+gtgC@casper.infradead.org>
In-Reply-To: <20210512134631.4053-3-jack@suse.cz>

On Wed, May 12, 2021 at 03:46:11PM +0200, Jan Kara wrote:
> Currently, serializing operations such as page fault, read, or readahead
> against hole punching is rather difficult. The basic race scheme is
> like:
>
> fallocate(FALLOC_FL_PUNCH_HOLE)                 read / fault / ..
>   truncate_inode_pages_range()
>                                                   <create pages in page
>                                                    cache here>
>   <update fs block mapping and free blocks>
>
> Now the problem is in this way read / page fault / readahead can
> instantiate pages in page cache with potentially stale data (if blocks
> get quickly reused). Avoiding this race is not simple - page locks do
> not work because we want to make sure there are *no* pages in given
> range. inode->i_rwsem does not work because page fault happens under
> mmap_sem which ranks below inode->i_rwsem. Also using it for reads makes
> the performance for mixed read-write workloads suffer.
>
> So create a new rw_semaphore in the address_space - invalidate_lock -
> that protects adding of pages to page cache for page faults / reads /
> readahead.

Remind me (or, rather, add to the documentation) why we have to hold the
invalidate_lock during the call to readpage / readahead, and we don't just
hold it around the call to add_to_page_cache / add_to_page_cache_locked /
add_to_page_cache_lru ?  I appreciate that ->readpages is still going to
suck, but we're down to just three implementations of ->readpages now
(9p, cifs & nfs).

Also, could I trouble you to run the comments through 'fmt' (or
equivalent)?  It's easier to read if you're not kissing right up on 80
columns.

> +++ b/fs/inode.c
> @@ -190,6 +190,9 @@ int inode_init_always(struct super_block *sb, struct inode *inode)
>  		mapping_set_gfp_mask(mapping, GFP_HIGHUSER_MOVABLE);
>  	mapping->private_data = NULL;
>  	mapping->writeback_index = 0;
> +	init_rwsem(&mapping->invalidate_lock);
> +	lockdep_set_class(&mapping->invalidate_lock,
> +			  &sb->s_type->invalidate_lock_key);

Why not:

	__init_rwsem(&mapping->invalidate_lock, "mapping.invalidate_lock",
			&sb->s_type->invalidate_lock_key);