From: Jan Kara <jack@suse.cz>
To: Dave Chinner <david@fromorbit.com>
Cc: Jan Kara <jack@suse.cz>,
linux-fsdevel@vger.kernel.org, linux-ext4@vger.kernel.org,
linux-xfs@vger.kernel.org, Ted Tso <tytso@mit.edu>,
Christoph Hellwig <hch@infradead.org>,
Amir Goldstein <amir73il@gmail.com>
Subject: Re: [PATCH 2/7] mm: Protect operations adding pages to page cache with i_mapping_lock
Date: Wed, 14 Apr 2021 14:23:19 +0200
Message-ID: <20210414122319.GD31323@quack2.suse.cz>
In-Reply-To: <20210414000113.GG63242@dread.disaster.area>
On Wed 14-04-21 10:01:13, Dave Chinner wrote:
> On Tue, Apr 13, 2021 at 01:28:46PM +0200, Jan Kara wrote:
> > *
> > * ->mmap_lock
> > * ->i_mmap_rwsem
> > @@ -85,7 +86,8 @@
> > * ->i_pages lock (arch-dependent flush_dcache_mmap_lock)
> > *
> > * ->mmap_lock
> > - * ->lock_page (access_process_vm)
> > + * ->i_mapping_sem (filemap_fault)
> > + * ->lock_page (filemap_fault, access_process_vm)
> > *
> > * ->i_rwsem (generic_perform_write)
> > * ->mmap_lock (fault_in_pages_readable->do_page_fault)
> > @@ -2276,16 +2278,28 @@ static int filemap_update_page(struct kiocb *iocb,
> > {
> > int error;
> >
> > + if (iocb->ki_flags & IOCB_NOWAIT) {
> > + if (!down_read_trylock(&mapping->host->i_mapping_sem))
> > + return -EAGAIN;
> > + } else {
> > + down_read(&mapping->host->i_mapping_sem);
> > + }
>
> We really need a lock primitive for this. The number of times this
> exact lock pattern is being replicated all through the IO path is
> getting out of hand.
>
> static inline bool
> down_read_try_or_lock(struct rw_semaphore *sem, bool try)
> {
> 	if (try) {
> 		if (!down_read_trylock(sem))
> 			return false;
> 	} else {
> 		down_read(sem);
> 	}
> 	return true;
> }
>
> and the callers become:
>
> if (!down_read_try_or_lock(sem, (iocb->ki_flags & IOCB_NOWAIT)))
> return -EAGAIN;
>
> We can do the same with mutex_try_or_lock(), down_try_or_lock(), etc
> and we don't need to rely on cargo cult knowledge to propagate this
> pattern anymore. Because I'm betting relatively few people actually
> know why the code is written this way because the only place it is
> documented is in an XFS commit message....
>
> Doing this is a separate cleanup, though, and not something that
> needs to be done in this patchset.
Yep, good idea but let's do it in a separate patch set.
> > index c5b0457415be..ac5bb50b3a4c 100644
> > --- a/mm/readahead.c
> > +++ b/mm/readahead.c
> > @@ -192,6 +192,7 @@ void page_cache_ra_unbounded(struct readahead_control *ractl,
> > */
> > unsigned int nofs = memalloc_nofs_save();
> >
> > + down_read(&mapping->host->i_mapping_sem);
> > /*
> > * Preallocate as many pages as we will need.
> > */
>
> I can't say I'm a great fan of having the mapping reach back up to
> the host to lock the host. This seems the wrong way around to me
> given that most of the locking in the IO path is in "host locks
> mapping" and "mapping locks internal mapping structures" order...
>
> I also come back to the naming confusion here, in that when we look
> at this in long hand from the inode perspective, this chain actually
> looks like:
>
> lock(inode->i_mapping->inode->i_mapping_sem)
>
> i.e. the mapping is reaching back up outside its scope to lock
> itself against other inode->i_mapping operations. Smells of layering
> violations to me.
>
> So, next question: should this truncate semaphore actually be part
> of the address space, not the inode? This patch is actually moving
> the page fault serialisation from the inode into the address space
> operations when page faults and page cache operations are done, so
> maybe the lock should also make that move? That would help clear up
> the naming problem, because now we can name it based around what it
> serialises in the address space, not the address space as a whole...
I think that moving the lock to address_space makes some sense, although the
lock actually protects consistency of inode->i_mapping->i_pages with
whatever the filesystem has in its file_offset->disk_block mapping
structures (which are generally associated with the inode). So it is not
only about inode->i_mapping contents, but I agree that struct address_space
is probably a slightly more logical place than struct inode.

Regarding the name: How about i_pages_rwsem? The lock protects
invalidation of mapping->i_pages and needs to be held until insertion of
pages into i_pages is safe again...
Honza
--
Jan Kara <jack@suse.com>
SUSE Labs, CR