From: Jan Kara <jack@suse.cz>
To: Matthew Wilcox <willy@infradead.org>
Cc: Jan Kara <jack@suse.cz>, Jan Kara <jack@suse.com>,
tytso@mit.edu, Andreas Dilger <adilger.kernel@dilger.ca>,
linux-ext4@vger.kernel.org, linux-fsdevel@vger.kernel.org
Subject: Re: [PATCH 3/3] ext4: Use generic_quota_read()
Date: Wed, 8 Jun 2022 16:21:38 +0200 [thread overview]
Message-ID: <20220608142138.7nejka3yobgsdmd7@quack3.lan> (raw)
In-Reply-To: <Yp/+fSoHgPIhiHQR@casper.infradead.org>
On Wed 08-06-22 02:42:21, Matthew Wilcox wrote:
> On Mon, Jun 06, 2022 at 10:38:14AM +0200, Jan Kara wrote:
> > On Sun 05-06-22 15:38:15, Matthew Wilcox (Oracle) wrote:
> > > The comment about the page cache is rather stale; the buffer cache will
> > > read into the page cache if the buffer isn't present, and the page cache
> > > will not take any locks if the page is present.
> > >
> > > Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
> >
> > This will not work for couple of reasons, see below. BTW, I don't think the
> > comment about page cache was stale (but lacking details I admit ;). As far
> > as I remember (and it was really many years ago - definitely pre-git era)
> > the problem was (mainly on the write side) that before current state of the
> > code we were using calls like vfs_read() / vfs_write() to get quota
> > information and that was indeed prone to deadlocks.
>
> Ah yes, vfs_write() might indeed be prone to deadlocks. Particularly
> if we're doing it under the dq_mutex and any memory allocation might
> have recursed into reclaim ;-)
>
> I actually found the commit in linux-fullhistory. Changelog for
> context:
>
> commit b72debd66a6ed
> Author: Jan Kara <jack@suse.cz>
> Date: Mon Jan 3 04:12:24 2005 -0800
>
> [PATCH] Fix of quota deadlock on pagelock: quota core
\o/ to you history searching skills :)
> > > @@ -6924,20 +6882,21 @@ static ssize_t ext4_quota_write(struct super_block *sb, int type,
> > > return -EIO;
> > > }
> > >
> > > - do {
> > > - bh = ext4_bread(handle, inode, blk,
> > > - EXT4_GET_BLOCKS_CREATE |
> > > - EXT4_GET_BLOCKS_METADATA_NOFAIL);
> > > - } while (PTR_ERR(bh) == -ENOSPC &&
> > > - ext4_should_retry_alloc(inode->i_sb, &retries));
> > > - if (IS_ERR(bh))
> > > - return PTR_ERR(bh);
> > > - if (!bh)
> > > + folio = read_mapping_folio(inode->i_mapping, off / PAGE_SIZE, NULL);
> > > + if (IS_ERR(folio))
> > > + return PTR_ERR(folio);
> > > + head = folio_buffers(folio);
> > > + if (!head)
> > > + head = alloc_page_buffers(&folio->page, sb->s_blocksize, false);
> > > + if (!head)
> > > goto out;
> > > + bh = head;
> > > + while ((bh_offset(bh) + sb->s_blocksize) <= (off % PAGE_SIZE))
> > > + bh = bh->b_this_page;
> >
> > We miss proper handling of blocks that are currently beyond i_size
> > (we are extending the quota file), plus we also miss any mapping of buffers
> > to appropriate disk blocks here...
> >
> > It could be all fixed by replicating what we do in ext4_write_begin() but
> > I'm not quite convinced using inode's page cache is really worth it...
>
> Ah, yes, write_begin. Of course that's what I should have used.
>
> I'm looking at this from the point of view of removing buffer_heads
> where possible. Of course, it's not possible for ext4 while the journal
> relies on buffer_heads, but if we can steer filesystems away from using
> sb_bread() (or equivalents), I think that's a good thing.
Well, ext4 uses sb_bread() (sb_getblk()) for all its metadata so quota
code, which is rather well localized, is the least of your worries I'm
afraid ;).
Honza
--
Jan Kara <jack@suse.com>
SUSE Labs, CR
next prev parent reply other threads:[~2022-06-08 14:21 UTC|newest]
Thread overview: 14+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-06-05 14:38 [PATCH 0/3] Cache quota files in the page cache Matthew Wilcox (Oracle)
2022-06-05 14:38 ` [PATCH 1/3] quota: Prevent memory allocation recursion while holding dq_lock Matthew Wilcox (Oracle)
2022-06-06 8:03 ` Jan Kara
2022-06-06 12:42 ` Matthew Wilcox
2022-06-06 13:08 ` Jan Kara
2022-06-05 14:38 ` [PATCH 2/3] quota: Support using the page cache for quota files Matthew Wilcox (Oracle)
2022-06-06 2:53 ` kernel test robot
2022-06-06 4:05 ` kernel test robot
2022-06-06 4:36 ` kernel test robot
2022-06-05 14:38 ` [PATCH 3/3] ext4: Use generic_quota_read() Matthew Wilcox (Oracle)
2022-06-06 8:38 ` Jan Kara
2022-06-08 1:42 ` Matthew Wilcox
2022-06-08 14:21 ` Jan Kara [this message]
2022-06-13 7:59 ` [ext4] fa96490369: WARNING:at_fs/ext4/inode.c:#ext4_invalidate_folio kernel test robot
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20220608142138.7nejka3yobgsdmd7@quack3.lan \
--to=jack@suse.cz \
--cc=adilger.kernel@dilger.ca \
--cc=jack@suse.com \
--cc=linux-ext4@vger.kernel.org \
--cc=linux-fsdevel@vger.kernel.org \
--cc=tytso@mit.edu \
--cc=willy@infradead.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).