Linux-ext4 Archive on lore.kernel.org
 help / color / Atom feed
From: Jan Kara <jack@suse.cz>
To: Christoph Hellwig <hch@infradead.org>
Cc: Zhang Yi <yi.zhang@huawei.com>,
	linux-ext4@vger.kernel.org, linux-fsdevel@vger.kernel.org,
	tytso@mit.edu, adilger.kernel@dilger.ca, jack@suse.cz,
	yukuai3@huawei.com
Subject: Re: [RFC PATCH v2 7/7] ext4: fix race between blkdev_releasepage() and ext4_put_super()
Date: Wed, 21 Apr 2021 15:46:34 +0200
Message-ID: <20210421134634.GT8706@quack2.suse.cz> (raw)
In-Reply-To: <20210420130841.GA3618564@infradead.org>

On Tue 20-04-21 14:08:41, Christoph Hellwig wrote:
> On Fri, Apr 16, 2021 at 04:00:48PM +0800, Zhang Yi wrote:
> > Now, we use already use "if (bdev->bd_super)" to prevent call into
> > ->bdev_try_to_free_page unless the super is alive, and the problem is
> > bd_super becomes NULL concurrently after this check. So, IIUC, I think it's
> > the same to switch to check the superblock is active or not. The acvive
> > flag also could becomes inactive (raced by umount) after we call into
> > bdev_try_to_free_page().
> 
> Indeed.
> 
> > In order to close this race, One solution is introduce a lock to synchronize
> > the active state between kill_block_super() and blkdev_releasepage(), but
> > the releasing page process have to try to acquire this lock in
> > blkdev_releasepage() for each page, and the umount process still need to wait
> > until the page release if some one invoke into ->bdev_try_to_free_page().
> > I think this solution may affect performace and is not a good way.
> > Think about it in depth, use percpu refcount seems have the smallest
> > performance effect on blkdev_releasepage().
> > 
> > If you don't like the refcount, maybe we could add synchronize_rcu_expedited()
> > in ext4_put_super(), it also could prevent this race. Any suggestions?
> 
> I really don't like to put a lot of overhead into the core VFS and block
> device code.  ext4/jbd does not own the block device inode and really
> has no business controlling releasepage for it.  I suspect the right
> answer might be to simply revert the commit that added this hook.

Indeed, after 12 years in kernel .bdev_try_to_free_page is implemented only
by ext4. So maybe it is not that important? I agree with Zhang and
Christoph that getting the lifetime rules sorted out will be hairy and it
is questionable, whether it is worth the additional pages we can reclaim.
Ted, do you remember what was the original motivation for this?

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

  reply index

Thread overview: 24+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-04-14 13:47 [RFC PATCH v2 0/7] ext4, jbd2: fix 3 issues about bdev_try_to_free_page() Zhang Yi
2021-04-14 13:47 ` [RFC PATCH v2 1/7] jbd2: remove the out label in __jbd2_journal_remove_checkpoint() Zhang Yi
2021-04-21 10:01   ` Jan Kara
2021-04-14 13:47 ` [RFC PATCH v2 2/7] jbd2: ensure abort the journal if detect IO error when writing original buffer back Zhang Yi
2021-04-21 13:20   ` Jan Kara
2021-04-14 13:47 ` [RFC PATCH v2 3/7] jbd2: don't abort the journal when freeing buffers Zhang Yi
2021-04-21 13:23   ` Jan Kara
2021-04-14 13:47 ` [RFC PATCH v2 4/7] jbd2: do not free buffers in jbd2_journal_try_to_free_buffers() Zhang Yi
2021-04-15 14:46   ` Christoph Hellwig
2021-04-14 13:47 ` [RFC PATCH v2 5/7] ext4: use RCU to protect accessing superblock in blkdev_releasepage() Zhang Yi
2021-04-15 14:48   ` Christoph Hellwig
2021-04-14 13:47 ` [RFC PATCH v2 6/7] fs: introduce a usage count into the superblock Zhang Yi
2021-04-15 14:40   ` Christoph Hellwig
2021-04-16  8:00     ` Zhang Yi
2021-04-14 13:47 ` [RFC PATCH v2 7/7] ext4: fix race between blkdev_releasepage() and ext4_put_super() Zhang Yi
2021-04-15 14:52   ` Christoph Hellwig
2021-04-16  8:00     ` Zhang Yi
2021-04-20 13:08       ` Christoph Hellwig
2021-04-21 13:46         ` Jan Kara [this message]
2021-04-21 16:57           ` Theodore Ts'o
2021-04-22  9:04             ` Jan Kara
2021-04-23 11:39               ` Zhang Yi
2021-04-23 16:06                 ` Jan Kara
2021-04-23 14:40               ` Theodore Ts'o

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20210421134634.GT8706@quack2.suse.cz \
    --to=jack@suse.cz \
    --cc=adilger.kernel@dilger.ca \
    --cc=hch@infradead.org \
    --cc=linux-ext4@vger.kernel.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=tytso@mit.edu \
    --cc=yi.zhang@huawei.com \
    --cc=yukuai3@huawei.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Linux-ext4 Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/linux-ext4/0 linux-ext4/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 linux-ext4 linux-ext4/ https://lore.kernel.org/linux-ext4 \
		linux-ext4@vger.kernel.org
	public-inbox-index linux-ext4

Example config snippet for mirrors

Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.kernel.vger.linux-ext4


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git