Linux-ext4 Archive on lore.kernel.org
From: Jan Kara <jack@suse.cz>
To: Zhang Yi <yi.zhang@huawei.com>
Cc: linux-ext4@vger.kernel.org, tytso@mit.edu,
	adilger.kernel@dilger.ca, jack@suse.cz, yukuai3@huawei.com
Subject: Re: [PATCH 1/3] jbd2: protect buffers release with j_checkpoint_mutex
Date: Thu, 8 Apr 2021 15:45:51 +0200
Message-ID: <20210408134551.GC3271@quack2.suse.cz>
In-Reply-To: <20210408113618.1033785-2-yi.zhang@huawei.com>

On Thu 08-04-21 19:36:16, Zhang Yi wrote:
> There is a race between jbd2_journal_try_to_free_buffers() and
> jbd2_journal_destroy(), so jbd2_log_do_checkpoint() may still
> fail to detect the buffer write I/O error flag, which can lead to
> filesystem inconsistency.
> 
> jbd2_journal_try_to_free_buffers()     ext4_put_super()
>                                         jbd2_journal_destroy()
>   __jbd2_journal_remove_checkpoint()
>   detect buffer write error              jbd2_log_do_checkpoint()
>                                          jbd2_cleanup_journal_tail()
>                                            <--- lead to inconsistency
>   jbd2_journal_abort()
> 
> Fix this issue by adding j_checkpoint_mutex to protect journal buffer
> release in jbd2_journal_try_to_free_buffers().
> 
> Signed-off-by: Zhang Yi <yi.zhang@huawei.com>

Thanks for the patch, Zhang. I agree with your problem analysis, but I don't
think the solution is correct:

>  	J_ASSERT(PageLocked(page));
>  
> +	mutex_lock(&journal->j_checkpoint_mutex);

We cannot grab j_checkpoint_mutex inside jbd2_journal_try_to_free_buffers()
(or even ext4_releasepage()) because that function is called with the page
lock held, and the page lock ranks below the checkpoint mutex. Generally,
page locks are acquired within a transaction, and thus all locks required
to start a transaction (j_checkpoint_mutex is one of them) rank above the
page lock.
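
The ordering rule above can be illustrated with a small userspace sketch.
It is only a model, not kernel code: the rank values, struct mock_task, and
the mock_acquire()/mock_release() helpers are all hypothetical names
invented for illustration. A lock may be taken only if it ranks strictly
below (is nested inside) every lock already held, so taking the checkpoint
mutex while holding a page lock is rejected.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical ranks: a smaller value means "ranks above" (taken first). */
enum mock_rank {
	MOCK_CHECKPOINT_MUTEX = 1,	/* needed to start a transaction */
	MOCK_PAGE_LOCK = 2,		/* taken within a transaction */
};

#define MOCK_MAX_HELD 8

/* Hypothetical per-task record of held locks, innermost last. */
struct mock_task {
	enum mock_rank held[MOCK_MAX_HELD];
	int depth;
};

/* Record the lock and return true only if the ordering is respected. */
static bool mock_acquire(struct mock_task *t, enum mock_rank r)
{
	if (t->depth == MOCK_MAX_HELD)
		return false;
	/* Taking an outer lock while holding an inner one inverts the order. */
	if (t->depth > 0 && t->held[t->depth - 1] >= r)
		return false;
	t->held[t->depth++] = r;
	return true;
}

/* Drop the innermost held lock. */
static void mock_release(struct mock_task *t)
{
	assert(t->depth > 0);
	t->depth--;
}
```

In this model the reclaim path corresponds to a task that already holds
MOCK_PAGE_LOCK, so a later mock_acquire() of MOCK_CHECKPOINT_MUTEX fails,
which is the inversion described above.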

Also, even if the lock ordering were OK, grabbing j_checkpoint_mutex for
every page from memory reclaim just to close this rare race seems like
performance overkill.

What we seem to need is a quick way of marking the journal as "IO error
occurred" in __journal_try_to_free_buffer() before actually removing the
buffer from the checkpoint list. Perhaps this marking could even happen
already in __jbd2_journal_remove_checkpoint() and we can reuse it in
jbd2_log_do_checkpoint() for IO error handling as well... And then once we
are in a safer context, we can do:

	if (!is_journal_aborted(journal) && journal_io_error_happened(journal))
		jbd2_journal_abort(...)
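
A minimal userspace sketch of the "mark now, abort later" idea, under
stated assumptions: struct mock_journal and every mock_*() helper below are
hypothetical stand-ins, not jbd2 code, and a C11 atomic flag is just one
possible way to do the cheap marking. The point is only that the marking
step takes no sleeping lock, while the abort happens from a later check.

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for journal_t; not the kernel structure. */
struct mock_journal {
	atomic_bool io_error;	/* set from the buffer-release path */
	bool aborted;		/* stand-in for the aborted state */
	int abort_errno;	/* error recorded at abort time */
};

/* Cheap marking step: takes no lock, so it can run under a page lock. */
static void mock_mark_io_error(struct mock_journal *j)
{
	atomic_store(&j->io_error, true);
}

static bool mock_io_error_happened(struct mock_journal *j)
{
	return atomic_load(&j->io_error);
}

static bool mock_is_aborted(const struct mock_journal *j)
{
	return j->aborted;
}

static void mock_abort(struct mock_journal *j, int err)
{
	assert(err < 0);	/* errors are negative errno values */
	j->aborted = true;
	j->abort_errno = err;
}

/*
 * Run later from a safer context. Returns true if this call performed
 * the abort, false if there was nothing to do.
 */
static bool mock_check_deferred_error(struct mock_journal *j)
{
	if (!mock_is_aborted(j) && mock_io_error_happened(j)) {
		mock_abort(j, -EIO);
		return true;
	}
	return false;
}
```

In this sketch a second call to mock_check_deferred_error() is a no-op
because the aborted state is checked first, mirroring the
!is_journal_aborted() test in the snippet above.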

								Honza

>  	head = page_buffers(page);
>  	bh = head;
>  	do {
> @@ -2163,6 +2164,7 @@ int jbd2_journal_try_to_free_buffers(journal_t *journal, struct page *page)
>  	if (has_write_io_error)
>  		jbd2_journal_abort(journal, -EIO);
>  
> +	mutex_unlock(&journal->j_checkpoint_mutex);
>  	return ret;
>  }
>  
> -- 
> 2.25.4
> 
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR


Thread overview: 9+ messages
2021-04-08 11:36 [PATCH 0/3] ext4: fix two issue about bdev_try_to_free_page() Zhang Yi
2021-04-08 11:36 ` [PATCH 1/3] jbd2: protect buffers release with j_checkpoint_mutex Zhang Yi
2021-04-08 13:45   ` Jan Kara [this message]
2021-04-08 11:36 ` [PATCH 2/3] jbd2: do not free buffers in jbd2_journal_try_to_free_buffers() Zhang Yi
2021-04-08 11:36 ` [PATCH 3/3] ext4: add rcu to prevent use after free when umount filesystem Zhang Yi
2021-04-08 13:51   ` kernel test robot
2021-04-08 13:56   ` Jan Kara
2021-04-08 14:38     ` Zhang Yi
2021-04-08 14:53       ` Jan Kara
