From: "libaokun (A)" <libaokun1@huawei.com>
To: <linux-ext4@vger.kernel.org>
Cc: <tytso@mit.edu>, <adilger.kernel@dilger.ca>, <jack@suse.cz>,
	<linux-kernel@vger.kernel.org>, <yi.zhang@huawei.com>,
	<yebin10@huawei.com>, <yukuai3@huawei.com>,
	<stable@vger.kernel.org>, Hulk Robot <hulkci@huawei.com>,
	Baokun Li <libaokun1@huawei.com>
Subject: Re: [PATCH v3] ext4: fix race condition between ext4_write and ext4_convert_inline_data
Date: Mon, 9 May 2022 09:25:11 +0800
Message-ID: <83ed1d6f-84f9-0e47-ddb1-b8cafc12338a@huawei.com>
In-Reply-To: <20220428134031.4153381-1-libaokun1@huawei.com>

A gentle ping.

On 2022/4/28 21:40, Baokun Li wrote:
> Hulk Robot reported a BUG_ON:
>   ==================================================================
>   EXT4-fs error (device loop3): ext4_mb_generate_buddy:805: group 0,
>   block bitmap and bg descriptor inconsistent: 25 vs 31513 free clusters
>   kernel BUG at fs/ext4/ext4_jbd2.c:53!
>   invalid opcode: 0000 [#1] SMP KASAN PTI
>   CPU: 0 PID: 25371 Comm: syz-executor.3 Not tainted 5.10.0+ #1
>   RIP: 0010:ext4_put_nojournal fs/ext4/ext4_jbd2.c:53 [inline]
>   RIP: 0010:__ext4_journal_stop+0x10e/0x110 fs/ext4/ext4_jbd2.c:116
>   [...]
>   Call Trace:
>    ext4_write_inline_data_end+0x59a/0x730 fs/ext4/inline.c:795
>    generic_perform_write+0x279/0x3c0 mm/filemap.c:3344
>    ext4_buffered_write_iter+0x2e3/0x3d0 fs/ext4/file.c:270
>    ext4_file_write_iter+0x30a/0x11c0 fs/ext4/file.c:520
>    do_iter_readv_writev+0x339/0x3c0 fs/read_write.c:732
>    do_iter_write+0x107/0x430 fs/read_write.c:861
>    vfs_writev fs/read_write.c:934 [inline]
>    do_pwritev+0x1e5/0x380 fs/read_write.c:1031
>   [...]
>   ==================================================================
>
> The above issue may happen as follows:
>             cpu1                     cpu2
> __________________________|__________________________
> do_pwritev
>   vfs_writev
>    do_iter_write
>     ext4_file_write_iter
>      ext4_buffered_write_iter
>       generic_perform_write
>        ext4_da_write_begin
>                             vfs_fallocate
>                              ext4_fallocate
>                               ext4_convert_inline_data
>                                ext4_convert_inline_data_nolock
>                                 ext4_destroy_inline_data_nolock
>                                  clear EXT4_STATE_MAY_INLINE_DATA
>                                 ext4_map_blocks
>                                  ext4_ext_map_blocks
>                                   ext4_mb_new_blocks
>                                    ext4_mb_regular_allocator
>                                     ext4_mb_good_group_nolock
>                                      ext4_mb_init_group
>                                       ext4_mb_init_cache
>                                        ext4_mb_generate_buddy  --> error
>         ext4_test_inode_state(inode, EXT4_STATE_MAY_INLINE_DATA)
>                                  ext4_restore_inline_data
>                                   set EXT4_STATE_MAY_INLINE_DATA
>         ext4_block_write_begin
>        ext4_da_write_end
>         ext4_test_inode_state(inode, EXT4_STATE_MAY_INLINE_DATA)
>         ext4_write_inline_data_end
>          handle=NULL
>          ext4_journal_stop(handle)
>           __ext4_journal_stop
>            ext4_put_nojournal(handle)
>             ref_cnt = (unsigned long)handle
>             BUG_ON(ref_cnt == 0)  ---> BUG_ON
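
For readers following the last step of the trace: a minimal userspace
sketch of why a NULL handle trips the check, assuming the no-journal
"handle" is just a reference count stored in a pointer-sized value, as
the quoted "ref_cnt = (unsigned long)handle" suggests. All model_* names
are hypothetical, the get side is my assumption (modeled as the mirror
image of the put side), and the BUG_ON is replaced by an error return so
the program does not abort. This is not the kernel code.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Opaque stand-in for handle_t; it is never dereferenced. */
typedef struct model_handle model_handle_t;

/* Stand-in for the per-task slot that remembers the current handle. */
static model_handle_t *model_journal_info;

/* Assumed get side: bump the count and store it as a fake pointer. */
static model_handle_t *model_get_nojournal(void)
{
    uintptr_t ref_cnt = (uintptr_t)model_journal_info;

    ref_cnt++;
    model_journal_info = (model_handle_t *)ref_cnt;
    return model_journal_info;
}

/*
 * Put side, following the two lines quoted in the trace.
 * Returns 0 on success and -1 where the kernel would hit BUG_ON.
 */
static int model_put_nojournal(model_handle_t *handle)
{
    uintptr_t ref_cnt = (uintptr_t)handle;

    if (ref_cnt == 0)   /* handle == NULL: nothing was ever "started" */
        return -1;
    ref_cnt--;
    model_journal_info = (model_handle_t *)ref_cnt;
    return 0;
}

#ifdef NOJOURNAL_DEMO_MAIN
int main(void)
{
    /* Balanced get/put succeeds. */
    model_handle_t *h = model_get_nojournal();
    int rc = model_put_nojournal(h);

    assert(rc == 0);
    (void)rc;

    /* A put without a matching get (handle == NULL) is rejected. */
    printf("put(NULL) returned %d\n", model_put_nojournal(NULL));
    return 0;
}
#endif
```

Build with -DNOJOURNAL_DEMO_MAIN to run the small driver. In the trace,
the write_begin side did not take the inline-data path, so no handle was
started, yet write_end saw the flag set again and stopped a handle that
was NULL.
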
>
> ext4_convert_inline_data() holds xattr_sem, while generic_perform_write()
> runs under i_rwsem. Because these are two different locks, the two paths
> do not exclude each other and can run concurrently.
>
> To fix this, take inode_lock() around ext4_convert_inline_data().
> At the same time, move the ext4_convert_inline_data() call ahead of
> ext4_punch_hole() and remove the equivalent handling from ext4_punch_hole().
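
The locking argument above can be sketched in userspace. This is only an
illustration under stated assumptions: the two kernel rw_semaphores are
modeled as plain pthread mutexes, and every name here (model_xattr_sem,
model_i_rwsem, writer_try_enter, writer_can_overlap) is hypothetical. A
"writer" thread tries to take the modeled i_rwsem while the "convert"
side holds whichever lock is passed in.

```c
#include <assert.h>
#include <pthread.h>
#include <stdio.h>

/* Plain mutexes standing in for the two kernel locks named in the patch. */
static pthread_mutex_t model_xattr_sem = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t model_i_rwsem = PTHREAD_MUTEX_INITIALIZER;

/* Writer side: record whether the modeled i_rwsem could be taken right now. */
static void *writer_try_enter(void *arg)
{
    int *entered = arg;

    *entered = (pthread_mutex_trylock(&model_i_rwsem) == 0);
    if (*entered)
        pthread_mutex_unlock(&model_i_rwsem);
    return NULL;
}

/*
 * Convert side: hold convert_lock and check whether a writer can get in
 * at the same time. Returns 1 if the writer overlapped, 0 if it was
 * excluded, and -1 if the thread could not be created.
 */
static int writer_can_overlap(pthread_mutex_t *convert_lock)
{
    pthread_t writer;
    int entered = 0;

    pthread_mutex_lock(convert_lock);
    if (pthread_create(&writer, NULL, writer_try_enter, &entered) != 0) {
        pthread_mutex_unlock(convert_lock);
        return -1;
    }
    pthread_join(writer, NULL);
    pthread_mutex_unlock(convert_lock);
    return entered;
}

#ifdef LOCK_DEMO_MAIN
int main(void)
{
    /* Before the patch: convert holds only the modeled xattr_sem. */
    printf("convert under xattr_sem: overlap=%d\n",
           writer_can_overlap(&model_xattr_sem));
    /* After the patch: convert runs under the same lock as the writer. */
    printf("convert under i_rwsem:   overlap=%d\n",
           writer_can_overlap(&model_i_rwsem));
    return 0;
}
#endif
```

Build with -pthread -DLOCK_DEMO_MAIN to run the driver. The point of the
sketch is only that holding a different lock excludes nobody; once both
sides use the same lock, the writer has to wait until the conversion is
done.
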
>
> Fixes: 0c8d414f163f ("ext4: let fallocate handle inline data correctly")
> Cc: stable@vger.kernel.org
> Reported-by: Hulk Robot <hulkci@huawei.com>
> Signed-off-by: Baokun Li <libaokun1@huawei.com>
> ---
> V1->V2:
> 	Increase the range of the inode_lock.
> V2->V3:
> 	Move the lock outside the ext4_convert_inline_data().
> 	And reorganize ext4_fallocate().
>
>   fs/ext4/extents.c | 10 ++++++----
>   fs/ext4/inode.c   |  9 ---------
>   2 files changed, 6 insertions(+), 13 deletions(-)
>
> diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c
> index e473fde6b64b..474479ce76e0 100644
> --- a/fs/ext4/extents.c
> +++ b/fs/ext4/extents.c
> @@ -4693,15 +4693,17 @@ long ext4_fallocate(struct file *file, int mode, loff_t offset, loff_t len)
>   		     FALLOC_FL_INSERT_RANGE))
>   		return -EOPNOTSUPP;
>   
> +	inode_lock(inode);
> +	ret = ext4_convert_inline_data(inode);
> +	inode_unlock(inode);
> +	if (ret)
> +		goto exit;
> +
>   	if (mode & FALLOC_FL_PUNCH_HOLE) {
>   		ret = ext4_punch_hole(file, offset, len);
>   		goto exit;
>   	}
>   
> -	ret = ext4_convert_inline_data(inode);
> -	if (ret)
> -		goto exit;
> -
>   	if (mode & FALLOC_FL_COLLAPSE_RANGE) {
>   		ret = ext4_collapse_range(file, offset, len);
>   		goto exit;
> diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
> index 646ece9b3455..4779673d733e 100644
> --- a/fs/ext4/inode.c
> +++ b/fs/ext4/inode.c
> @@ -3967,15 +3967,6 @@ int ext4_punch_hole(struct file *file, loff_t offset, loff_t length)
>   
>   	trace_ext4_punch_hole(inode, offset, length, 0);
>   
> -	ext4_clear_inode_state(inode, EXT4_STATE_MAY_INLINE_DATA);
> -	if (ext4_has_inline_data(inode)) {
> -		filemap_invalidate_lock(mapping);
> -		ret = ext4_convert_inline_data(inode);
> -		filemap_invalidate_unlock(mapping);
> -		if (ret)
> -			return ret;
> -	}
> -
>   	/*
>   	 * Write out all dirty pages to avoid race conditions
>   	 * Then release them.