linux-fsdevel.vger.kernel.org archive mirror
From: Jan Kara <jack@suse.cz>
To: Zhang Yi <yi.zhang@huaweicloud.com>
Cc: linux-ext4@vger.kernel.org, linux-fsdevel@vger.kernel.org,
	tytso@mit.edu, adilger.kernel@dilger.ca, jack@suse.cz,
	yi.zhang@huawei.com, chengzhihao1@huawei.com, yukuai3@huawei.com
Subject: Re: [PATCH v2 2/9] ext4: check the extent status again before inserting delalloc block
Date: Wed, 24 Apr 2024 22:05:28 +0200
Message-ID: <20240424200528.zvcxcfv3vr6pn5r7@quack3>
In-Reply-To: <20240410034203.2188357-3-yi.zhang@huaweicloud.com>

On Wed 10-04-24 11:41:56, Zhang Yi wrote:
> From: Zhang Yi <yi.zhang@huawei.com>
> 
> Now we look up the extent status entry without holding i_data_sem before
> inserting a delalloc block. This works fine in the buffered write path
> because it holds i_rwsem and the folio lock, and in the mmap path because
> it holds the folio lock, so the extent found locklessly cannot be
> modified concurrently. But it can race with fallocate, since fallocate
> allocates blocks without holding i_rwsem and the folio lock.
> 
> ext4_page_mkwrite()             ext4_fallocate()
>  block_page_mkwrite()
>   ext4_da_map_blocks()
>    //find hole in extent status tree
>                                  ext4_alloc_file_blocks()
>                                   ext4_map_blocks()
>                                    //allocate block and unwritten extent
>    ext4_insert_delayed_block()
>     ext4_da_reserve_space()
>      //reserve one more block
>     ext4_es_insert_delayed_block()
>      //drop unwritten extent and add delayed extent by mistake
> 
> Then the delalloc extent is wrong until writeback, and the extra
> reserved block can never be released, which triggers the warning below:
> 
>  EXT4-fs (pmem2): Inode 13 (00000000bbbd4d23): i_reserved_data_blocks(1) not cleared!
> 
> Holding i_data_sem in write mode directly would fix the problem, but it
> is expensive. Instead, keep the lockless check and check the extent
> again when we need to add a new delalloc block.
> 
> Cc: stable@vger.kernel.org
> Signed-off-by: Zhang Yi <yi.zhang@huawei.com>
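
The interleaving in the commit message can be replayed deterministically
in user space. The following is only a toy model under stated assumptions:
one logical block, no real locking, and every name in it (toy_inode,
toy_fallocate, toy_insert_delayed, toy_replay_race) is hypothetical and
not part of ext4. It only illustrates why acting on a stale "hole" result
leaves one reservation behind, and why a recheck avoids that.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-ins for the cached extent status of ONE logical block. */
enum es_state { ES_HOLE, ES_UNWRITTEN, ES_DELAYED };

struct toy_inode {
	enum es_state es;	/* cached extent status of the block */
	int reserved_blocks;	/* loosely models i_reserved_data_blocks */
};

/* Models the fallocate side: the block becomes an unwritten extent. */
static void toy_fallocate(struct toy_inode *ino)
{
	ino->es = ES_UNWRITTEN;
}

/*
 * Models the insert step. With recheck == false this behaves like the
 * code before the patch: it trusts the earlier lockless lookup. With
 * recheck == true it looks at the status again before reserving.
 */
static void toy_insert_delayed(struct toy_inode *ino, bool recheck)
{
	if (recheck && ino->es != ES_HOLE)
		return;			/* already mapped: nothing to reserve */
	ino->reserved_blocks++;		/* reserve one block */
	ino->es = ES_DELAYED;		/* overwrites whatever was cached */
}

/*
 * Replays the racy ordering: lookup, then fallocate, then insert.
 * Returns the number of reservations left behind. The block is already
 * allocated by then, so in this model nothing ever consumes them.
 */
int toy_replay_race(bool recheck)
{
	struct toy_inode ino = { .es = ES_HOLE, .reserved_blocks = 0 };
	bool saw_hole = (ino.es == ES_HOLE);	/* lockless lookup */

	assert(saw_hole);
	toy_fallocate(&ino);			/* fallocate wins the race */
	if (saw_hole)
		toy_insert_delayed(&ino, recheck);
	return ino.reserved_blocks;
}
```

Calling toy_replay_race(false) models the unpatched ordering and
toy_replay_race(true) models the ordering with the recheck in place.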

Looks good. Feel free to add:

Reviewed-by: Jan Kara <jack@suse.cz>

								Honza

> ---
>  fs/ext4/inode.c | 19 +++++++++++++++++++
>  1 file changed, 19 insertions(+)
> 
> diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
> index 6a41172c06e1..118b0497a954 100644
> --- a/fs/ext4/inode.c
> +++ b/fs/ext4/inode.c
> @@ -1737,6 +1737,7 @@ static int ext4_da_map_blocks(struct inode *inode, sector_t iblock,
>  		if (ext4_es_is_hole(&es))
>  			goto add_delayed;
>  
> +found:
>  		/*
>  		 * Delayed extent could be allocated by fallocate.
>  		 * So we need to check it.
> @@ -1781,6 +1782,24 @@ static int ext4_da_map_blocks(struct inode *inode, sector_t iblock,
>  
>  add_delayed:
>  	down_write(&EXT4_I(inode)->i_data_sem);
> +	/*
> +	 * Look up the extent tree again under i_data_sem to make sure the
> +	 * range being inserted has not been delayed or allocated by a
> +	 * path that does not hold i_rwsem and the folio lock.
> +	 */
> +	if (ext4_es_lookup_extent(inode, iblock, NULL, &es)) {
> +		if (!ext4_es_is_hole(&es)) {
> +			up_write(&EXT4_I(inode)->i_data_sem);
> +			goto found;
> +		}
> +	} else if (!ext4_has_inline_data(inode)) {
> +		retval = ext4_map_query_blocks(NULL, inode, map);
> +		if (retval) {
> +			up_write(&EXT4_I(inode)->i_data_sem);
> +			return retval;
> +		}
> +	}
> +
>  	retval = ext4_insert_delayed_block(inode, map->m_lblk);
>  	up_write(&EXT4_I(inode)->i_data_sem);
>  	if (retval)
> -- 
> 2.39.2
> 
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

Thread overview: 25+ messages
2024-04-10  3:41 [PATCH v2 0/9] ext4: support adding multi-delalloc blocks Zhang Yi
2024-04-10  3:41 ` [PATCH v2 1/9] ext4: factor out a common helper to query extent map Zhang Yi
2024-04-24 20:05   ` Jan Kara
2024-04-10  3:41 ` [PATCH v2 2/9] ext4: check the extent status again before inserting delalloc block Zhang Yi
2024-04-24 20:05   ` Jan Kara [this message]
2024-04-10  3:41 ` [PATCH v2 3/9] ext4: trim delalloc extent Zhang Yi
2024-04-25 15:56   ` Jan Kara
2024-04-26  9:38     ` Zhang Yi
2024-04-29 10:25       ` Jan Kara
2024-04-29 11:28         ` Zhang Yi
2024-04-10  3:41 ` [PATCH v2 4/9] ext4: drop iblock parameter Zhang Yi
2024-04-29  7:34   ` Jan Kara
2024-04-10  3:41 ` [PATCH v2 5/9] ext4: make ext4_es_insert_delayed_block() insert multi-blocks Zhang Yi
2024-04-29  9:16   ` Jan Kara
2024-04-29 12:09     ` Zhang Yi
2024-04-29 16:27       ` Jan Kara
2024-04-29  9:24   ` Jan Kara
2024-04-10  3:42 ` [PATCH v2 6/9] ext4: make ext4_da_reserve_space() reserve multi-clusters Zhang Yi
2024-04-29  9:25   ` Jan Kara
2024-04-10  3:42 ` [PATCH v2 7/9] ext4: factor out check for whether a cluster is allocated Zhang Yi
2024-04-10  3:42 ` [PATCH v2 8/9] ext4: make ext4_insert_delayed_block() insert multi-blocks Zhang Yi
2024-04-29 10:06   ` Jan Kara
2024-04-29 12:54     ` Zhang Yi
2024-04-10  3:42 ` [PATCH v2 9/9] ext4: make ext4_da_map_blocks() buffer_head unaware Zhang Yi
2024-04-29 10:27   ` Jan Kara