From: Jan Kara <jack@suse.cz>
To: Zhang Yi <yi.zhang@huaweicloud.com>
Cc: linux-ext4@vger.kernel.org, linux-fsdevel@vger.kernel.org,
tytso@mit.edu, adilger.kernel@dilger.ca, jack@suse.cz,
yi.zhang@huawei.com, chengzhihao1@huawei.com, yukuai3@huawei.com
Subject: Re: [PATCH v2 2/9] ext4: check the extent status again before inserting delalloc block
Date: Wed, 24 Apr 2024 22:05:28 +0200
Message-ID: <20240424200528.zvcxcfv3vr6pn5r7@quack3>
In-Reply-To: <20240410034203.2188357-3-yi.zhang@huaweicloud.com>
On Wed 10-04-24 11:41:56, Zhang Yi wrote:
> From: Zhang Yi <yi.zhang@huawei.com>
>
> Now we look up the extent status entry without holding i_data_sem
> before inserting a delalloc block. This works fine in the buffered
> write path because it holds i_rwsem and the folio lock, and in the mmap
> path because it holds the folio lock, so an extent found locklessly
> cannot be modified concurrently. But it can race with fallocate, since
> fallocate allocates blocks without holding i_rwsem and the folio lock.
>
> ext4_page_mkwrite()             ext4_fallocate()
>  block_page_mkwrite()
>   ext4_da_map_blocks()
>    //find hole in extent status tree
>                                  ext4_alloc_file_blocks()
>                                   ext4_map_blocks()
>                                    //allocate block and unwritten extent
>    ext4_insert_delayed_block()
>     ext4_da_reserve_space()
>      //reserve one more block
>     ext4_es_insert_delayed_block()
>      //drop unwritten extent and add delayed extent by mistake
>
> Then the delalloc extent is wrong until writeback, and the extra
> reserved block can never be released, which triggers the warning below:
>
> EXT4-fs (pmem2): Inode 13 (00000000bbbd4d23): i_reserved_data_blocks(1) not cleared!
>
> Holding i_data_sem in write mode directly can fix the problem, but it's
> expensive; we should keep the lockless check and check the extent again
> once we need to add a new delalloc block.
>
> Cc: stable@vger.kernel.org
> Signed-off-by: Zhang Yi <yi.zhang@huawei.com>

Looks good. Feel free to add:

Reviewed-by: Jan Kara <jack@suse.cz>

								Honza

> ---
> fs/ext4/inode.c | 19 +++++++++++++++++++
> 1 file changed, 19 insertions(+)
>
> diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
> index 6a41172c06e1..118b0497a954 100644
> --- a/fs/ext4/inode.c
> +++ b/fs/ext4/inode.c
> @@ -1737,6 +1737,7 @@ static int ext4_da_map_blocks(struct inode *inode, sector_t iblock,
>  		if (ext4_es_is_hole(&es))
>  			goto add_delayed;
>
> +found:
>  		/*
>  		 * Delayed extent could be allocated by fallocate.
>  		 * So we need to check it.
> @@ -1781,6 +1782,24 @@ static int ext4_da_map_blocks(struct inode *inode, sector_t iblock,
>
>  add_delayed:
>  	down_write(&EXT4_I(inode)->i_data_sem);
> +	/*
> +	 * Lookup extents tree again under i_data_sem, make sure this
> +	 * inserting delalloc range haven't been delayed or allocated
> +	 * whitout holding i_rwsem and folio lock.
> +	 */
> +	if (ext4_es_lookup_extent(inode, iblock, NULL, &es)) {
> +		if (!ext4_es_is_hole(&es)) {
> +			up_write(&EXT4_I(inode)->i_data_sem);
> +			goto found;
> +		}
> +	} else if (!ext4_has_inline_data(inode)) {
> +		retval = ext4_map_query_blocks(NULL, inode, map);
> +		if (retval) {
> +			up_write(&EXT4_I(inode)->i_data_sem);
> +			return retval;
> +		}
> +	}
> +
>  	retval = ext4_insert_delayed_block(inode, map->m_lblk);
>  	up_write(&EXT4_I(inode)->i_data_sem);
>  	if (retval)
> --
> 2.39.2
>
--
Jan Kara <jack@suse.com>
SUSE Labs, CR
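
For readers following the thread, the window described in the commit
message and the re-check the patch adds under i_data_sem can be modelled
in userspace. This is only a sketch under simplifying assumptions, not
kernel code: a single block's status stands in for the extent status
tree, a pthread mutex stands in for i_data_sem, and the racing fallocate
is replayed deterministically through a callback instead of a second
thread. All names (toy_inode, toy_fallocate, toy_da_map_block, toy_run)
are hypothetical and do not exist in ext4.

```c
#include <assert.h>
#include <pthread.h>

/* Status of the single logical block tracked by this toy model. */
enum blk_state { BLK_HOLE, BLK_UNWRITTEN, BLK_DELAYED };

struct toy_inode {
	pthread_mutex_t data_sem;	/* stands in for i_data_sem */
	enum blk_state state;		/* stands in for the extent status entry */
	int reserved;			/* stands in for i_reserved_data_blocks */
};

/* fallocate side: turn the hole into an allocated, unwritten block. */
static void toy_fallocate(struct toy_inode *ino)
{
	pthread_mutex_lock(&ino->data_sem);
	if (ino->state == BLK_HOLE)
		ino->state = BLK_UNWRITTEN;
	pthread_mutex_unlock(&ino->data_sem);
}

/*
 * Write-fault side. Returns 1 if a delayed block was reserved, 0 if the
 * block turned out to be mapped already. When non-NULL, racer() runs in
 * the window between the lockless lookup and taking the lock.
 */
static int toy_da_map_block(struct toy_inode *ino, int recheck,
			    void (*racer)(struct toy_inode *))
{
	/* Lockless lookup: cheap, but the answer can go stale. */
	if (ino->state != BLK_HOLE)
		return 0;

	if (racer)
		racer(ino);

	pthread_mutex_lock(&ino->data_sem);
	/* The fix: look again now that the lock is held. */
	if (recheck && ino->state != BLK_HOLE) {
		pthread_mutex_unlock(&ino->data_sem);
		return 0;
	}
	ino->reserved++;		/* reserve one block */
	ino->state = BLK_DELAYED;	/* overwrites whatever was there */
	pthread_mutex_unlock(&ino->data_sem);
	return 1;
}

/* Replay the interleaving once; return the reservation count left over. */
static int toy_run(int recheck)
{
	struct toy_inode ino = { .state = BLK_HOLE, .reserved = 0 };
	int reserved;

	pthread_mutex_init(&ino.data_sem, NULL);
	toy_da_map_block(&ino, recheck, toy_fallocate);
	assert(ino.state != BLK_HOLE);	/* one side always maps the block */
	reserved = ino.reserved;
	pthread_mutex_destroy(&ino.data_sem);
	return reserved;
}
```

In this model toy_run(0) returns 1, which corresponds to the stray
reservation behind the "i_reserved_data_blocks(1) not cleared!" warning,
while toy_run(1) returns 0 because the second lookup sees the block that
fallocate just allocated. The common case still pays only for the
lockless lookup; the extra check runs only on the path that was about to
take the lock anyway.
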
2024-04-10 3:41 [PATCH v2 0/9] ext4: support adding multi-delalloc blocks Zhang Yi
2024-04-10 3:41 ` [PATCH v2 1/9] ext4: factor out a common helper to query extent map Zhang Yi
2024-04-24 20:05 ` Jan Kara
2024-04-10 3:41 ` [PATCH v2 2/9] ext4: check the extent status again before inserting delalloc block Zhang Yi
2024-04-24 20:05 ` Jan Kara [this message]
2024-04-10 3:41 ` [PATCH v2 3/9] ext4: trim delalloc extent Zhang Yi
2024-04-25 15:56 ` Jan Kara
2024-04-26 9:38 ` Zhang Yi
2024-04-29 10:25 ` Jan Kara
2024-04-29 11:28 ` Zhang Yi
2024-04-10 3:41 ` [PATCH v2 4/9] ext4: drop iblock parameter Zhang Yi
2024-04-29 7:34 ` Jan Kara
2024-04-10 3:41 ` [PATCH v2 5/9] ext4: make ext4_es_insert_delayed_block() insert multi-blocks Zhang Yi
2024-04-29 9:16 ` Jan Kara
2024-04-29 12:09 ` Zhang Yi
2024-04-29 16:27 ` Jan Kara
2024-04-29 9:24 ` Jan Kara
2024-04-10 3:42 ` [PATCH v2 6/9] ext4: make ext4_da_reserve_space() reserve multi-clusters Zhang Yi
2024-04-29 9:25 ` Jan Kara
2024-04-10 3:42 ` [PATCH v2 7/9] ext4: factor out check for whether a cluster is allocated Zhang Yi
2024-04-10 3:42 ` [PATCH v2 8/9] ext4: make ext4_insert_delayed_block() insert multi-blocks Zhang Yi
2024-04-29 10:06 ` Jan Kara
2024-04-29 12:54 ` Zhang Yi
2024-04-10 3:42 ` [PATCH v2 9/9] ext4: make ext4_da_map_blocks() buffer_head unaware Zhang Yi
2024-04-29 10:27 ` Jan Kara