* [PATCH 0/4] update bytes_may_use timely to avoid false ENOSPC issue @ 2016-07-20 5:56 Wang Xiaoguang 2016-07-20 5:56 ` [PATCH 1/4] btrfs: use correct offset for reloc_inode in prealloc_file_extent_cluster() Wang Xiaoguang ` (4 more replies) 0 siblings, 5 replies; 15+ messages in thread From: Wang Xiaoguang @ 2016-07-20 5:56 UTC (permalink / raw) To: linux-btrfs; +Cc: dsterba, jbacik, holger Currently in btrfs, for data space reservation, it does not update bytes_may_use in btrfs_update_reserved_bytes() and the decrease operation will be delayed to be done in extent_clear_unlock_delalloc(), for fallocate(2), decrease operation is even delayed to be done in end of btrfs_fallocate(), which is too late. Obviously such delay will cause unnecessary pressure to enospc system, in [PATCH 4/4], there is a simpel test script that can reveal such false ENOSPC bug. So in this patch set, it will remove RESERVE_FREE, RESERVE_ALLOC and RESERVE_ALLOC_NO_ACCOUNT, and we always update bytes_may_use timely. I'll also commit a fstests test case for this issue. Wang Xiaoguang (4): btrfs: use correct offset for reloc_inode in prealloc_file_extent_cluster() btrfs: divide btrfs_update_reserved_bytes() into two functions btrfs: introduce new EXTENT_CLEAR_DATA_RESV flag btrfs: update btrfs_space_info's bytes_may_use timely fs/btrfs/ctree.h | 2 +- fs/btrfs/extent-tree.c | 131 ++++++++++++++++++++++++------------------------- fs/btrfs/extent_io.h | 1 + fs/btrfs/file.c | 26 +++++----- fs/btrfs/inode-map.c | 3 +- fs/btrfs/inode.c | 37 ++++++++++---- fs/btrfs/relocation.c | 17 +++++-- 7 files changed, 123 insertions(+), 94 deletions(-) -- 2.9.0 ^ permalink raw reply [flat|nested] 15+ messages in thread
* [PATCH 1/4] btrfs: use correct offset for reloc_inode in prealloc_file_extent_cluster() 2016-07-20 5:56 [PATCH 0/4] update bytes_may_use timely to avoid false ENOSPC issue Wang Xiaoguang @ 2016-07-20 5:56 ` Wang Xiaoguang 2016-07-20 13:18 ` Josef Bacik 2016-07-20 5:56 ` [PATCH 2/4] btrfs: divide btrfs_update_reserved_bytes() into two functions Wang Xiaoguang ` (3 subsequent siblings) 4 siblings, 1 reply; 15+ messages in thread From: Wang Xiaoguang @ 2016-07-20 5:56 UTC (permalink / raw) To: linux-btrfs; +Cc: dsterba, jbacik, holger In prealloc_file_extent_cluster(), btrfs_check_data_free_space() uses wrong file offset for reloc_inode, it uses cluster->start and cluster->end, which indeed are extent's bytenr. The correct value should be cluster->[start|end] minus block group's start bytenr. start bytenr cluster->start | | extent | extent | ...| extent | |----------------------------------------------------------------| | block group reloc_inode | Signed-off-by: Wang Xiaoguang <wangxg.fnst@cn.fujitsu.com> --- fs/btrfs/relocation.c | 10 ++++++---- 1 file changed, 6 insertions(+), 4 deletions(-) diff --git a/fs/btrfs/relocation.c b/fs/btrfs/relocation.c index 0477dca..a0de885 100644 --- a/fs/btrfs/relocation.c +++ b/fs/btrfs/relocation.c @@ -3030,12 +3030,14 @@ int prealloc_file_extent_cluster(struct inode *inode, u64 num_bytes; int nr = 0; int ret = 0; + u64 prealloc_start = cluster->start - offset; + u64 prealloc_end = cluster->end - offset; BUG_ON(cluster->start != cluster->boundary[0]); inode_lock(inode); - ret = btrfs_check_data_free_space(inode, cluster->start, - cluster->end + 1 - cluster->start); + ret = btrfs_check_data_free_space(inode, prealloc_start, + prealloc_end + 1 - prealloc_start); if (ret) goto out; @@ -3056,8 +3058,8 @@ int prealloc_file_extent_cluster(struct inode *inode, break; nr++; } - btrfs_free_reserved_data_space(inode, cluster->start, - cluster->end + 1 - cluster->start); + btrfs_free_reserved_data_space(inode, prealloc_start, + prealloc_end + 1 - prealloc_start); out: inode_unlock(inode); return ret; -- 2.9.0 ^ permalink raw reply related [flat|nested] 15+ messages in thread
* Re: [PATCH 1/4] btrfs: use correct offset for reloc_inode in prealloc_file_extent_cluster() 2016-07-20 5:56 ` [PATCH 1/4] btrfs: use correct offset for reloc_inode in prealloc_file_extent_cluster() Wang Xiaoguang @ 2016-07-20 13:18 ` Josef Bacik 2016-07-21 1:49 ` Wang Xiaoguang 0 siblings, 1 reply; 15+ messages in thread From: Josef Bacik @ 2016-07-20 13:18 UTC (permalink / raw) To: Wang Xiaoguang, linux-btrfs; +Cc: dsterba, holger On 07/20/2016 01:56 AM, Wang Xiaoguang wrote: > In prealloc_file_extent_cluster(), btrfs_check_data_free_space() uses > wrong file offset for reloc_inode, it uses cluster->start and cluster->end, > which indeed are extent's bytenr. The correct value should be > cluster->[start|end] minus block group's start bytenr. > > start bytenr cluster->start > | | extent | extent | ...| extent | > |----------------------------------------------------------------| > | block group reloc_inode | > > Signed-off-by: Wang Xiaoguang <wangxg.fnst@cn.fujitsu.com> > --- > fs/btrfs/relocation.c | 10 ++++++---- > 1 file changed, 6 insertions(+), 4 deletions(-) > > diff --git a/fs/btrfs/relocation.c b/fs/btrfs/relocation.c > index 0477dca..a0de885 100644 > --- a/fs/btrfs/relocation.c > +++ b/fs/btrfs/relocation.c > @@ -3030,12 +3030,14 @@ int prealloc_file_extent_cluster(struct inode *inode, > u64 num_bytes; > int nr = 0; > int ret = 0; > + u64 prealloc_start = cluster->start - offset; > + u64 prealloc_end = cluster->end - offset; > > BUG_ON(cluster->start != cluster->boundary[0]); > inode_lock(inode); > > - ret = btrfs_check_data_free_space(inode, cluster->start, > - cluster->end + 1 - cluster->start); > + ret = btrfs_check_data_free_space(inode, prealloc_start, > + prealloc_end + 1 - prealloc_start); > if (ret) > goto out; > > @@ -3056,8 +3058,8 @@ int prealloc_file_extent_cluster(struct inode *inode, > break; > nr++; > } > - btrfs_free_reserved_data_space(inode, cluster->start, > - cluster->end + 1 - cluster->start); > + btrfs_free_reserved_data_space(inode, prealloc_start, > + prealloc_end + 1 - prealloc_start); > out: > inode_unlock(inode); > return ret; > This ends up being the same amount. Consider this scenario bg bytenr = 4096 cluster->start = 8192 cluster->end = 12287 cluster->end + 1 - cluster->start = 4096 prealloc_start = cluster->start - offset = 0 prealloc_end = cluster->end - offset = 8191 prealloc_end + 1 - prealloc_start = 4096 You shift both by the same amount, which gives you the same answer. Thanks, Josef ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 1/4] btrfs: use correct offset for reloc_inode in prealloc_file_extent_cluster() 2016-07-20 13:18 ` Josef Bacik @ 2016-07-21 1:49 ` Wang Xiaoguang 2016-07-21 13:05 ` Josef Bacik 0 siblings, 1 reply; 15+ messages in thread From: Wang Xiaoguang @ 2016-07-21 1:49 UTC (permalink / raw) To: Josef Bacik, linux-btrfs; +Cc: dsterba, holger hello, On 07/20/2016 09:18 PM, Josef Bacik wrote: > On 07/20/2016 01:56 AM, Wang Xiaoguang wrote: >> In prealloc_file_extent_cluster(), btrfs_check_data_free_space() uses >> wrong file offset for reloc_inode, it uses cluster->start and >> cluster->end, >> which indeed are extent's bytenr. The correct value should be >> cluster->[start|end] minus block group's start bytenr. >> >> start bytenr cluster->start >> | | extent | extent | ...| extent | >> |----------------------------------------------------------------| >> | block group reloc_inode | >> >> Signed-off-by: Wang Xiaoguang <wangxg.fnst@cn.fujitsu.com> >> --- >> fs/btrfs/relocation.c | 10 ++++++---- >> 1 file changed, 6 insertions(+), 4 deletions(-) >> >> diff --git a/fs/btrfs/relocation.c b/fs/btrfs/relocation.c >> index 0477dca..a0de885 100644 >> --- a/fs/btrfs/relocation.c >> +++ b/fs/btrfs/relocation.c >> @@ -3030,12 +3030,14 @@ int prealloc_file_extent_cluster(struct inode >> *inode, >> u64 num_bytes; >> int nr = 0; >> int ret = 0; >> + u64 prealloc_start = cluster->start - offset; >> + u64 prealloc_end = cluster->end - offset; >> >> BUG_ON(cluster->start != cluster->boundary[0]); >> inode_lock(inode); >> >> - ret = btrfs_check_data_free_space(inode, cluster->start, >> - cluster->end + 1 - cluster->start); >> + ret = btrfs_check_data_free_space(inode, prealloc_start, >> + prealloc_end + 1 - prealloc_start); >> if (ret) >> goto out; >> >> @@ -3056,8 +3058,8 @@ int prealloc_file_extent_cluster(struct inode >> *inode, >> break; >> nr++; >> } >> - btrfs_free_reserved_data_space(inode, cluster->start, >> - cluster->end + 1 - cluster->start); >> + btrfs_free_reserved_data_space(inode, prealloc_start, >> + prealloc_end + 1 - prealloc_start); >> out: >> inode_unlock(inode); >> return ret; >> > > This ends up being the same amount. Consider this scenario > > bg bytenr = 4096 > cluster->start = 8192 > cluster->end = 12287 > > cluster->end + 1 - cluster->start = 4096 > > prealloc_start = cluster->start - offset = 0 > prealloc_end = cluster->end - offset = 8191 > > prealloc_end + 1 - prealloc_start = 4096 > > You shift both by the same amount, which gives you the same answer. > Thanks, Thanks for reviewing. Yes, I know the amount of preallocated data space is the same, this patch does not fix any bugs :) For every block group to be balanced, we create a corresponding inode. For this inode, the initial offset should be 0. In your above example, before this patch, it's btrfs_free_reserved_data_space(inode, 8192, 4096); with this patch, it's btrfs_free_reserved_data_space(inode, 4096, 4096). I just want to make btrfs_free_reserved_data_space()'s 'start' argument be offset inside block group, not offset inside whole fs byternr space. I'm not a English native, hope that I have expressed what I want to :) But yes, I'm also OK with removing this patch. Regards, Xiaoguang Wang > > Josef > > ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 1/4] btrfs: use correct offset for reloc_inode in prealloc_file_extent_cluster() 2016-07-21 1:49 ` Wang Xiaoguang @ 2016-07-21 13:05 ` Josef Bacik 0 siblings, 0 replies; 15+ messages in thread From: Josef Bacik @ 2016-07-21 13:05 UTC (permalink / raw) To: Wang Xiaoguang, linux-btrfs; +Cc: dsterba, holger On 07/20/2016 09:49 PM, Wang Xiaoguang wrote: > hello, > > On 07/20/2016 09:18 PM, Josef Bacik wrote: >> On 07/20/2016 01:56 AM, Wang Xiaoguang wrote: >>> In prealloc_file_extent_cluster(), btrfs_check_data_free_space() uses >>> wrong file offset for reloc_inode, it uses cluster->start and cluster->end, >>> which indeed are extent's bytenr. The correct value should be >>> cluster->[start|end] minus block group's start bytenr. >>> >>> start bytenr cluster->start >>> | | extent | extent | ...| extent | >>> |----------------------------------------------------------------| >>> | block group reloc_inode | >>> >>> Signed-off-by: Wang Xiaoguang <wangxg.fnst@cn.fujitsu.com> >>> --- >>> fs/btrfs/relocation.c | 10 ++++++---- >>> 1 file changed, 6 insertions(+), 4 deletions(-) >>> >>> diff --git a/fs/btrfs/relocation.c b/fs/btrfs/relocation.c >>> index 0477dca..a0de885 100644 >>> --- a/fs/btrfs/relocation.c >>> +++ b/fs/btrfs/relocation.c >>> @@ -3030,12 +3030,14 @@ int prealloc_file_extent_cluster(struct inode *inode, >>> u64 num_bytes; >>> int nr = 0; >>> int ret = 0; >>> + u64 prealloc_start = cluster->start - offset; >>> + u64 prealloc_end = cluster->end - offset; >>> >>> BUG_ON(cluster->start != cluster->boundary[0]); >>> inode_lock(inode); >>> >>> - ret = btrfs_check_data_free_space(inode, cluster->start, >>> - cluster->end + 1 - cluster->start); >>> + ret = btrfs_check_data_free_space(inode, prealloc_start, >>> + prealloc_end + 1 - prealloc_start); >>> if (ret) >>> goto out; >>> >>> @@ -3056,8 +3058,8 @@ int prealloc_file_extent_cluster(struct inode *inode, >>> break; >>> nr++; >>> } >>> - btrfs_free_reserved_data_space(inode, cluster->start, >>> - cluster->end + 1 - cluster->start); >>> + btrfs_free_reserved_data_space(inode, prealloc_start, >>> + prealloc_end + 1 - prealloc_start); >>> out: >>> inode_unlock(inode); >>> return ret; >>> >> >> This ends up being the same amount. Consider this scenario >> >> bg bytenr = 4096 >> cluster->start = 8192 >> cluster->end = 12287 >> >> cluster->end + 1 - cluster->start = 4096 >> >> prealloc_start = cluster->start - offset = 0 >> prealloc_end = cluster->end - offset = 8191 >> >> prealloc_end + 1 - prealloc_start = 4096 >> >> You shift both by the same amount, which gives you the same answer. Thanks, > Thanks for reviewing. > Yes, I know the amount of preallocated data space is the same, this patch > does not fix any bugs :) > > For every block group to be balanced, we create a corresponding inode. > For this inode, the initial offset should be 0. In your above example, > before this patch, it's btrfs_free_reserved_data_space(inode, 8192, 4096); > with this patch, it's btrfs_free_reserved_data_space(inode, 4096, 4096). > > I just want to make btrfs_free_reserved_data_space()'s 'start' argument > be offset inside block group, not offset inside whole fs byternr space. I'm not > a English native, hope that I have expressed what I want to :) > > But yes, I'm also OK with removing this patch. > Oh I see, ok that's fine if we're just trying to make it look sane then go for it. Signed-off-by: Josef Bacik <jbacik@fb.com> Thanks, Josef ^ permalink raw reply [flat|nested] 15+ messages in thread
* [PATCH 2/4] btrfs: divide btrfs_update_reserved_bytes() into two functions 2016-07-20 5:56 [PATCH 0/4] update bytes_may_use timely to avoid false ENOSPC issue Wang Xiaoguang 2016-07-20 5:56 ` [PATCH 1/4] btrfs: use correct offset for reloc_inode in prealloc_file_extent_cluster() Wang Xiaoguang @ 2016-07-20 5:56 ` Wang Xiaoguang 2016-07-20 13:21 ` Josef Bacik 2016-07-20 5:56 ` [PATCH 3/4] btrfs: introduce new EXTENT_CLEAR_DATA_RESV flag Wang Xiaoguang ` (2 subsequent siblings) 4 siblings, 1 reply; 15+ messages in thread From: Wang Xiaoguang @ 2016-07-20 5:56 UTC (permalink / raw) To: linux-btrfs; +Cc: dsterba, jbacik, holger This patch divides btrfs_update_reserved_bytes() into btrfs_add_reserved_bytes() and btrfs_free_reserved_bytes(), and next patch will extend btrfs_add_reserved_bytes()to fix some false ENOSPC error, please see later patch for detailed info. Signed-off-by: Wang Xiaoguang <wangxg.fnst@cn.fujitsu.com> --- fs/btrfs/extent-tree.c | 97 +++++++++++++++++++++++++++++--------------------- 1 file changed, 57 insertions(+), 40 deletions(-) diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c index 82b912a..8eaac39 100644 --- a/fs/btrfs/extent-tree.c +++ b/fs/btrfs/extent-tree.c @@ -104,9 +104,10 @@ static int find_next_key(struct btrfs_path *path, int level, struct btrfs_key *key); static void dump_space_info(struct btrfs_space_info *info, u64 bytes, int dump_block_groups); -static int btrfs_update_reserved_bytes(struct btrfs_block_group_cache *cache, - u64 num_bytes, int reserve, - int delalloc); +static int btrfs_add_reserved_bytes(struct btrfs_block_group_cache *cache, + u64 num_bytes, int reserve, int delalloc); +static int btrfs_free_reserved_bytes(struct btrfs_block_group_cache *cache, + u64 num_bytes, int delalloc); static int block_rsv_use_bytes(struct btrfs_block_rsv *block_rsv, u64 num_bytes); int btrfs_pin_extent(struct btrfs_root *root, @@ -6297,19 +6298,14 @@ void btrfs_wait_block_group_reservations(struct btrfs_block_group_cache *bg) } /** - * btrfs_update_reserved_bytes - update the block_group and space info counters + * btrfs_add_reserved_bytes - update the block_group and space info counters * @cache: The cache we are manipulating * @num_bytes: The number of bytes in question * @reserve: One of the reservation enums * @delalloc: The blocks are allocated for the delalloc write * - * This is called by the allocator when it reserves space, or by somebody who is - * freeing space that was never actually used on disk. For example if you - * reserve some space for a new leaf in transaction A and before transaction A - * commits you free that leaf, you call this with reserve set to 0 in order to - * clear the reservation. - * - * Metadata reservations should be called with RESERVE_ALLOC so we do the proper + * This is called by the allocator when it reserves space. Metadata + * reservations should be called with RESERVE_ALLOC so we do the proper * ENOSPC accounting. For data we handle the reservation through clearing the * delalloc bits in the io_tree. We have to do this since we could end up * allocating less disk space for the amount of data we have reserved in the @@ -6319,44 +6315,65 @@ void btrfs_wait_block_group_reservations(struct btrfs_block_group_cache *bg) * make the reservation and return -EAGAIN, otherwise this function always * succeeds. */ -static int btrfs_update_reserved_bytes(struct btrfs_block_group_cache *cache, - u64 num_bytes, int reserve, int delalloc) +static int btrfs_add_reserved_bytes(struct btrfs_block_group_cache *cache, + u64 num_bytes, int reserve, int delalloc) { struct btrfs_space_info *space_info = cache->space_info; int ret = 0; spin_lock(&space_info->lock); spin_lock(&cache->lock); - if (reserve != RESERVE_FREE) { - if (cache->ro) { - ret = -EAGAIN; - } else { - cache->reserved += num_bytes; - space_info->bytes_reserved += num_bytes; - if (reserve == RESERVE_ALLOC) { - trace_btrfs_space_reservation(cache->fs_info, - "space_info", space_info->flags, - num_bytes, 0); - space_info->bytes_may_use -= num_bytes; - } - - if (delalloc) - cache->delalloc_bytes += num_bytes; - } + if (cache->ro) { + ret = -EAGAIN; } else { - if (cache->ro) - space_info->bytes_readonly += num_bytes; - cache->reserved -= num_bytes; - space_info->bytes_reserved -= num_bytes; + cache->reserved += num_bytes; + space_info->bytes_reserved += num_bytes; + if (reserve == RESERVE_ALLOC) { + trace_btrfs_space_reservation(cache->fs_info, + "space_info", space_info->flags, + num_bytes, 0); + space_info->bytes_may_use -= num_bytes; + } if (delalloc) - cache->delalloc_bytes -= num_bytes; + cache->delalloc_bytes += num_bytes; } spin_unlock(&cache->lock); spin_unlock(&space_info->lock); return ret; } +/** + * btrfs_free_reserved_bytes - update the block_group and space info counters + * @cache: The cache we are manipulating + * @num_bytes: The number of bytes in question + * @delalloc: The blocks are allocated for the delalloc write + * + * This is called by somebody who is freeing space that was never actually used + * on disk. For example if you reserve some space for a new leaf in transaction + * A and before transaction A commits you free that leaf, you call this with + * reserve set to 0 in order to clear the reservation. + */ + +static int btrfs_free_reserved_bytes(struct btrfs_block_group_cache *cache, + u64 num_bytes, int delalloc) +{ + struct btrfs_space_info *space_info = cache->space_info; + int ret = 0; + + spin_lock(&space_info->lock); + spin_lock(&cache->lock); + if (cache->ro) + space_info->bytes_readonly += num_bytes; + cache->reserved -= num_bytes; + space_info->bytes_reserved -= num_bytes; + + if (delalloc) + cache->delalloc_bytes -= num_bytes; + spin_unlock(&cache->lock); + spin_unlock(&space_info->lock); + return ret; +} void btrfs_prepare_extent_commit(struct btrfs_trans_handle *trans, struct btrfs_root *root) { @@ -6976,7 +6993,7 @@ void btrfs_free_tree_block(struct btrfs_trans_handle *trans, WARN_ON(test_bit(EXTENT_BUFFER_DIRTY, &buf->bflags)); btrfs_add_free_space(cache, buf->start, buf->len); - btrfs_update_reserved_bytes(cache, buf->len, RESERVE_FREE, 0); + btrfs_free_reserved_bytes(cache, buf->len, 0); btrfs_put_block_group(cache); trace_btrfs_reserved_extent_free(root, buf->start, buf->len); pin = 0; @@ -7548,8 +7565,8 @@ checks: search_start - offset); BUG_ON(offset > search_start); - ret = btrfs_update_reserved_bytes(block_group, num_bytes, - alloc_type, delalloc); + ret = btrfs_add_reserved_bytes(block_group, num_bytes, + alloc_type, delalloc); if (ret == -EAGAIN) { btrfs_add_free_space(block_group, offset, num_bytes); goto loop; @@ -7781,7 +7798,7 @@ static int __btrfs_free_reserved_extent(struct btrfs_root *root, if (btrfs_test_opt(root, DISCARD)) ret = btrfs_discard_extent(root, start, len, NULL); btrfs_add_free_space(cache, start, len); - btrfs_update_reserved_bytes(cache, len, RESERVE_FREE, delalloc); + btrfs_free_reserved_bytes(cache, len, delalloc); } btrfs_put_block_group(cache); @@ -8011,8 +8028,8 @@ int btrfs_alloc_logged_file_extent(struct btrfs_trans_handle *trans, if (!block_group) return -EINVAL; - ret = btrfs_update_reserved_bytes(block_group, ins->offset, - RESERVE_ALLOC_NO_ACCOUNT, 0); + ret = btrfs_add_reserved_bytes(block_group, ins->offset, + RESERVE_ALLOC_NO_ACCOUNT, 0); BUG_ON(ret); /* logic error */ ret = alloc_reserved_file_extent(trans, root, 0, root_objectid, 0, owner, offset, ins, 1); -- 2.9.0 ^ permalink raw reply related [flat|nested] 15+ messages in thread
* Re: [PATCH 2/4] btrfs: divide btrfs_update_reserved_bytes() into two functions 2016-07-20 5:56 ` [PATCH 2/4] btrfs: divide btrfs_update_reserved_bytes() into two functions Wang Xiaoguang @ 2016-07-20 13:21 ` Josef Bacik 0 siblings, 0 replies; 15+ messages in thread From: Josef Bacik @ 2016-07-20 13:21 UTC (permalink / raw) To: Wang Xiaoguang, linux-btrfs; +Cc: dsterba, holger On 07/20/2016 01:56 AM, Wang Xiaoguang wrote: > This patch divides btrfs_update_reserved_bytes() into > btrfs_add_reserved_bytes() and btrfs_free_reserved_bytes(), and > next patch will extend btrfs_add_reserved_bytes()to fix some > false ENOSPC error, please see later patch for detailed info. > > Signed-off-by: Wang Xiaoguang <wangxg.fnst@cn.fujitsu.com> This appears to be functionally the same Signed-off-by: Josef Bacik <jbacik@fb.com> Thanks, Josef ^ permalink raw reply [flat|nested] 15+ messages in thread
* [PATCH 3/4] btrfs: introduce new EXTENT_CLEAR_DATA_RESV flag 2016-07-20 5:56 [PATCH 0/4] update bytes_may_use timely to avoid false ENOSPC issue Wang Xiaoguang 2016-07-20 5:56 ` [PATCH 1/4] btrfs: use correct offset for reloc_inode in prealloc_file_extent_cluster() Wang Xiaoguang 2016-07-20 5:56 ` [PATCH 2/4] btrfs: divide btrfs_update_reserved_bytes() into two functions Wang Xiaoguang @ 2016-07-20 5:56 ` Wang Xiaoguang 2016-07-20 13:22 ` Josef Bacik 2016-07-20 5:56 ` [PATCH 4/4] btrfs: update btrfs_space_info's bytes_may_use timely Wang Xiaoguang 2016-07-20 8:46 ` [PATCH 0/4] update bytes_may_use timely to avoid false ENOSPC issue Holger Hoffstätte 4 siblings, 1 reply; 15+ messages in thread From: Wang Xiaoguang @ 2016-07-20 5:56 UTC (permalink / raw) To: linux-btrfs; +Cc: dsterba, jbacik, holger In next patch, btrfs_clear_bit_hook() will not call btrfs_free_reserved_data_space_noquota() to update btrfs_space_info's bytes_may_use unless it has EXTENT_DO_ACCOUNTING or EXTENT_CLEAR_DATA_RESV, as for the reason, please see the next patch for detailed info. As we know, usually EXTENT_DO_ACCOUNTING is used for error path. In run_delalloc_nocow(), for inode marked as NODATACOW or extent marked as PREALLOC, we also need to update bytes_may_use, but can not pass EXTENT_DO_ACCOUNTING, because it also clears metadata reservation, so here we introduce EXTENT_CLEAR_DATA_RESV flag to indicate btrfs_clear_bit_hook() to update btrfs_space_info's bytes_may_use. Signed-off-by: Wang Xiaoguang <wangxg.fnst@cn.fujitsu.com> --- fs/btrfs/extent_io.h | 1 + 1 file changed, 1 insertion(+) diff --git a/fs/btrfs/extent_io.h b/fs/btrfs/extent_io.h index c0c1c4f..b52ca5d 100644 --- a/fs/btrfs/extent_io.h +++ b/fs/btrfs/extent_io.h @@ -20,6 +20,7 @@ #define EXTENT_DAMAGED (1U << 14) #define EXTENT_NORESERVE (1U << 15) #define EXTENT_QGROUP_RESERVED (1U << 16) +#define EXTENT_CLEAR_DATA_RESV (1U << 17) #define EXTENT_IOBITS (EXTENT_LOCKED | EXTENT_WRITEBACK) #define EXTENT_CTLBITS (EXTENT_DO_ACCOUNTING | EXTENT_FIRST_DELALLOC) -- 2.9.0 ^ permalink raw reply related [flat|nested] 15+ messages in thread
* Re: [PATCH 3/4] btrfs: introduce new EXTENT_CLEAR_DATA_RESV flag 2016-07-20 5:56 ` [PATCH 3/4] btrfs: introduce new EXTENT_CLEAR_DATA_RESV flag Wang Xiaoguang @ 2016-07-20 13:22 ` Josef Bacik 2016-07-21 1:15 ` Wang Xiaoguang 0 siblings, 1 reply; 15+ messages in thread From: Josef Bacik @ 2016-07-20 13:22 UTC (permalink / raw) To: Wang Xiaoguang, linux-btrfs; +Cc: dsterba, holger On 07/20/2016 01:56 AM, Wang Xiaoguang wrote: > In next patch, btrfs_clear_bit_hook() will not call > btrfs_free_reserved_data_space_noquota() to update btrfs_space_info's > bytes_may_use unless it has EXTENT_DO_ACCOUNTING or EXTENT_CLEAR_DATA_RESV, > as for the reason, please see the next patch for detailed info. > > As we know, usually EXTENT_DO_ACCOUNTING is used for error path. In > run_delalloc_nocow(), for inode marked as NODATACOW or extent marked as > PREALLOC, we also need to update bytes_may_use, but can not pass > EXTENT_DO_ACCOUNTING, because it also clears metadata reservation, so here > we introduce EXTENT_CLEAR_DATA_RESV flag to indicate btrfs_clear_bit_hook() > to update btrfs_space_info's bytes_may_use. > > Signed-off-by: Wang Xiaoguang <wangxg.fnst@cn.fujitsu.com> There's no point in introducing only a flag in one patch, collapse this into the patch that actually uses it. Thanks, Josef ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 3/4] btrfs: introduce new EXTENT_CLEAR_DATA_RESV flag 2016-07-20 13:22 ` Josef Bacik @ 2016-07-21 1:15 ` Wang Xiaoguang 0 siblings, 0 replies; 15+ messages in thread From: Wang Xiaoguang @ 2016-07-21 1:15 UTC (permalink / raw) To: Josef Bacik, linux-btrfs; +Cc: dsterba, holger hello, On 07/20/2016 09:22 PM, Josef Bacik wrote: > On 07/20/2016 01:56 AM, Wang Xiaoguang wrote: >> In next patch, btrfs_clear_bit_hook() will not call >> btrfs_free_reserved_data_space_noquota() to update btrfs_space_info's >> bytes_may_use unless it has EXTENT_DO_ACCOUNTING or >> EXTENT_CLEAR_DATA_RESV, >> as for the reason, please see the next patch for detailed info. >> >> As we know, usually EXTENT_DO_ACCOUNTING is used for error path. In >> run_delalloc_nocow(), for inode marked as NODATACOW or extent marked as >> PREALLOC, we also need to update bytes_may_use, but can not pass >> EXTENT_DO_ACCOUNTING, because it also clears metadata reservation, so >> here >> we introduce EXTENT_CLEAR_DATA_RESV flag to indicate >> btrfs_clear_bit_hook() >> to update btrfs_space_info's bytes_may_use. >> >> Signed-off-by: Wang Xiaoguang <wangxg.fnst@cn.fujitsu.com> > > There's no point in introducing only a flag in one patch, collapse > this into the patch that actually uses it. Thanks, OK Regards, Xiaoguang Wang > > Josef > > ^ permalink raw reply [flat|nested] 15+ messages in thread
* [PATCH 4/4] btrfs: update btrfs_space_info's bytes_may_use timely 2016-07-20 5:56 [PATCH 0/4] update bytes_may_use timely to avoid false ENOSPC issue Wang Xiaoguang ` (2 preceding siblings ...) 2016-07-20 5:56 ` [PATCH 3/4] btrfs: introduce new EXTENT_CLEAR_DATA_RESV flag Wang Xiaoguang @ 2016-07-20 5:56 ` Wang Xiaoguang 2016-07-20 13:35 ` Josef Bacik 2016-07-20 8:46 ` [PATCH 0/4] update bytes_may_use timely to avoid false ENOSPC issue Holger Hoffstätte 4 siblings, 1 reply; 15+ messages in thread From: Wang Xiaoguang @ 2016-07-20 5:56 UTC (permalink / raw) To: linux-btrfs; +Cc: dsterba, jbacik, holger This patch can fix some false ENOSPC errors, below test script can reproduce one false ENOSPC error: #!/bin/bash dd if=/dev/zero of=fs.img bs=$((1024*1024)) count=128 dev=$(losetup --show -f fs.img) mkfs.btrfs -f -M $dev mkdir /tmp/mntpoint mount $dev /tmp/mntpoint cd /tmp/mntpoint xfs_io -f -c "falloc 0 $((64*1024*1024))" testfile Above script will fail for ENOSPC reason, but indeed fs still has free space to satisfy this request. Please see call graph: btrfs_fallocate() |-> btrfs_alloc_data_chunk_ondemand() | bytes_may_use += 64M |-> btrfs_prealloc_file_range() |-> btrfs_reserve_extent() |-> btrfs_add_reserved_bytes() | alloc_type is RESERVE_ALLOC_NO_ACCOUNT, so it does not | change bytes_may_use, and bytes_reserved += 64M. Now | bytes_may_use + bytes_reserved == 128M, which is greater | than btrfs_space_info's total_bytes, false enospc occurs. | Note, the bytes_may_use decrease operation will done in | end of btrfs_fallocate(), which is too late. Here is another simple case for buffered write: CPU 1 | CPU 2 | |-> cow_file_range() |-> __btrfs_buffered_write() |-> btrfs_reserve_extent() | | | | | | | | | ..... | |-> btrfs_check_data_free_space() | | | | |-> extent_clear_unlock_delalloc() | In CPU 1, btrfs_reserve_extent()->find_free_extent()-> btrfs_add_reserved_bytes() do not decrease bytes_may_use, the decrease operation will be delayed to be done in extent_clear_unlock_delalloc(). Assume in this case, btrfs_reserve_extent() reserved 128MB data, CPU2's btrfs_check_data_free_space() tries to reserve 100MB data space. If 100MB > data_sinfo->total_bytes - data_sinfo->bytes_used - data_sinfo->bytes_reserved - data_sinfo->bytes_pinned - data_sinfo->bytes_readonly - data_sinfo->bytes_may_use btrfs_check_data_free_space() will try to allcate new data chunk or call btrfs_start_delalloc_roots(), or commit current transaction inorder to reserve some free space, obviously a lot of work. But indeed it's not necessary as long as decreasing bytes_may_use timely, we still have free space, decreasing 128M from bytes_may_use. To fix this issue, this patch chooses to update bytes_may_use for both data and metadata in btrfs_add_reserved_bytes(). For compress path, real extent length may not be equal to file content length, so introduce a ram_bytes argument for btrfs_reserve_extent(), find_free_extent() and btrfs_add_reserved_bytes(), it's becasue bytes_may_use is increased by file content length. Then compress path can update bytes_may_use correctly. Also now we can discard RESERVE_ALLOC_NO_ACCOUNT, RESERVE_ALLOC and RESERVE_FREE. For inode marked as NODATACOW or extent marked as PREALLOC, we can directly call btrfs_free_reserved_data_space() to adjust bytes_may_use. Meanwhile __btrfs_prealloc_file_range() will call btrfs_free_reserved_data_space() internally for both sucessful and failed path, btrfs_prealloc_file_range()'s callers does not need to call btrfs_free_reserved_data_space() any more. Signed-off-by: Wang Xiaoguang <wangxg.fnst@cn.fujitsu.com> --- fs/btrfs/ctree.h | 2 +- fs/btrfs/extent-tree.c | 56 +++++++++++++++++--------------------------------- fs/btrfs/file.c | 26 +++++++++++++---------- fs/btrfs/inode-map.c | 3 +-- fs/btrfs/inode.c | 37 ++++++++++++++++++++++++--------- fs/btrfs/relocation.c | 11 ++++++++-- 6 files changed, 72 insertions(+), 63 deletions(-) diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h index 4274a7b..7eb2913 100644 --- a/fs/btrfs/ctree.h +++ b/fs/btrfs/ctree.h @@ -2556,7 +2556,7 @@ int btrfs_alloc_logged_file_extent(struct btrfs_trans_handle *trans, struct btrfs_root *root, u64 root_objectid, u64 owner, u64 offset, struct btrfs_key *ins); -int btrfs_reserve_extent(struct btrfs_root *root, u64 num_bytes, +int btrfs_reserve_extent(struct btrfs_root *root, u64 ram_bytes, u64 num_bytes, u64 min_alloc_size, u64 empty_size, u64 hint_byte, struct btrfs_key *ins, int is_data, int delalloc); int btrfs_inc_ref(struct btrfs_trans_handle *trans, struct btrfs_root *root, diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c index 8eaac39..5447973 100644 --- a/fs/btrfs/extent-tree.c +++ b/fs/btrfs/extent-tree.c @@ -60,21 +60,6 @@ enum { CHUNK_ALLOC_FORCE = 2, }; -/* - * Control how reservations are dealt with. - * - * RESERVE_FREE - freeing a reservation. - * RESERVE_ALLOC - allocating space and we need to update bytes_may_use for - * ENOSPC accounting - * RESERVE_ALLOC_NO_ACCOUNT - allocating space and we should not update - * bytes_may_use as the ENOSPC accounting is done elsewhere - */ -enum { - RESERVE_FREE = 0, - RESERVE_ALLOC = 1, - RESERVE_ALLOC_NO_ACCOUNT = 2, -}; - static int update_block_group(struct btrfs_trans_handle *trans, struct btrfs_root *root, u64 bytenr, u64 num_bytes, int alloc); @@ -105,7 +90,7 @@ static int find_next_key(struct btrfs_path *path, int level, static void dump_space_info(struct btrfs_space_info *info, u64 bytes, int dump_block_groups); static int btrfs_add_reserved_bytes(struct btrfs_block_group_cache *cache, - u64 num_bytes, int reserve, int delalloc); + u64 ram_bytes, u64 num_bytes, int delalloc); static int btrfs_free_reserved_bytes(struct btrfs_block_group_cache *cache, u64 num_bytes, int delalloc); static int block_rsv_use_bytes(struct btrfs_block_rsv *block_rsv, @@ -3491,7 +3476,6 @@ again: dcs = BTRFS_DC_SETUP; else if (ret == -ENOSPC) set_bit(BTRFS_TRANS_CACHE_ENOSPC, &trans->transaction->flags); - btrfs_free_reserved_data_space(inode, 0, num_pages); out_put: iput(inode); @@ -6300,8 +6284,9 @@ void btrfs_wait_block_group_reservations(struct btrfs_block_group_cache *bg) /** * btrfs_add_reserved_bytes - update the block_group and space info counters * @cache: The cache we are manipulating + * @ram_bytes: The number of bytes of file content, and will be same to + * @num_bytes except for the compress path. * @num_bytes: The number of bytes in question - * @reserve: One of the reservation enums * @delalloc: The blocks are allocated for the delalloc write * * This is called by the allocator when it reserves space. Metadata @@ -6316,7 +6301,7 @@ void btrfs_wait_block_group_reservations(struct btrfs_block_group_cache *bg) * succeeds. */ static int btrfs_add_reserved_bytes(struct btrfs_block_group_cache *cache, - u64 num_bytes, int reserve, int delalloc) + u64 ram_bytes, u64 num_bytes, int delalloc) { struct btrfs_space_info *space_info = cache->space_info; int ret = 0; @@ -6328,13 +6313,11 @@ static int btrfs_add_reserved_bytes(struct btrfs_block_group_cache *cache, } else { cache->reserved += num_bytes; space_info->bytes_reserved += num_bytes; - if (reserve == RESERVE_ALLOC) { - trace_btrfs_space_reservation(cache->fs_info, - "space_info", space_info->flags, - num_bytes, 0); - space_info->bytes_may_use -= num_bytes; - } + trace_btrfs_space_reservation(cache->fs_info, + "space_info", space_info->flags, + num_bytes, 0); + space_info->bytes_may_use -= ram_bytes; if (delalloc) cache->delalloc_bytes += num_bytes; } @@ -7218,9 +7201,9 @@ btrfs_release_block_group(struct btrfs_block_group_cache *cache, * the free space extent currently. */ static noinline int find_free_extent(struct btrfs_root *orig_root, - u64 num_bytes, u64 empty_size, - u64 hint_byte, struct btrfs_key *ins, - u64 flags, int delalloc) + u64 ram_bytes, u64 num_bytes, u64 empty_size, + u64 hint_byte, struct btrfs_key *ins, + u64 flags, int delalloc) { int ret = 0; struct btrfs_root *root = orig_root->fs_info->extent_root; @@ -7232,8 +7215,6 @@ static noinline int find_free_extent(struct btrfs_root *orig_root, struct btrfs_space_info *space_info; int loop = 0; int index = __get_raid_index(flags); - int alloc_type = (flags & BTRFS_BLOCK_GROUP_DATA) ? - RESERVE_ALLOC_NO_ACCOUNT : RESERVE_ALLOC; bool failed_cluster_refill = false; bool failed_alloc = false; bool use_cluster = true; @@ -7565,8 +7546,8 @@ checks: search_start - offset); BUG_ON(offset > search_start); - ret = btrfs_add_reserved_bytes(block_group, num_bytes, - alloc_type, delalloc); + ret = btrfs_add_reserved_bytes(block_group, ram_bytes, + num_bytes, delalloc); if (ret == -EAGAIN) { btrfs_add_free_space(block_group, offset, num_bytes); goto loop; @@ -7739,7 +7720,7 @@ again: up_read(&info->groups_sem); } -int btrfs_reserve_extent(struct btrfs_root *root, +int btrfs_reserve_extent(struct btrfs_root *root, u64 ram_bytes, u64 num_bytes, u64 min_alloc_size, u64 empty_size, u64 hint_byte, struct btrfs_key *ins, int is_data, int delalloc) @@ -7751,8 +7732,8 @@ int btrfs_reserve_extent(struct btrfs_root *root, flags = btrfs_get_alloc_profile(root, is_data); again: WARN_ON(num_bytes < root->sectorsize); - ret = find_free_extent(root, num_bytes, empty_size, hint_byte, ins, - flags, delalloc); + ret = find_free_extent(root, ram_bytes, num_bytes, empty_size, + hint_byte, ins, flags, delalloc); if (!ret && !is_data) { btrfs_dec_block_group_reservations(root->fs_info, ins->objectid); @@ -7761,6 +7742,7 @@ again: num_bytes = min(num_bytes >> 1, ins->offset); num_bytes = round_down(num_bytes, root->sectorsize); num_bytes = max(num_bytes, min_alloc_size); + ram_bytes = num_bytes; if (num_bytes == min_alloc_size) final_tried = true; goto again; @@ -8029,7 +8011,7 @@ int btrfs_alloc_logged_file_extent(struct btrfs_trans_handle *trans, return -EINVAL; ret = btrfs_add_reserved_bytes(block_group, ins->offset, - RESERVE_ALLOC_NO_ACCOUNT, 0); + ins->offset, 0); BUG_ON(ret); /* logic error */ ret = alloc_reserved_file_extent(trans, root, 0, root_objectid, 0, owner, offset, ins, 1); @@ -8171,7 +8153,7 @@ struct extent_buffer *btrfs_alloc_tree_block(struct btrfs_trans_handle *trans, if (IS_ERR(block_rsv)) return ERR_CAST(block_rsv); - ret = btrfs_reserve_extent(root, blocksize, blocksize, + ret = btrfs_reserve_extent(root, blocksize, blocksize, blocksize, empty_size, hint, &ins, 0, 0); if (ret) goto out_unuse; diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c index 2234e88..b4d9258 100644 --- a/fs/btrfs/file.c +++ b/fs/btrfs/file.c @@ -2669,6 +2669,7 @@ static long btrfs_fallocate(struct file *file, int mode, alloc_start = round_down(offset, blocksize); alloc_end = round_up(offset + len, blocksize); + cur_offset = alloc_start; /* Make sure we aren't being give some crap mode */ if (mode & ~(FALLOC_FL_KEEP_SIZE | FALLOC_FL_PUNCH_HOLE)) @@ -2761,7 +2762,6 @@ static long btrfs_fallocate(struct file *file, int mode, /* First, check if we exceed the qgroup limit */ INIT_LIST_HEAD(&reserve_list); - cur_offset = alloc_start; while (1) { em = btrfs_get_extent(inode, NULL, 0, cur_offset, alloc_end - cur_offset, 0); @@ -2788,6 +2788,14 @@ static long btrfs_fallocate(struct file *file, int mode, last_byte - cur_offset); if (ret < 0) break; + } else { + /* + * Do not need to reserve unwritten extent for this + * range, free reserved data space first, otherwise + * it'll result in false ENOSPC error. + */ + btrfs_free_reserved_data_space(inode, cur_offset, + last_byte - cur_offset); } free_extent_map(em); cur_offset = last_byte; @@ -2805,6 +2813,9 @@ static long btrfs_fallocate(struct file *file, int mode, range->start, range->len, 1 << inode->i_blkbits, offset + len, &alloc_hint); + else + btrfs_free_reserved_data_space(inode, range->start, + range->len); list_del(&range->list); kfree(range); } @@ -2839,18 +2850,11 @@ out_unlock: unlock_extent_cached(&BTRFS_I(inode)->io_tree, alloc_start, locked_end, &cached_state, GFP_KERNEL); out: - /* - * As we waited the extent range, the data_rsv_map must be empty - * in the range, as written data range will be released from it. - * And for prealloacted extent, it will also be released when - * its metadata is written. - * So this is completely used as cleanup. - */ - btrfs_qgroup_free_data(inode, alloc_start, alloc_end - alloc_start); inode_unlock(inode); /* Let go of our reservation. */ - btrfs_free_reserved_data_space(inode, alloc_start, - alloc_end - alloc_start); + if (ret != 0) + btrfs_free_reserved_data_space(inode, alloc_start, + alloc_end - cur_offset); return ret; } diff --git a/fs/btrfs/inode-map.c b/fs/btrfs/inode-map.c index 70107f7..e59e7d6 100644 --- a/fs/btrfs/inode-map.c +++ b/fs/btrfs/inode-map.c @@ -495,10 +495,9 @@ again: ret = btrfs_prealloc_file_range_trans(inode, trans, 0, 0, prealloc, prealloc, prealloc, &alloc_hint); if (ret) { - btrfs_delalloc_release_space(inode, 0, prealloc); + btrfs_delalloc_release_metadata(inode, prealloc); goto out_put; } - btrfs_free_reserved_data_space(inode, 0, prealloc); ret = btrfs_write_out_ino_cache(root, trans, path, inode); out_put: diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c index 4421954..e0cee59 100644 --- a/fs/btrfs/inode.c +++ b/fs/btrfs/inode.c @@ -564,6 +564,8 @@ cont: PAGE_SET_WRITEBACK | page_error_op | PAGE_END_WRITEBACK); + btrfs_free_reserved_data_space_noquota(inode, start, + end - start + 1); goto free_pages_out; } } @@ -739,7 +741,7 @@ retry: lock_extent(io_tree, async_extent->start, async_extent->start + async_extent->ram_size - 1); - ret = btrfs_reserve_extent(root, + ret = btrfs_reserve_extent(root, async_extent->ram_size, async_extent->compressed_size, async_extent->compressed_size, 0, alloc_hint, &ins, 1, 1); @@ -966,7 +968,8 @@ static noinline int cow_file_range(struct inode *inode, EXTENT_DEFRAG, PAGE_UNLOCK | PAGE_CLEAR_DIRTY | PAGE_SET_WRITEBACK | PAGE_END_WRITEBACK); - + btrfs_free_reserved_data_space_noquota(inode, start, + end - start + 1); *nr_written = *nr_written + (end - start + PAGE_SIZE) / PAGE_SIZE; *page_started = 1; @@ -986,7 +989,7 @@ static noinline int cow_file_range(struct inode *inode, unsigned long op; cur_alloc_size = disk_num_bytes; - ret = btrfs_reserve_extent(root, cur_alloc_size, + ret = btrfs_reserve_extent(root, cur_alloc_size, cur_alloc_size, root->sectorsize, 0, alloc_hint, &ins, 1, 1); if (ret < 0) @@ -1485,8 +1488,10 @@ out_check: extent_clear_unlock_delalloc(inode, cur_offset, cur_offset + num_bytes - 1, locked_page, EXTENT_LOCKED | - EXTENT_DELALLOC, PAGE_UNLOCK | - PAGE_SET_PRIVATE2); + EXTENT_DELALLOC | + EXTENT_CLEAR_DATA_RESV, + PAGE_UNLOCK | PAGE_SET_PRIVATE2); + if (!nolock && nocow) btrfs_end_write_no_snapshoting(root); cur_offset = extent_end; @@ -1803,7 +1808,9 @@ static void btrfs_clear_bit_hook(struct inode *inode, return; if (root->root_key.objectid != BTRFS_DATA_RELOC_TREE_OBJECTID - && do_list && !(state->state & EXTENT_NORESERVE)) + && do_list && !(state->state & EXTENT_NORESERVE) + && (*bits & (EXTENT_DO_ACCOUNTING | + EXTENT_CLEAR_DATA_RESV))) btrfs_free_reserved_data_space_noquota(inode, state->start, len); @@ -7214,7 +7221,7 @@ static struct extent_map *btrfs_new_extent_direct(struct inode *inode, int ret; alloc_hint = get_extent_allocation_hint(inode, start, len); - ret = btrfs_reserve_extent(root, len, root->sectorsize, 0, + ret = btrfs_reserve_extent(root, len, len, root->sectorsize, 0, alloc_hint, &ins, 1, 1); if (ret) return ERR_PTR(ret); @@ -7714,6 +7721,13 @@ static int btrfs_get_blocks_direct(struct inode *inode, sector_t iblock, ret = PTR_ERR(em2); goto unlock_err; } + /* + * For inode marked NODATACOW or extent marked PREALLOC, + * use the existing or preallocated extent, so does not + * need to adjust btrfs_space_info's bytes_may_use. + */ + btrfs_free_reserved_data_space_noquota(inode, + start, len); goto unlock; } } @@ -7748,7 +7762,6 @@ unlock: i_size_write(inode, start + len); adjust_dio_outstanding_extents(inode, dio_data, len); - btrfs_free_reserved_data_space(inode, start, len); WARN_ON(dio_data->reserve < len); dio_data->reserve -= len; dio_data->unsubmitted_oe_range_end = start + len; @@ -10269,6 +10282,7 @@ static int __btrfs_prealloc_file_range(struct inode *inode, int mode, u64 last_alloc = (u64)-1; int ret = 0; bool own_trans = true; + u64 end = start + num_bytes - 1; if (trans) own_trans = false; @@ -10290,8 +10304,8 @@ static int __btrfs_prealloc_file_range(struct inode *inode, int mode, * sized chunks. */ cur_bytes = min(cur_bytes, last_alloc); - ret = btrfs_reserve_extent(root, cur_bytes, min_size, 0, - *alloc_hint, &ins, 1, 0); + ret = btrfs_reserve_extent(root, cur_bytes, cur_bytes, + min_size, 0, *alloc_hint, &ins, 1, 0); if (ret) { if (own_trans) btrfs_end_transaction(trans, root); @@ -10377,6 +10391,9 @@ next: if (own_trans) btrfs_end_transaction(trans, root); } + if (cur_offset < end) + btrfs_free_reserved_data_space(inode, cur_offset, + end - cur_offset + 1); return ret; } diff --git a/fs/btrfs/relocation.c b/fs/btrfs/relocation.c index a0de885..f39c4db 100644 --- a/fs/btrfs/relocation.c +++ b/fs/btrfs/relocation.c @@ -3032,6 +3032,7 @@ int prealloc_file_extent_cluster(struct inode *inode, int ret = 0; u64 prealloc_start = cluster->start - offset; u64 prealloc_end = cluster->end - offset; + u64 cur_offset; BUG_ON(cluster->start != cluster->boundary[0]); inode_lock(inode); @@ -3041,6 +3042,7 @@ int prealloc_file_extent_cluster(struct inode *inode, if (ret) goto out; + cur_offset = prealloc_start; while (nr < cluster->nr) { start = cluster->boundary[nr] - offset; if (nr + 1 < cluster->nr) @@ -3050,16 +3052,21 @@ int prealloc_file_extent_cluster(struct inode *inode, lock_extent(&BTRFS_I(inode)->io_tree, start, end); num_bytes = end + 1 - start; + if (cur_offset < start) + btrfs_free_reserved_data_space(inode, cur_offset, + start - cur_offset); ret = btrfs_prealloc_file_range(inode, 0, start, num_bytes, num_bytes, end + 1, &alloc_hint); + cur_offset = end + 1; unlock_extent(&BTRFS_I(inode)->io_tree, start, end); if (ret) break; nr++; } - btrfs_free_reserved_data_space(inode, prealloc_start, - prealloc_end + 1 - prealloc_start); + if (cur_offset < prealloc_end) + btrfs_free_reserved_data_space(inode, cur_offset, + prealloc_end + 1 - cur_offset); out: inode_unlock(inode); return ret; -- 2.9.0 ^ permalink raw reply related [flat|nested] 15+ messages in thread
* Re: [PATCH 4/4] btrfs: update btrfs_space_info's bytes_may_use timely 2016-07-20 5:56 ` [PATCH 4/4] btrfs: update btrfs_space_info's bytes_may_use timely Wang Xiaoguang @ 2016-07-20 13:35 ` Josef Bacik 2016-07-21 1:18 ` Wang Xiaoguang 0 siblings, 1 reply; 15+ messages in thread From: Josef Bacik @ 2016-07-20 13:35 UTC (permalink / raw) To: Wang Xiaoguang, linux-btrfs; +Cc: dsterba, holger On 07/20/2016 01:56 AM, Wang Xiaoguang wrote: > This patch can fix some false ENOSPC errors, below test script can > reproduce one false ENOSPC error: > #!/bin/bash > dd if=/dev/zero of=fs.img bs=$((1024*1024)) count=128 > dev=$(losetup --show -f fs.img) > mkfs.btrfs -f -M $dev > mkdir /tmp/mntpoint > mount $dev /tmp/mntpoint > cd /tmp/mntpoint > xfs_io -f -c "falloc 0 $((64*1024*1024))" testfile > > Above script will fail for ENOSPC reason, but indeed fs still has free > space to satisfy this request. Please see call graph: > btrfs_fallocate() > |-> btrfs_alloc_data_chunk_ondemand() > | bytes_may_use += 64M > |-> btrfs_prealloc_file_range() > |-> btrfs_reserve_extent() > |-> btrfs_add_reserved_bytes() > | alloc_type is RESERVE_ALLOC_NO_ACCOUNT, so it does not > | change bytes_may_use, and bytes_reserved += 64M. Now > | bytes_may_use + bytes_reserved == 128M, which is greater > | than btrfs_space_info's total_bytes, false enospc occurs. > | Note, the bytes_may_use decrease operation will done in > | end of btrfs_fallocate(), which is too late. > > Here is another simple case for buffered write: > CPU 1 | CPU 2 > | > |-> cow_file_range() |-> __btrfs_buffered_write() > |-> btrfs_reserve_extent() | | > | | | > | | | > | ..... | |-> btrfs_check_data_free_space() > | | > | | > |-> extent_clear_unlock_delalloc() | > > In CPU 1, btrfs_reserve_extent()->find_free_extent()-> > btrfs_add_reserved_bytes() do not decrease bytes_may_use, the decrease > operation will be delayed to be done in extent_clear_unlock_delalloc(). > Assume in this case, btrfs_reserve_extent() reserved 128MB data, CPU2's > btrfs_check_data_free_space() tries to reserve 100MB data space. > If > 100MB > data_sinfo->total_bytes - data_sinfo->bytes_used - > data_sinfo->bytes_reserved - data_sinfo->bytes_pinned - > data_sinfo->bytes_readonly - data_sinfo->bytes_may_use > btrfs_check_data_free_space() will try to allcate new data chunk or call > btrfs_start_delalloc_roots(), or commit current transaction inorder to > reserve some free space, obviously a lot of work. But indeed it's not > necessary as long as decreasing bytes_may_use timely, we still have > free space, decreasing 128M from bytes_may_use. > > To fix this issue, this patch chooses to update bytes_may_use for both > data and metadata in btrfs_add_reserved_bytes(). For compress path, real > extent length may not be equal to file content length, so introduce a > ram_bytes argument for btrfs_reserve_extent(), find_free_extent() and > btrfs_add_reserved_bytes(), it's becasue bytes_may_use is increased by > file content length. Then compress path can update bytes_may_use > correctly. Also now we can discard RESERVE_ALLOC_NO_ACCOUNT, RESERVE_ALLOC > and RESERVE_FREE. > > For inode marked as NODATACOW or extent marked as PREALLOC, we can > directly call btrfs_free_reserved_data_space() to adjust bytes_may_use. > > Meanwhile __btrfs_prealloc_file_range() will call > btrfs_free_reserved_data_space() internally for both sucessful and failed > path, btrfs_prealloc_file_range()'s callers does not need to call > btrfs_free_reserved_data_space() any more. > > Signed-off-by: Wang Xiaoguang <wangxg.fnst@cn.fujitsu.com> > --- > fs/btrfs/ctree.h | 2 +- > fs/btrfs/extent-tree.c | 56 +++++++++++++++++--------------------------------- > fs/btrfs/file.c | 26 +++++++++++++---------- > fs/btrfs/inode-map.c | 3 +-- > fs/btrfs/inode.c | 37 ++++++++++++++++++++++++--------- > fs/btrfs/relocation.c | 11 ++++++++-- > 6 files changed, 72 insertions(+), 63 deletions(-) > > diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h > index 4274a7b..7eb2913 100644 > --- a/fs/btrfs/ctree.h > +++ b/fs/btrfs/ctree.h > @@ -2556,7 +2556,7 @@ int btrfs_alloc_logged_file_extent(struct btrfs_trans_handle *trans, > struct btrfs_root *root, > u64 root_objectid, u64 owner, u64 offset, > struct btrfs_key *ins); > -int btrfs_reserve_extent(struct btrfs_root *root, u64 num_bytes, > +int btrfs_reserve_extent(struct btrfs_root *root, u64 ram_bytes, u64 num_bytes, > u64 min_alloc_size, u64 empty_size, u64 hint_byte, > struct btrfs_key *ins, int is_data, int delalloc); > int btrfs_inc_ref(struct btrfs_trans_handle *trans, struct btrfs_root *root, > diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c > index 8eaac39..5447973 100644 > --- a/fs/btrfs/extent-tree.c > +++ b/fs/btrfs/extent-tree.c > @@ -60,21 +60,6 @@ enum { > CHUNK_ALLOC_FORCE = 2, > }; > > -/* > - * Control how reservations are dealt with. > - * > - * RESERVE_FREE - freeing a reservation. > - * RESERVE_ALLOC - allocating space and we need to update bytes_may_use for > - * ENOSPC accounting > - * RESERVE_ALLOC_NO_ACCOUNT - allocating space and we should not update > - * bytes_may_use as the ENOSPC accounting is done elsewhere > - */ > -enum { > - RESERVE_FREE = 0, > - RESERVE_ALLOC = 1, > - RESERVE_ALLOC_NO_ACCOUNT = 2, > -}; > - > static int update_block_group(struct btrfs_trans_handle *trans, > struct btrfs_root *root, u64 bytenr, > u64 num_bytes, int alloc); > @@ -105,7 +90,7 @@ static int find_next_key(struct btrfs_path *path, int level, > static void dump_space_info(struct btrfs_space_info *info, u64 bytes, > int dump_block_groups); > static int btrfs_add_reserved_bytes(struct btrfs_block_group_cache *cache, > - u64 num_bytes, int reserve, int delalloc); > + u64 ram_bytes, u64 num_bytes, int delalloc); > static int btrfs_free_reserved_bytes(struct btrfs_block_group_cache *cache, > u64 num_bytes, int delalloc); > static int block_rsv_use_bytes(struct btrfs_block_rsv *block_rsv, > @@ -3491,7 +3476,6 @@ again: > dcs = BTRFS_DC_SETUP; > else if (ret == -ENOSPC) > set_bit(BTRFS_TRANS_CACHE_ENOSPC, &trans->transaction->flags); > - btrfs_free_reserved_data_space(inode, 0, num_pages); > > out_put: > iput(inode); > @@ -6300,8 +6284,9 @@ void btrfs_wait_block_group_reservations(struct btrfs_block_group_cache *bg) > /** > * btrfs_add_reserved_bytes - update the block_group and space info counters > * @cache: The cache we are manipulating > + * @ram_bytes: The number of bytes of file content, and will be same to > + * @num_bytes except for the compress path. > * @num_bytes: The number of bytes in question > - * @reserve: One of the reservation enums > * @delalloc: The blocks are allocated for the delalloc write > * > * This is called by the allocator when it reserves space. Metadata > @@ -6316,7 +6301,7 @@ void btrfs_wait_block_group_reservations(struct btrfs_block_group_cache *bg) > * succeeds. > */ > static int btrfs_add_reserved_bytes(struct btrfs_block_group_cache *cache, > - u64 num_bytes, int reserve, int delalloc) > + u64 ram_bytes, u64 num_bytes, int delalloc) > { > struct btrfs_space_info *space_info = cache->space_info; > int ret = 0; > @@ -6328,13 +6313,11 @@ static int btrfs_add_reserved_bytes(struct btrfs_block_group_cache *cache, > } else { > cache->reserved += num_bytes; > space_info->bytes_reserved += num_bytes; > - if (reserve == RESERVE_ALLOC) { > - trace_btrfs_space_reservation(cache->fs_info, > - "space_info", space_info->flags, > - num_bytes, 0); > - space_info->bytes_may_use -= num_bytes; > - } > > + trace_btrfs_space_reservation(cache->fs_info, > + "space_info", space_info->flags, > + num_bytes, 0); This needs to be ram_bytes to keep the accounting consistent for tools that use these tracepoints. Thanks, Josef ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 4/4] btrfs: update btrfs_space_info's bytes_may_use timely 2016-07-20 13:35 ` Josef Bacik @ 2016-07-21 1:18 ` Wang Xiaoguang 0 siblings, 0 replies; 15+ messages in thread From: Wang Xiaoguang @ 2016-07-21 1:18 UTC (permalink / raw) To: Josef Bacik, linux-btrfs; +Cc: dsterba, holger hello, On 07/20/2016 09:35 PM, Josef Bacik wrote: > On 07/20/2016 01:56 AM, Wang Xiaoguang wrote: >> This patch can fix some false ENOSPC errors, below test script can >> reproduce one false ENOSPC error: >> #!/bin/bash >> dd if=/dev/zero of=fs.img bs=$((1024*1024)) count=128 >> dev=$(losetup --show -f fs.img) >> mkfs.btrfs -f -M $dev >> mkdir /tmp/mntpoint >> mount $dev /tmp/mntpoint >> cd /tmp/mntpoint >> xfs_io -f -c "falloc 0 $((64*1024*1024))" testfile >> >> Above script will fail for ENOSPC reason, but indeed fs still has free >> space to satisfy this request. Please see call graph: >> btrfs_fallocate() >> |-> btrfs_alloc_data_chunk_ondemand() >> | bytes_may_use += 64M >> |-> btrfs_prealloc_file_range() >> |-> btrfs_reserve_extent() >> |-> btrfs_add_reserved_bytes() >> | alloc_type is RESERVE_ALLOC_NO_ACCOUNT, so it does not >> | change bytes_may_use, and bytes_reserved += 64M. Now >> | bytes_may_use + bytes_reserved == 128M, which is greater >> | than btrfs_space_info's total_bytes, false enospc occurs. >> | Note, the bytes_may_use decrease operation will done in >> | end of btrfs_fallocate(), which is too late. >> >> Here is another simple case for buffered write: >> CPU 1 | CPU 2 >> | >> |-> cow_file_range() |-> __btrfs_buffered_write() >> |-> btrfs_reserve_extent() | | >> | | | >> | | | >> | ..... | |-> >> btrfs_check_data_free_space() >> | | >> | | >> |-> extent_clear_unlock_delalloc() | >> >> In CPU 1, btrfs_reserve_extent()->find_free_extent()-> >> btrfs_add_reserved_bytes() do not decrease bytes_may_use, the decrease >> operation will be delayed to be done in extent_clear_unlock_delalloc(). >> Assume in this case, btrfs_reserve_extent() reserved 128MB data, CPU2's >> btrfs_check_data_free_space() tries to reserve 100MB data space. >> If >> 100MB > data_sinfo->total_bytes - data_sinfo->bytes_used - >> data_sinfo->bytes_reserved - data_sinfo->bytes_pinned - >> data_sinfo->bytes_readonly - data_sinfo->bytes_may_use >> btrfs_check_data_free_space() will try to allcate new data chunk or call >> btrfs_start_delalloc_roots(), or commit current transaction inorder to >> reserve some free space, obviously a lot of work. But indeed it's not >> necessary as long as decreasing bytes_may_use timely, we still have >> free space, decreasing 128M from bytes_may_use. >> >> To fix this issue, this patch chooses to update bytes_may_use for both >> data and metadata in btrfs_add_reserved_bytes(). For compress path, real >> extent length may not be equal to file content length, so introduce a >> ram_bytes argument for btrfs_reserve_extent(), find_free_extent() and >> btrfs_add_reserved_bytes(), it's becasue bytes_may_use is increased by >> file content length. Then compress path can update bytes_may_use >> correctly. Also now we can discard RESERVE_ALLOC_NO_ACCOUNT, >> RESERVE_ALLOC >> and RESERVE_FREE. >> >> For inode marked as NODATACOW or extent marked as PREALLOC, we can >> directly call btrfs_free_reserved_data_space() to adjust bytes_may_use. >> >> Meanwhile __btrfs_prealloc_file_range() will call >> btrfs_free_reserved_data_space() internally for both sucessful and >> failed >> path, btrfs_prealloc_file_range()'s callers does not need to call >> btrfs_free_reserved_data_space() any more. >> >> Signed-off-by: Wang Xiaoguang <wangxg.fnst@cn.fujitsu.com> >> --- >> fs/btrfs/ctree.h | 2 +- >> fs/btrfs/extent-tree.c | 56 >> +++++++++++++++++--------------------------------- >> fs/btrfs/file.c | 26 +++++++++++++---------- >> fs/btrfs/inode-map.c | 3 +-- >> fs/btrfs/inode.c | 37 ++++++++++++++++++++++++--------- >> fs/btrfs/relocation.c | 11 ++++++++-- >> 6 files changed, 72 insertions(+), 63 deletions(-) >> >> diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h >> index 4274a7b..7eb2913 100644 >> --- a/fs/btrfs/ctree.h >> +++ b/fs/btrfs/ctree.h >> @@ -2556,7 +2556,7 @@ int btrfs_alloc_logged_file_extent(struct >> btrfs_trans_handle *trans, >> struct btrfs_root *root, >> u64 root_objectid, u64 owner, u64 offset, >> struct btrfs_key *ins); >> -int btrfs_reserve_extent(struct btrfs_root *root, u64 num_bytes, >> +int btrfs_reserve_extent(struct btrfs_root *root, u64 ram_bytes, u64 >> num_bytes, >> u64 min_alloc_size, u64 empty_size, u64 hint_byte, >> struct btrfs_key *ins, int is_data, int delalloc); >> int btrfs_inc_ref(struct btrfs_trans_handle *trans, struct >> btrfs_root *root, >> diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c >> index 8eaac39..5447973 100644 >> --- a/fs/btrfs/extent-tree.c >> +++ b/fs/btrfs/extent-tree.c >> @@ -60,21 +60,6 @@ enum { >> CHUNK_ALLOC_FORCE = 2, >> }; >> >> -/* >> - * Control how reservations are dealt with. >> - * >> - * RESERVE_FREE - freeing a reservation. >> - * RESERVE_ALLOC - allocating space and we need to update >> bytes_may_use for >> - * ENOSPC accounting >> - * RESERVE_ALLOC_NO_ACCOUNT - allocating space and we should not update >> - * bytes_may_use as the ENOSPC accounting is done elsewhere >> - */ >> -enum { >> - RESERVE_FREE = 0, >> - RESERVE_ALLOC = 1, >> - RESERVE_ALLOC_NO_ACCOUNT = 2, >> -}; >> - >> static int update_block_group(struct btrfs_trans_handle *trans, >> struct btrfs_root *root, u64 bytenr, >> u64 num_bytes, int alloc); >> @@ -105,7 +90,7 @@ static int find_next_key(struct btrfs_path *path, >> int level, >> static void dump_space_info(struct btrfs_space_info *info, u64 bytes, >> int dump_block_groups); >> static int btrfs_add_reserved_bytes(struct btrfs_block_group_cache >> *cache, >> - u64 num_bytes, int reserve, int delalloc); >> + u64 ram_bytes, u64 num_bytes, int delalloc); >> static int btrfs_free_reserved_bytes(struct btrfs_block_group_cache >> *cache, >> u64 num_bytes, int delalloc); >> static int block_rsv_use_bytes(struct btrfs_block_rsv *block_rsv, >> @@ -3491,7 +3476,6 @@ again: >> dcs = BTRFS_DC_SETUP; >> else if (ret == -ENOSPC) >> set_bit(BTRFS_TRANS_CACHE_ENOSPC, &trans->transaction->flags); >> - btrfs_free_reserved_data_space(inode, 0, num_pages); >> >> out_put: >> iput(inode); >> @@ -6300,8 +6284,9 @@ void btrfs_wait_block_group_reservations(struct >> btrfs_block_group_cache *bg) >> /** >> * btrfs_add_reserved_bytes - update the block_group and space info >> counters >> * @cache: The cache we are manipulating >> + * @ram_bytes: The number of bytes of file content, and will be >> same to >> + * @num_bytes except for the compress path. >> * @num_bytes: The number of bytes in question >> - * @reserve: One of the reservation enums >> * @delalloc: The blocks are allocated for the delalloc write >> * >> * This is called by the allocator when it reserves space. Metadata >> @@ -6316,7 +6301,7 @@ void btrfs_wait_block_group_reservations(struct >> btrfs_block_group_cache *bg) >> * succeeds. >> */ >> static int btrfs_add_reserved_bytes(struct btrfs_block_group_cache >> *cache, >> - u64 num_bytes, int reserve, int delalloc) >> + u64 ram_bytes, u64 num_bytes, int delalloc) >> { >> struct btrfs_space_info *space_info = cache->space_info; >> int ret = 0; >> @@ -6328,13 +6313,11 @@ static int btrfs_add_reserved_bytes(struct >> btrfs_block_group_cache *cache, >> } else { >> cache->reserved += num_bytes; >> space_info->bytes_reserved += num_bytes; >> - if (reserve == RESERVE_ALLOC) { >> - trace_btrfs_space_reservation(cache->fs_info, >> - "space_info", space_info->flags, >> - num_bytes, 0); >> - space_info->bytes_may_use -= num_bytes; >> - } >> >> + trace_btrfs_space_reservation(cache->fs_info, >> + "space_info", space_info->flags, >> + num_bytes, 0); > > This needs to be ram_bytes to keep the accounting consistent for tools > that use these tracepoints. Thanks, OK, I'll fix this issue in later version. Regards, Xiaoguang Wang > > Josef > > ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 0/4] update bytes_may_use timely to avoid false ENOSPC issue 2016-07-20 5:56 [PATCH 0/4] update bytes_may_use timely to avoid false ENOSPC issue Wang Xiaoguang ` (3 preceding siblings ...) 2016-07-20 5:56 ` [PATCH 4/4] btrfs: update btrfs_space_info's bytes_may_use timely Wang Xiaoguang @ 2016-07-20 8:46 ` Holger Hoffstätte 2016-07-21 1:51 ` Wang Xiaoguang 4 siblings, 1 reply; 15+ messages in thread From: Holger Hoffstätte @ 2016-07-20 8:46 UTC (permalink / raw) To: Wang Xiaoguang, linux-btrfs; +Cc: dsterba, jbacik On 07/20/16 07:56, Wang Xiaoguang wrote: > Currently in btrfs, for data space reservation, it does not update > bytes_may_use in btrfs_update_reserved_bytes() and the decrease operation > will be delayed to be done in extent_clear_unlock_delalloc(), for > fallocate(2), decrease operation is even delayed to be done in end > of btrfs_fallocate(), which is too late. Obviously such delay will > cause unnecessary pressure to enospc system, in [PATCH 4/4], there is > a simpel test script that can reveal such false ENOSPC bug. > > So in this patch set, it will remove RESERVE_FREE, RESERVE_ALLOC and > RESERVE_ALLOC_NO_ACCOUNT, and we always update bytes_may_use timely. > > I'll also commit a fstests test case for this issue. > > Wang Xiaoguang (4): > btrfs: use correct offset for reloc_inode in > prealloc_file_extent_cluster() > btrfs: divide btrfs_update_reserved_bytes() into two functions > btrfs: introduce new EXTENT_CLEAR_DATA_RESV flag > btrfs: update btrfs_space_info's bytes_may_use timely > Just like the previous version, for all 4 patches: Tested-by: Holger Hoffstätte <holger@applied-asynchrony.com> via the reproducer script & some very large manual fallocates. thanks! Holger ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 0/4] update bytes_may_use timely to avoid false ENOSPC issue 2016-07-20 8:46 ` [PATCH 0/4] update bytes_may_use timely to avoid false ENOSPC issue Holger Hoffstätte @ 2016-07-21 1:51 ` Wang Xiaoguang 0 siblings, 0 replies; 15+ messages in thread From: Wang Xiaoguang @ 2016-07-21 1:51 UTC (permalink / raw) To: Holger Hoffstätte, linux-btrfs; +Cc: dsterba, jbacik hello, On 07/20/2016 04:46 PM, Holger Hoffstätte wrote: > On 07/20/16 07:56, Wang Xiaoguang wrote: >> Currently in btrfs, for data space reservation, it does not update >> bytes_may_use in btrfs_update_reserved_bytes() and the decrease operation >> will be delayed to be done in extent_clear_unlock_delalloc(), for >> fallocate(2), decrease operation is even delayed to be done in end >> of btrfs_fallocate(), which is too late. Obviously such delay will >> cause unnecessary pressure to enospc system, in [PATCH 4/4], there is >> a simpel test script that can reveal such false ENOSPC bug. >> >> So in this patch set, it will remove RESERVE_FREE, RESERVE_ALLOC and >> RESERVE_ALLOC_NO_ACCOUNT, and we always update bytes_may_use timely. >> >> I'll also commit a fstests test case for this issue. >> >> Wang Xiaoguang (4): >> btrfs: use correct offset for reloc_inode in >> prealloc_file_extent_cluster() >> btrfs: divide btrfs_update_reserved_bytes() into two functions >> btrfs: introduce new EXTENT_CLEAR_DATA_RESV flag >> btrfs: update btrfs_space_info's bytes_may_use timely >> > Just like the previous version, for all 4 patches: > > Tested-by: Holger Hoffstätte <holger@applied-asynchrony.com> > > via the reproducer script & some very large manual fallocates. Yes, thanks very much for your test. I also run the whole fstests before sending this patch set :) Regards, Xiaoguang Wang > > thanks! > Holger > > > ^ permalink raw reply [flat|nested] 15+ messages in thread
end of thread, other threads:[~2016-07-21 13:05 UTC | newest] Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2016-07-20 5:56 [PATCH 0/4] update bytes_may_use timely to avoid false ENOSPC issue Wang Xiaoguang 2016-07-20 5:56 ` [PATCH 1/4] btrfs: use correct offset for reloc_inode in prealloc_file_extent_cluster() Wang Xiaoguang 2016-07-20 13:18 ` Josef Bacik 2016-07-21 1:49 ` Wang Xiaoguang 2016-07-21 13:05 ` Josef Bacik 2016-07-20 5:56 ` [PATCH 2/4] btrfs: divide btrfs_update_reserved_bytes() into two functions Wang Xiaoguang 2016-07-20 13:21 ` Josef Bacik 2016-07-20 5:56 ` [PATCH 3/4] btrfs: introduce new EXTENT_CLEAR_DATA_RESV flag Wang Xiaoguang 2016-07-20 13:22 ` Josef Bacik 2016-07-21 1:15 ` Wang Xiaoguang 2016-07-20 5:56 ` [PATCH 4/4] btrfs: update btrfs_space_info's bytes_may_use timely Wang Xiaoguang 2016-07-20 13:35 ` Josef Bacik 2016-07-21 1:18 ` Wang Xiaoguang 2016-07-20 8:46 ` [PATCH 0/4] update bytes_may_use timely to avoid false ENOSPC issue Holger Hoffstätte 2016-07-21 1:51 ` Wang Xiaoguang
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.