All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] btrfs: wait on v1 cache for log replay
@ 2020-11-05 20:36 Josef Bacik
  2020-11-05 20:43 ` Filipe Manana
  2020-11-05 21:32 ` David Sterba
  0 siblings, 2 replies; 3+ messages in thread
From: Josef Bacik @ 2020-11-05 20:36 UTC (permalink / raw)
  To: linux-btrfs, kernel-team

Filipe reported btrfs/159 and btrfs/201 failures with the latest
misc-next, and bisected it to my change to always async the loading of
the v1 cache.  This is because when replaying the log we expect that the
free space cache will be read entirely before we start excluding space
to replay.  However this obviously changed, and thus we ended up
overwriting things that were allocated during replay.  Fix this by
exporting the helper to wait on v1 space cache and use it for this
exclusion step.  I've audited everywhere else and we are OK with all
other callers.  Anywhere that also required reading the space cache in
its entirety used btrfs_cache_block_group() with load_cache_only set,
which waits for the cache to be loaded.  We do not use that here because
we want to start caching the block group even if we aren't using the
free space inode.

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
---
Dave, this can be folded into 

	btrfs: async load free space cache

and it'll be good to go.  I validated it fixed the report, just provided the
changelog to explain what happened.

 fs/btrfs/block-group.c | 2 +-
 fs/btrfs/block-group.h | 2 ++
 fs/btrfs/extent-tree.c | 5 +++++
 3 files changed, 8 insertions(+), 1 deletion(-)

diff --git a/fs/btrfs/block-group.c b/fs/btrfs/block-group.c
index f19fabae4754..35be6dbca5e8 100644
--- a/fs/btrfs/block-group.c
+++ b/fs/btrfs/block-group.c
@@ -435,7 +435,7 @@ static bool space_cache_v1_done(struct btrfs_block_group *cache)
 	return ret;
 }
 
-static void btrfs_wait_space_cache_v1_finished(struct btrfs_block_group *cache,
+void btrfs_wait_space_cache_v1_finished(struct btrfs_block_group *cache,
 				struct btrfs_caching_control *caching_ctl)
 {
 	wait_event(caching_ctl->wait, space_cache_v1_done(cache));
diff --git a/fs/btrfs/block-group.h b/fs/btrfs/block-group.h
index adfd7583a17b..8f74a96074f7 100644
--- a/fs/btrfs/block-group.h
+++ b/fs/btrfs/block-group.h
@@ -268,6 +268,8 @@ void check_system_chunk(struct btrfs_trans_handle *trans, const u64 type);
 u64 btrfs_get_alloc_profile(struct btrfs_fs_info *fs_info, u64 orig_flags);
 void btrfs_put_block_group_cache(struct btrfs_fs_info *info);
 int btrfs_free_block_groups(struct btrfs_fs_info *info);
+void btrfs_wait_space_cache_v1_finished(struct btrfs_block_group *cache,
+				struct btrfs_caching_control *caching_ctl);
 
 static inline u64 btrfs_data_alloc_profile(struct btrfs_fs_info *fs_info)
 {
diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
index d7a68203cda0..5c82cfdb0944 100644
--- a/fs/btrfs/extent-tree.c
+++ b/fs/btrfs/extent-tree.c
@@ -2641,6 +2641,11 @@ static int __exclude_logged_extent(struct btrfs_fs_info *fs_info,
 		BUG_ON(!btrfs_block_group_done(block_group));
 		ret = btrfs_remove_free_space(block_group, start, num_bytes);
 	} else {
+		/*
+		 * We must wait for v1 caching to finish, otherwise we may not
+		 * remove our space.
+		 */
+		btrfs_wait_space_cache_v1_finished(block_group, caching_ctl);
 		mutex_lock(&caching_ctl->mutex);
 
 		if (start >= caching_ctl->progress) {
-- 
2.26.2


^ permalink raw reply related	[flat|nested] 3+ messages in thread

* Re: [PATCH] btrfs: wait on v1 cache for log replay
  2020-11-05 20:36 [PATCH] btrfs: wait on v1 cache for log replay Josef Bacik
@ 2020-11-05 20:43 ` Filipe Manana
  2020-11-05 21:32 ` David Sterba
  1 sibling, 0 replies; 3+ messages in thread
From: Filipe Manana @ 2020-11-05 20:43 UTC (permalink / raw)
  To: Josef Bacik; +Cc: linux-btrfs, kernel-team

On Thu, Nov 5, 2020 at 8:37 PM Josef Bacik <josef@toxicpanda.com> wrote:
>
> Filipe reported btrfs/159 and btrfs/201 failures with the latest
> misc-next, and bisected it to my change to always async the loading of
> the v1 cache.  This is because when replaying the log we expect that the
> free space cache will be read entirely before we start excluding space
> to replay.  However this obviously changed, and thus we ended up
> overwriting things that were allocated during replay.  Fix this by
> exporting the helper to wait on v1 space cache and use it for this
> exclusion step.  I've audited everywhere else and we are OK with all
> other callers.  Anywhere that also required reading the space cache in
> its entirety used btrfs_cache_block_group() with load_cache_only set,
> which waits for the cache to be loaded.  We do not use that here because
> we want to start caching the block group even if we aren't using the
> free space inode.
>
> Signed-off-by: Josef Bacik <josef@toxicpanda.com>

It works and makes sense.

Reviewed-by: Filipe Manana <fdmanana@suse.com>

Thanks!

> ---
> Dave, this can be folded into
>
>         btrfs: async load free space cache
>
> and it'll be good to go.  I validated it fixed the report, just provided the
> changelog to explain what happened.
>
>  fs/btrfs/block-group.c | 2 +-
>  fs/btrfs/block-group.h | 2 ++
>  fs/btrfs/extent-tree.c | 5 +++++
>  3 files changed, 8 insertions(+), 1 deletion(-)
>
> diff --git a/fs/btrfs/block-group.c b/fs/btrfs/block-group.c
> index f19fabae4754..35be6dbca5e8 100644
> --- a/fs/btrfs/block-group.c
> +++ b/fs/btrfs/block-group.c
> @@ -435,7 +435,7 @@ static bool space_cache_v1_done(struct btrfs_block_group *cache)
>         return ret;
>  }
>
> -static void btrfs_wait_space_cache_v1_finished(struct btrfs_block_group *cache,
> +void btrfs_wait_space_cache_v1_finished(struct btrfs_block_group *cache,
>                                 struct btrfs_caching_control *caching_ctl)
>  {
>         wait_event(caching_ctl->wait, space_cache_v1_done(cache));
> diff --git a/fs/btrfs/block-group.h b/fs/btrfs/block-group.h
> index adfd7583a17b..8f74a96074f7 100644
> --- a/fs/btrfs/block-group.h
> +++ b/fs/btrfs/block-group.h
> @@ -268,6 +268,8 @@ void check_system_chunk(struct btrfs_trans_handle *trans, const u64 type);
>  u64 btrfs_get_alloc_profile(struct btrfs_fs_info *fs_info, u64 orig_flags);
>  void btrfs_put_block_group_cache(struct btrfs_fs_info *info);
>  int btrfs_free_block_groups(struct btrfs_fs_info *info);
> +void btrfs_wait_space_cache_v1_finished(struct btrfs_block_group *cache,
> +                               struct btrfs_caching_control *caching_ctl);
>
>  static inline u64 btrfs_data_alloc_profile(struct btrfs_fs_info *fs_info)
>  {
> diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
> index d7a68203cda0..5c82cfdb0944 100644
> --- a/fs/btrfs/extent-tree.c
> +++ b/fs/btrfs/extent-tree.c
> @@ -2641,6 +2641,11 @@ static int __exclude_logged_extent(struct btrfs_fs_info *fs_info,
>                 BUG_ON(!btrfs_block_group_done(block_group));
>                 ret = btrfs_remove_free_space(block_group, start, num_bytes);
>         } else {
> +               /*
> +                * We must wait for v1 caching to finish, otherwise we may not
> +                * remove our space.
> +                */
> +               btrfs_wait_space_cache_v1_finished(block_group, caching_ctl);
>                 mutex_lock(&caching_ctl->mutex);
>
>                 if (start >= caching_ctl->progress) {
> --
> 2.26.2
>


-- 
Filipe David Manana,

“Whether you think you can, or you think you can't — you're right.”

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [PATCH] btrfs: wait on v1 cache for log replay
  2020-11-05 20:36 [PATCH] btrfs: wait on v1 cache for log replay Josef Bacik
  2020-11-05 20:43 ` Filipe Manana
@ 2020-11-05 21:32 ` David Sterba
  1 sibling, 0 replies; 3+ messages in thread
From: David Sterba @ 2020-11-05 21:32 UTC (permalink / raw)
  To: Josef Bacik; +Cc: linux-btrfs, kernel-team

On Thu, Nov 05, 2020 at 03:36:26PM -0500, Josef Bacik wrote:
> Filipe reported btrfs/159 and btrfs/201 failures with the latest
> misc-next, and bisected it to my change to always async the loading of
> the v1 cache.  This is because when replaying the log we expect that the
> free space cache will be read entirely before we start excluding space
> to replay.  However this obviously changed, and thus we ended up
> overwriting things that were allocated during replay.  Fix this by
> exporting the helper to wait on v1 space cache and use it for this
> exclusion step.  I've audited everywhere else and we are OK with all
> other callers.  Anywhere that also required reading the space cache in
> its entirety used btrfs_cache_block_group() with load_cache_only set,
> which waits for the cache to be loaded.  We do not use that here because
> we want to start caching the block group even if we aren't using the
> free space inode.
> 
> Signed-off-by: Josef Bacik <josef@toxicpanda.com>
> ---
> Dave, this can be folded into 
> 
> 	btrfs: async load free space cache
> 
> and it'll be good to go.  I validated it fixed the report, just provided the
> changelog to explain what happened.

Folded to the patch, thanks. I've updated the change with the gist of
the changelog regarding log replay.

^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2020-11-05 21:34 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-11-05 20:36 [PATCH] btrfs: wait on v1 cache for log replay Josef Bacik
2020-11-05 20:43 ` Filipe Manana
2020-11-05 21:32 ` David Sterba

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.