All of lore.kernel.org
 help / color / mirror / Atom feed
* (no subject)
@ 2020-10-27 21:07 Boris Burkov
  2020-10-27 21:07 ` [PATCH v5 01/10] btrfs: lift rw mount setup from mount and remount Boris Burkov
                   ` (9 more replies)
  0 siblings, 10 replies; 16+ messages in thread
From: Boris Burkov @ 2020-10-27 21:07 UTC (permalink / raw)
  To: linux-btrfs, kernel-team; +Cc: Boris Burkov

Subject: [PATCH v5 00/10] btrfs: free space tree mounting fixes

This patch set cleans up issues surrounding enabling and disabling various
free space cache features and their reporting in /proc/mounts.  Because the
improvements became somewhat complex, the series starts by lifting rw mount
logic into a single place.

The first patch is a setup patch that unifies very similar logic between a
normal rw mount and a ro->rw remount. This is a useful setup step for adding
more functionality to ro->rw remounts.

The second patch fixes the omission of orphan inode cleanup on a few trees
during ro->rw remount.

The third patch adds enabling the free space tree to ro->rw remount.

The fourth patch adds a method for clearing oneshot mount options after mount.

The fifth patch adds support for clearing the free space tree on ro->rw remount.

The sixth patch sets up for more accurate /proc/mounts by ensuring that
cache_generation > 0 iff space_cache is enabled.

The seventh patch is the more accurate /proc/mounts logic.

The eighth patch is a convenience kernel message that complains when we skip
changing the free space tree on remount.

The ninth patch removes the space cache v1 free space item and free space
inodes when space cache v1 is disabled (nospace_cache or space_cache=v2).

The tenth patch stops re-creating the free space objects when we are not
using space_cache=v1

changes for v5:
Patch 1/10: No change.
Patch 2/10: No change.
Patch 3/10: No change.
Patch 4/10: New.
Patch 5/10: New.
Patch 6/10: Handles remounts for no<->v1. Was Patch 4.
Patch 7/10: No change. Was Patch 5
Patch 8/10: Warns/undoes flags on skipped clear as well as skipped create.
            Was Patch 6.
Patch 9/10: No change. Was Patch 7.
Patch 10/10: No change. Was Patch 8.

changes for v4:
Patch 1/8: New
Patch 2/8: New
Patch 3/8: (was Patch 1) Made much simpler by lifting of Patch 1. Simply add
           free space tree creation to lifted rw mount logic.
Patch 4/8: Split out from old Patch 2. Add an fs_info flag to avoid surprises
           when a different transaction updates the cache_generation.
Patch 5/8: Split out from old Patch 2, but no change logically.
Patch 6/8: Split out from old Patch 1. Formatting fixes.
Patch 7/8: (was Patch 3) Rely on delayed_iput running after orphan cleanup
           (setup in patch 2) to pull iputs out of mount path, per review.
Patch 8/8: No change.

changes for v3:
Patch 1/4: Change failure to warning logging.
Patch 2/4: New; fixes mount option printing.
Patch 3/4: Fix orphan inode vs. delayed iput bug, change remove function
           to take inode as a sink.
Patch 4/4: No changes.

changes for v2:
Patch 1/3: made remount _only_ work in ro->rw case, added comment.
Patch 2/3: added btrfs_ prefix to non-static function, removed bad
           whitespace.

Boris Burkov (10):
  btrfs: lift rw mount setup from mount and remount
  btrfs: cleanup all orphan inodes on ro->rw remount
  btrfs: create free space tree on ro->rw remount
  btrfs: clear oneshot options on mount and remount
  btrfs: clear free space tree on ro->rw remount
  btrfs: keep sb cache_generation consistent with space_cache
  btrfs: use sb state to print space_cache mount option
  btrfs: warn when remount will not change the free space tree
  btrfs: remove free space items when disabling space cache v1
  btrfs: skip space_cache v1 setup when not using it

 fs/btrfs/block-group.c      |  42 +-------
 fs/btrfs/ctree.h            |   3 +
 fs/btrfs/disk-io.c          | 203 ++++++++++++++++++++----------------
 fs/btrfs/disk-io.h          |   2 +
 fs/btrfs/free-space-cache.c | 121 +++++++++++++++++++++
 fs/btrfs/free-space-cache.h |   6 ++
 fs/btrfs/super.c            |  70 ++++++-------
 fs/btrfs/transaction.c      |   2 +
 8 files changed, 287 insertions(+), 162 deletions(-)

-- 
2.24.1


^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCH v5 01/10] btrfs: lift rw mount setup from mount and remount
  2020-10-27 21:07 Boris Burkov
@ 2020-10-27 21:07 ` Boris Burkov
  2020-10-28 14:08   ` Josef Bacik
  2020-10-27 21:07 ` [PATCH v5 02/10] btrfs: cleanup all orphan inodes on ro->rw remount Boris Burkov
                   ` (8 subsequent siblings)
  9 siblings, 1 reply; 16+ messages in thread
From: Boris Burkov @ 2020-10-27 21:07 UTC (permalink / raw)
  To: linux-btrfs, kernel-team; +Cc: Boris Burkov

Mounting rw and remounting from ro to rw naturally share invariants and
functionality which result in a correctly setup rw filesystem. Luckily,
there is even a strong unity in the code which implements them. In
mount's open_ctree, these operations mostly happen after an early return
for ro file systems, and in remount, they happen in a section devoted to
remounting ro->rw, after some remount specific validation passes.

However, there are unfortunately a few differences. There are small
deviations in the order of some of the operations, remount does not
cleanup orphan inodes in root_tree or fs_tree, remount does not create
the free space tree, and remount does not handle "one-shot" mount
options like clear_cache and uuid tree rescan.

Since we want to add building the free space tree to remount, and since
it is possible to leak orphans on a filesystem mounted as ro then
remounted rw (common for the root filesystem when booting), we would
benefit from unifying the logic between the two codepaths.

This patch only lifts the existing common functionality, and leaves a
natural path for fixing the discrepancies.

Signed-off-by: Boris Burkov <boris@bur.io>
---
 fs/btrfs/disk-io.c | 93 ++++++++++++++++++++++++++--------------------
 fs/btrfs/disk-io.h |  1 +
 fs/btrfs/super.c   | 37 +++---------------
 3 files changed, 60 insertions(+), 71 deletions(-)

diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index 3d39f5d47ad3..bff7a3a7be18 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -2814,6 +2814,53 @@ static int btrfs_check_uuid_tree(struct btrfs_fs_info *fs_info)
 	return 0;
 }
 
+/*
+ * Mounting logic specific to read-write file systems. Shared by open_ctree
+ * and btrfs_remount when remounting from read-only to read-write.
+ */
+int btrfs_mount_rw(struct btrfs_fs_info *fs_info)
+{
+	int ret;
+
+	ret = btrfs_cleanup_fs_roots(fs_info);
+	if (ret)
+		goto out;
+
+	mutex_lock(&fs_info->cleaner_mutex);
+	ret = btrfs_recover_relocation(fs_info->tree_root);
+	mutex_unlock(&fs_info->cleaner_mutex);
+	if (ret < 0) {
+		btrfs_warn(fs_info, "failed to recover relocation: %d", ret);
+		goto out;
+	}
+
+	ret = btrfs_resume_balance_async(fs_info);
+	if (ret)
+		goto out;
+
+	ret = btrfs_resume_dev_replace_async(fs_info);
+	if (ret) {
+		btrfs_warn(fs_info, "failed to resume dev_replace");
+		goto out;
+	}
+
+	btrfs_qgroup_rescan_resume(fs_info);
+
+	if (!fs_info->uuid_root) {
+		btrfs_info(fs_info, "creating UUID tree");
+		ret = btrfs_create_uuid_tree(fs_info);
+		if (ret) {
+			btrfs_warn(fs_info,
+				   "failed to create the UUID tree %d",
+				   ret);
+			goto out;
+		}
+	}
+
+out:
+	return ret;
+}
+
 int __cold open_ctree(struct super_block *sb, struct btrfs_fs_devices *fs_devices,
 		      char *options)
 {
@@ -3218,22 +3265,6 @@ int __cold open_ctree(struct super_block *sb, struct btrfs_fs_devices *fs_device
 	if (ret)
 		goto fail_qgroup;
 
-	if (!sb_rdonly(sb)) {
-		ret = btrfs_cleanup_fs_roots(fs_info);
-		if (ret)
-			goto fail_qgroup;
-
-		mutex_lock(&fs_info->cleaner_mutex);
-		ret = btrfs_recover_relocation(tree_root);
-		mutex_unlock(&fs_info->cleaner_mutex);
-		if (ret < 0) {
-			btrfs_warn(fs_info, "failed to recover relocation: %d",
-					ret);
-			err = -EINVAL;
-			goto fail_qgroup;
-		}
-	}
-
 	fs_info->fs_root = btrfs_get_fs_root(fs_info, BTRFS_FS_TREE_OBJECTID, true);
 	if (IS_ERR(fs_info->fs_root)) {
 		err = PTR_ERR(fs_info->fs_root);
@@ -3286,35 +3317,17 @@ int __cold open_ctree(struct super_block *sb, struct btrfs_fs_devices *fs_device
 	}
 	up_read(&fs_info->cleanup_work_sem);
 
-	ret = btrfs_resume_balance_async(fs_info);
-	if (ret) {
-		btrfs_warn(fs_info, "failed to resume balance: %d", ret);
-		close_ctree(fs_info);
-		return ret;
-	}
-
-	ret = btrfs_resume_dev_replace_async(fs_info);
+	btrfs_discard_resume(fs_info);
+	ret = btrfs_mount_rw(fs_info);
 	if (ret) {
-		btrfs_warn(fs_info, "failed to resume device replace: %d", ret);
 		close_ctree(fs_info);
 		return ret;
 	}
 
-	btrfs_qgroup_rescan_resume(fs_info);
-	btrfs_discard_resume(fs_info);
-
-	if (!fs_info->uuid_root) {
-		btrfs_info(fs_info, "creating UUID tree");
-		ret = btrfs_create_uuid_tree(fs_info);
-		if (ret) {
-			btrfs_warn(fs_info,
-				"failed to create the UUID tree: %d", ret);
-			close_ctree(fs_info);
-			return ret;
-		}
-	} else if (btrfs_test_opt(fs_info, RESCAN_UUID_TREE) ||
-		   fs_info->generation !=
-				btrfs_super_uuid_tree_generation(disk_super)) {
+	if (fs_info->uuid_root &&
+	    (btrfs_test_opt(fs_info, RESCAN_UUID_TREE) ||
+	     fs_info->generation !=
+			btrfs_super_uuid_tree_generation(disk_super))) {
 		btrfs_info(fs_info, "checking UUID tree");
 		ret = btrfs_check_uuid_tree(fs_info);
 		if (ret) {
diff --git a/fs/btrfs/disk-io.h b/fs/btrfs/disk-io.h
index fee69ced58b4..b3ee9c19be0c 100644
--- a/fs/btrfs/disk-io.h
+++ b/fs/btrfs/disk-io.h
@@ -50,6 +50,7 @@ struct extent_buffer *btrfs_find_create_tree_block(
 						struct btrfs_fs_info *fs_info,
 						u64 bytenr);
 void btrfs_clean_tree_block(struct extent_buffer *buf);
+int btrfs_mount_rw(struct btrfs_fs_info *fs_info);
 int __cold open_ctree(struct super_block *sb,
 	       struct btrfs_fs_devices *fs_devices,
 	       char *options);
diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c
index 8840a4fa81eb..56bfa23bc52f 100644
--- a/fs/btrfs/super.c
+++ b/fs/btrfs/super.c
@@ -1831,7 +1831,6 @@ static inline void btrfs_remount_cleanup(struct btrfs_fs_info *fs_info,
 static int btrfs_remount(struct super_block *sb, int *flags, char *data)
 {
 	struct btrfs_fs_info *fs_info = btrfs_sb(sb);
-	struct btrfs_root *root = fs_info->tree_root;
 	unsigned old_flags = sb->s_flags;
 	unsigned long old_opts = fs_info->mount_opt;
 	unsigned long old_compress_type = fs_info->compress_type;
@@ -1924,39 +1923,15 @@ static int btrfs_remount(struct super_block *sb, int *flags, char *data)
 			goto restore;
 		}
 
-		ret = btrfs_cleanup_fs_roots(fs_info);
-		if (ret)
-			goto restore;
-
-		/* recover relocation */
-		mutex_lock(&fs_info->cleaner_mutex);
-		ret = btrfs_recover_relocation(root);
-		mutex_unlock(&fs_info->cleaner_mutex);
-		if (ret)
-			goto restore;
-
-		ret = btrfs_resume_balance_async(fs_info);
+		/*
+		 * NOTE: when remounting with a change that does writes, don't
+		 * put it anywhere above this point, as we are not sure to be
+		 * safe to write until we pass the above checks.
+		 */
+		ret = btrfs_mount_rw(fs_info);
 		if (ret)
 			goto restore;
 
-		ret = btrfs_resume_dev_replace_async(fs_info);
-		if (ret) {
-			btrfs_warn(fs_info, "failed to resume dev_replace");
-			goto restore;
-		}
-
-		btrfs_qgroup_rescan_resume(fs_info);
-
-		if (!fs_info->uuid_root) {
-			btrfs_info(fs_info, "creating UUID tree");
-			ret = btrfs_create_uuid_tree(fs_info);
-			if (ret) {
-				btrfs_warn(fs_info,
-					   "failed to create the UUID tree %d",
-					   ret);
-				goto restore;
-			}
-		}
 		sb->s_flags &= ~SB_RDONLY;
 
 		set_bit(BTRFS_FS_OPEN, &fs_info->flags);
-- 
2.24.1


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH v5 02/10] btrfs: cleanup all orphan inodes on ro->rw remount
  2020-10-27 21:07 Boris Burkov
  2020-10-27 21:07 ` [PATCH v5 01/10] btrfs: lift rw mount setup from mount and remount Boris Burkov
@ 2020-10-27 21:07 ` Boris Burkov
  2020-10-28 13:52   ` Josef Bacik
  2020-10-27 21:07 ` [PATCH v5 03/10] btrfs: create free space tree " Boris Burkov
                   ` (7 subsequent siblings)
  9 siblings, 1 reply; 16+ messages in thread
From: Boris Burkov @ 2020-10-27 21:07 UTC (permalink / raw)
  To: linux-btrfs, kernel-team; +Cc: Boris Burkov

When we mount a rw file system, we clean the orphan inodes in the
filesystem trees, and also on the tree_root and fs_root. However, when
we remount a ro file system rw, we only clean the former. Move the calls
to btrfs_orphan_cleanup() on tree_root and fs_root to the shared rw
mount routine to effectively add them on ro->rw remount.

Signed-off-by: Boris Burkov <boris@bur.io>
---
 fs/btrfs/disk-io.c | 17 ++++++++---------
 1 file changed, 8 insertions(+), 9 deletions(-)

diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index bff7a3a7be18..95b9cc5db397 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -2826,6 +2826,14 @@ int btrfs_mount_rw(struct btrfs_fs_info *fs_info)
 	if (ret)
 		goto out;
 
+	down_read(&fs_info->cleanup_work_sem);
+	if ((ret = btrfs_orphan_cleanup(fs_info->fs_root)) ||
+	    (ret = btrfs_orphan_cleanup(fs_info->tree_root))) {
+		up_read(&fs_info->cleanup_work_sem);
+		goto out;
+	}
+	up_read(&fs_info->cleanup_work_sem);
+
 	mutex_lock(&fs_info->cleaner_mutex);
 	ret = btrfs_recover_relocation(fs_info->tree_root);
 	mutex_unlock(&fs_info->cleaner_mutex);
@@ -3308,15 +3316,6 @@ int __cold open_ctree(struct super_block *sb, struct btrfs_fs_devices *fs_device
 		}
 	}
 
-	down_read(&fs_info->cleanup_work_sem);
-	if ((ret = btrfs_orphan_cleanup(fs_info->fs_root)) ||
-	    (ret = btrfs_orphan_cleanup(fs_info->tree_root))) {
-		up_read(&fs_info->cleanup_work_sem);
-		close_ctree(fs_info);
-		return ret;
-	}
-	up_read(&fs_info->cleanup_work_sem);
-
 	btrfs_discard_resume(fs_info);
 	ret = btrfs_mount_rw(fs_info);
 	if (ret) {
-- 
2.24.1


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH v5 03/10] btrfs: create free space tree on ro->rw remount
  2020-10-27 21:07 Boris Burkov
  2020-10-27 21:07 ` [PATCH v5 01/10] btrfs: lift rw mount setup from mount and remount Boris Burkov
  2020-10-27 21:07 ` [PATCH v5 02/10] btrfs: cleanup all orphan inodes on ro->rw remount Boris Burkov
@ 2020-10-27 21:07 ` Boris Burkov
  2020-10-27 21:07 ` [PATCH v5 04/10] btrfs: clear oneshot options on mount and remount Boris Burkov
                   ` (6 subsequent siblings)
  9 siblings, 0 replies; 16+ messages in thread
From: Boris Burkov @ 2020-10-27 21:07 UTC (permalink / raw)
  To: linux-btrfs, kernel-team; +Cc: Boris Burkov

When a user attempts to remount a btrfs filesystem with
'mount -o remount,space_cache=v2', that operation silently succeeds.
Unfortunately, this is misleading, because the remount does not create
the free space tree. /proc/mounts will incorrectly show space_cache=v2,
but on the next mount, the file system will revert to the old
space_cache.

For now, we handle only the easier case, where the existing mount is
read-only and the new mount is read-write. In that case, we can create
the free space tree without contending with the block groups changing
as we go.

Signed-off-by: Boris Burkov <boris@bur.io>
---
 fs/btrfs/disk-io.c | 23 +++++++++++------------
 1 file changed, 11 insertions(+), 12 deletions(-)

diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index 95b9cc5db397..5fe0a2640c8a 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -2842,6 +2842,17 @@ int btrfs_mount_rw(struct btrfs_fs_info *fs_info)
 		goto out;
 	}
 
+	if (btrfs_test_opt(fs_info, FREE_SPACE_TREE) &&
+	    !btrfs_fs_compat_ro(fs_info, FREE_SPACE_TREE)) {
+		btrfs_info(fs_info, "creating free space tree");
+		ret = btrfs_create_free_space_tree(fs_info);
+		if (ret) {
+			btrfs_warn(fs_info,
+				"failed to create free space tree: %d", ret);
+			goto out;
+		}
+	}
+
 	ret = btrfs_resume_balance_async(fs_info);
 	if (ret)
 		goto out;
@@ -3304,18 +3315,6 @@ int __cold open_ctree(struct super_block *sb, struct btrfs_fs_devices *fs_device
 		}
 	}
 
-	if (btrfs_test_opt(fs_info, FREE_SPACE_TREE) &&
-	    !btrfs_fs_compat_ro(fs_info, FREE_SPACE_TREE)) {
-		btrfs_info(fs_info, "creating free space tree");
-		ret = btrfs_create_free_space_tree(fs_info);
-		if (ret) {
-			btrfs_warn(fs_info,
-				"failed to create free space tree: %d", ret);
-			close_ctree(fs_info);
-			return ret;
-		}
-	}
-
 	btrfs_discard_resume(fs_info);
 	ret = btrfs_mount_rw(fs_info);
 	if (ret) {
-- 
2.24.1


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH v5 04/10] btrfs: clear oneshot options on mount and remount
  2020-10-27 21:07 Boris Burkov
                   ` (2 preceding siblings ...)
  2020-10-27 21:07 ` [PATCH v5 03/10] btrfs: create free space tree " Boris Burkov
@ 2020-10-27 21:07 ` Boris Burkov
  2020-10-28 13:58   ` Josef Bacik
  2020-10-27 21:07 ` [PATCH v5 05/10] btrfs: clear free space tree on ro->rw remount Boris Burkov
                   ` (5 subsequent siblings)
  9 siblings, 1 reply; 16+ messages in thread
From: Boris Burkov @ 2020-10-27 21:07 UTC (permalink / raw)
  To: linux-btrfs, kernel-team; +Cc: Boris Burkov

Some options only apply during mount time and are cleared at the end
of mount. For now, the example is USEBACKUPROOT, but CLEAR_CACHE also
fits the bill, and this is a preparation patch for also clearing that
option.

One subtlety is that the current code only resets USEBACKUPROOT on rw
mounts, but the option is meaningfully "consumed" by a ro mount, so it
feels appropriate to clear in that case as well. A subsequent read-write
remount would not go through open_ctree, which is the only place that
checks the option, so the change should be benign.

Signed-off-by: Boris Burkov <boris@bur.io>
---
 fs/btrfs/disk-io.c | 20 +++++++++++++-------
 fs/btrfs/disk-io.h |  1 +
 fs/btrfs/super.c   |  1 +
 3 files changed, 15 insertions(+), 7 deletions(-)

diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index 5fe0a2640c8a..987e40e12bb4 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -2814,6 +2814,16 @@ static int btrfs_check_uuid_tree(struct btrfs_fs_info *fs_info)
 	return 0;
 }
 
+/*
+ * Some options only have meaning at mount time and shouldn't persist across
+ * remounts, or be displayed. Clear these at the end of mount and remount
+ * code paths.
+ */
+void btrfs_clear_oneshot_options(struct btrfs_fs_info *fs_info)
+{
+	btrfs_clear_opt(fs_info->mount_opt, USEBACKUPROOT);
+}
+
 /*
  * Mounting logic specific to read-write file systems. Shared by open_ctree
  * and btrfs_remount when remounting from read-only to read-write.
@@ -3293,7 +3303,7 @@ int __cold open_ctree(struct super_block *sb, struct btrfs_fs_devices *fs_device
 	}
 
 	if (sb_rdonly(sb))
-		return 0;
+		goto clear_oneshot;
 
 	if (btrfs_test_opt(fs_info, CLEAR_CACHE) &&
 	    btrfs_fs_compat_ro(fs_info, FREE_SPACE_TREE)) {
@@ -3337,12 +3347,8 @@ int __cold open_ctree(struct super_block *sb, struct btrfs_fs_devices *fs_device
 	}
 	set_bit(BTRFS_FS_OPEN, &fs_info->flags);
 
-	/*
-	 * backuproot only affect mount behavior, and if open_ctree succeeded,
-	 * no need to keep the flag
-	 */
-	btrfs_clear_opt(fs_info->mount_opt, USEBACKUPROOT);
-
+clear_oneshot:
+	btrfs_clear_oneshot_options(fs_info);
 	return 0;
 
 fail_qgroup:
diff --git a/fs/btrfs/disk-io.h b/fs/btrfs/disk-io.h
index b3ee9c19be0c..02bee384f744 100644
--- a/fs/btrfs/disk-io.h
+++ b/fs/btrfs/disk-io.h
@@ -50,6 +50,7 @@ struct extent_buffer *btrfs_find_create_tree_block(
 						struct btrfs_fs_info *fs_info,
 						u64 bytenr);
 void btrfs_clean_tree_block(struct extent_buffer *buf);
+void btrfs_clear_oneshot_options(struct btrfs_fs_info *fs_info);
 int btrfs_mount_rw(struct btrfs_fs_info *fs_info);
 int __cold open_ctree(struct super_block *sb,
 	       struct btrfs_fs_devices *fs_devices,
diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c
index 56bfa23bc52f..85a0cf56cec5 100644
--- a/fs/btrfs/super.c
+++ b/fs/btrfs/super.c
@@ -1945,6 +1945,7 @@ static int btrfs_remount(struct super_block *sb, int *flags, char *data)
 
 	wake_up_process(fs_info->transaction_kthread);
 	btrfs_remount_cleanup(fs_info, old_opts);
+	btrfs_clear_oneshot_options(fs_info);
 	clear_bit(BTRFS_FS_STATE_REMOUNTING, &fs_info->fs_state);
 
 	return 0;
-- 
2.24.1


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH v5 05/10] btrfs: clear free space tree on ro->rw remount
  2020-10-27 21:07 Boris Burkov
                   ` (3 preceding siblings ...)
  2020-10-27 21:07 ` [PATCH v5 04/10] btrfs: clear oneshot options on mount and remount Boris Burkov
@ 2020-10-27 21:07 ` Boris Burkov
  2020-10-27 21:08 ` [PATCH v5 06/10] btrfs: keep sb cache_generation consistent with space_cache Boris Burkov
                   ` (4 subsequent siblings)
  9 siblings, 0 replies; 16+ messages in thread
From: Boris Burkov @ 2020-10-27 21:07 UTC (permalink / raw)
  To: linux-btrfs, kernel-team; +Cc: Boris Burkov

A user might want to revert to v1 or nospace_cache on a root filesystem,
and much like turning on the free space tree, that can only be done
remounting from ro->rw. Support clearing the free space tree on such
mounts by moving it into the shared remount logic.

Since the CLEAR_CACHE option sticks around across remounts, this change
would result in clearing the tree for ever on every remount, which is
not desirable. To fix that, add CLEAR_CACHE to the oneshot options we
clear at mount end, which has the other bonus of not cluttering the
/proc/mounts output with clear_cache.

Signed-off-by: Boris Burkov <boris@bur.io>
---
 fs/btrfs/disk-io.c | 43 ++++++++++++++++++++++---------------------
 1 file changed, 22 insertions(+), 21 deletions(-)

diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index 987e40e12bb4..cf26ef5efb9f 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -2831,6 +2831,28 @@ void btrfs_clear_oneshot_options(struct btrfs_fs_info *fs_info)
 int btrfs_mount_rw(struct btrfs_fs_info *fs_info)
 {
 	int ret;
+	bool cache_opt = btrfs_test_opt(fs_info, SPACE_CACHE);
+	bool clear_free_space_tree = false;
+
+	if (btrfs_test_opt(fs_info, CLEAR_CACHE) &&
+	    btrfs_fs_compat_ro(fs_info, FREE_SPACE_TREE)) {
+		clear_free_space_tree = true;
+	} else if (btrfs_fs_compat_ro(fs_info, FREE_SPACE_TREE) &&
+		   !btrfs_fs_compat_ro(fs_info, FREE_SPACE_TREE_VALID)) {
+		btrfs_warn(fs_info, "free space tree is invalid");
+		clear_free_space_tree = true;
+	}
+
+	if (clear_free_space_tree) {
+		btrfs_info(fs_info, "clearing free space tree");
+		ret = btrfs_clear_free_space_tree(fs_info);
+		if (ret) {
+			btrfs_warn(fs_info,
+				   "failed to clear free space tree: %d", ret);
+			close_ctree(fs_info);
+			return ret;
+		}
+	}
 
 	ret = btrfs_cleanup_fs_roots(fs_info);
 	if (ret)
@@ -2905,7 +2927,6 @@ int __cold open_ctree(struct super_block *sb, struct btrfs_fs_devices *fs_device
 	struct btrfs_root *chunk_root;
 	int ret;
 	int err = -EINVAL;
-	int clear_free_space_tree = 0;
 	int level;
 
 	ret = init_mount_fs_info(fs_info, sb);
@@ -3305,26 +3326,6 @@ int __cold open_ctree(struct super_block *sb, struct btrfs_fs_devices *fs_device
 	if (sb_rdonly(sb))
 		goto clear_oneshot;
 
-	if (btrfs_test_opt(fs_info, CLEAR_CACHE) &&
-	    btrfs_fs_compat_ro(fs_info, FREE_SPACE_TREE)) {
-		clear_free_space_tree = 1;
-	} else if (btrfs_fs_compat_ro(fs_info, FREE_SPACE_TREE) &&
-		   !btrfs_fs_compat_ro(fs_info, FREE_SPACE_TREE_VALID)) {
-		btrfs_warn(fs_info, "free space tree is invalid");
-		clear_free_space_tree = 1;
-	}
-
-	if (clear_free_space_tree) {
-		btrfs_info(fs_info, "clearing free space tree");
-		ret = btrfs_clear_free_space_tree(fs_info);
-		if (ret) {
-			btrfs_warn(fs_info,
-				   "failed to clear free space tree: %d", ret);
-			close_ctree(fs_info);
-			return ret;
-		}
-	}
-
 	btrfs_discard_resume(fs_info);
 	ret = btrfs_mount_rw(fs_info);
 	if (ret) {
-- 
2.24.1


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH v5 06/10] btrfs: keep sb cache_generation consistent with space_cache
  2020-10-27 21:07 Boris Burkov
                   ` (4 preceding siblings ...)
  2020-10-27 21:07 ` [PATCH v5 05/10] btrfs: clear free space tree on ro->rw remount Boris Burkov
@ 2020-10-27 21:08 ` Boris Burkov
  2020-10-27 21:08 ` [PATCH v5 07/10] btrfs: use sb state to print space_cache mount option Boris Burkov
                   ` (3 subsequent siblings)
  9 siblings, 0 replies; 16+ messages in thread
From: Boris Burkov @ 2020-10-27 21:08 UTC (permalink / raw)
  To: linux-btrfs, kernel-team; +Cc: Boris Burkov

When mounting, btrfs uses the cache_generation in the super block to
determine if space cache v1 is in use. However, by mounting with
nospace_cache or space_cache=v2, it is possible to disable space cache
v1, which does not result in un-setting cache_generation back to 0.

In order to base some logic, like mount option printing in /proc/mounts,
on the current state of the space cache rather than just the values of
the mount option, keep the value of cache_generation consistent with the
status of space cache v1.

We ensure that cache_generation > 0 iff the file system is using
space_cache v1. This requires committing a transaction on any mount
which changes whether we are using v1. (v1->nospace_cache, v1->v2,
nospace_cache->v1, v2->v1).

Since the mechanism for writing out the cache generation is transaction
commit, but we want some finer grained control over when we un-set it,
we can't just rely on the SPACE_CACHE mount option, and introduce an
fs_info flag that mount can use when it wants to unset the generation.

Signed-off-by: Boris Burkov <boris@bur.io>
---
 fs/btrfs/ctree.h            |  3 +++
 fs/btrfs/disk-io.c          |  7 +++++++
 fs/btrfs/free-space-cache.c | 28 ++++++++++++++++++++++++++++
 fs/btrfs/free-space-cache.h |  3 +++
 fs/btrfs/super.c            | 10 +++++++---
 fs/btrfs/transaction.c      |  2 ++
 6 files changed, 50 insertions(+), 3 deletions(-)

diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
index aac3d6f4e35b..0ee85bf82cab 100644
--- a/fs/btrfs/ctree.h
+++ b/fs/btrfs/ctree.h
@@ -559,6 +559,9 @@ enum {
 
 	/* Indicate that the discard workqueue can service discards. */
 	BTRFS_FS_DISCARD_RUNNING,
+
+	/* Indicate that we need to cleanup space cache v1 */
+	BTRFS_FS_CLEANUP_SPACE_CACHE_V1,
 };
 
 /*
diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index cf26ef5efb9f..e4e4982e1817 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -2885,6 +2885,12 @@ int btrfs_mount_rw(struct btrfs_fs_info *fs_info)
 		}
 	}
 
+	if (cache_opt != btrfs_free_space_cache_v1_active(fs_info)) {
+		ret = btrfs_set_free_space_cache_v1_active(fs_info, cache_opt);
+		if (ret)
+			goto out;
+	}
+
 	ret = btrfs_resume_balance_async(fs_info);
 	if (ret)
 		goto out;
@@ -3346,6 +3352,7 @@ int __cold open_ctree(struct super_block *sb, struct btrfs_fs_devices *fs_device
 			return ret;
 		}
 	}
+
 	set_bit(BTRFS_FS_OPEN, &fs_info->flags);
 
 clear_oneshot:
diff --git a/fs/btrfs/free-space-cache.c b/fs/btrfs/free-space-cache.c
index af0013d3df63..3acf935536ea 100644
--- a/fs/btrfs/free-space-cache.c
+++ b/fs/btrfs/free-space-cache.c
@@ -3992,6 +3992,34 @@ int btrfs_write_out_ino_cache(struct btrfs_root *root,
 	return ret;
 }
 
+bool btrfs_free_space_cache_v1_active(struct btrfs_fs_info *fs_info)
+{
+	return btrfs_super_cache_generation(fs_info->super_copy);
+}
+
+int btrfs_set_free_space_cache_v1_active(struct btrfs_fs_info *fs_info,
+					 bool active)
+{
+	struct btrfs_trans_handle *trans;
+	int ret;
+
+	/*
+	 * update_super_roots will appropriately set
+	 * fs_info->super_copy->cache_generation based on the SPACE_CACHE
+	 * option, so all we have to do is trigger a transaction commit.
+	 */
+	trans = btrfs_start_transaction(fs_info->tree_root, 0);
+	if (IS_ERR(trans))
+		return PTR_ERR(trans);
+
+	if (!active)
+		set_bit(BTRFS_FS_CLEANUP_SPACE_CACHE_V1, &fs_info->flags);
+
+	ret = btrfs_commit_transaction(trans);
+	clear_bit(BTRFS_FS_CLEANUP_SPACE_CACHE_V1, &fs_info->flags);
+	return ret;
+}
+
 #ifdef CONFIG_BTRFS_FS_RUN_SANITY_TESTS
 /*
  * Use this if you need to make a bitmap or extent entry specifically, it
diff --git a/fs/btrfs/free-space-cache.h b/fs/btrfs/free-space-cache.h
index e3d5e0ad8f8e..5c546898ded9 100644
--- a/fs/btrfs/free-space-cache.h
+++ b/fs/btrfs/free-space-cache.h
@@ -148,6 +148,9 @@ int btrfs_trim_block_group_bitmaps(struct btrfs_block_group *block_group,
 				   u64 *trimmed, u64 start, u64 end, u64 minlen,
 				   u64 maxlen, bool async);
 
+bool btrfs_free_space_cache_v1_active(struct btrfs_fs_info *fs_info);
+int btrfs_set_free_space_cache_v1_active(struct btrfs_fs_info *fs_info,
+					 bool active);
 /* Support functions for running our sanity tests */
 #ifdef CONFIG_BTRFS_FS_RUN_SANITY_TESTS
 int test_add_free_space_entry(struct btrfs_block_group *cache,
diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c
index 85a0cf56cec5..58e54cd3c2c6 100644
--- a/fs/btrfs/super.c
+++ b/fs/btrfs/super.c
@@ -511,7 +511,6 @@ int btrfs_parse_options(struct btrfs_fs_info *info, char *options,
 {
 	substring_t args[MAX_OPT_ARGS];
 	char *p, *num;
-	u64 cache_gen;
 	int intarg;
 	int ret = 0;
 	char *compress_type;
@@ -521,10 +520,9 @@ int btrfs_parse_options(struct btrfs_fs_info *info, char *options,
 	bool saved_compress_force;
 	int no_compress = 0;
 
-	cache_gen = btrfs_super_cache_generation(info->super_copy);
 	if (btrfs_fs_compat_ro(info, FREE_SPACE_TREE))
 		btrfs_set_opt(info->mount_opt, FREE_SPACE_TREE);
-	else if (cache_gen)
+	else if (btrfs_free_space_cache_v1_active(info))
 		btrfs_set_opt(info->mount_opt, SPACE_CACHE);
 
 	/*
@@ -1810,6 +1808,8 @@ static inline void btrfs_remount_begin(struct btrfs_fs_info *fs_info,
 static inline void btrfs_remount_cleanup(struct btrfs_fs_info *fs_info,
 					 unsigned long old_opts)
 {
+	bool cache_opt = btrfs_test_opt(fs_info, SPACE_CACHE);
+
 	/*
 	 * We need to cleanup all defragable inodes if the autodefragment is
 	 * close or the filesystem is read only.
@@ -1826,6 +1826,10 @@ static inline void btrfs_remount_cleanup(struct btrfs_fs_info *fs_info,
 	else if (btrfs_raw_test_opt(old_opts, DISCARD_ASYNC) &&
 		 !btrfs_test_opt(fs_info, DISCARD_ASYNC))
 		btrfs_discard_cleanup(fs_info);
+
+	/* If we toggled space cache */
+	if (cache_opt != btrfs_free_space_cache_v1_active(fs_info))
+		btrfs_set_free_space_cache_v1_active(fs_info, cache_opt);
 }
 
 static int btrfs_remount(struct super_block *sb, int *flags, char *data)
diff --git a/fs/btrfs/transaction.c b/fs/btrfs/transaction.c
index 52ada47aff50..c350fc59784b 100644
--- a/fs/btrfs/transaction.c
+++ b/fs/btrfs/transaction.c
@@ -1761,6 +1761,8 @@ static void update_super_roots(struct btrfs_fs_info *fs_info)
 	super->root_level = root_item->level;
 	if (btrfs_test_opt(fs_info, SPACE_CACHE))
 		super->cache_generation = root_item->generation;
+	else if (test_bit(BTRFS_FS_CLEANUP_SPACE_CACHE_V1, &fs_info->flags))
+		super->cache_generation = 0;
 	if (test_bit(BTRFS_FS_UPDATE_UUID_TREE_GEN, &fs_info->flags))
 		super->uuid_tree_generation = root_item->generation;
 }
-- 
2.24.1


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH v5 07/10] btrfs: use sb state to print space_cache mount option
  2020-10-27 21:07 Boris Burkov
                   ` (5 preceding siblings ...)
  2020-10-27 21:08 ` [PATCH v5 06/10] btrfs: keep sb cache_generation consistent with space_cache Boris Burkov
@ 2020-10-27 21:08 ` Boris Burkov
  2020-10-27 21:08 ` [PATCH v5 08/10] btrfs: warn when remount will not change the free space tree Boris Burkov
                   ` (2 subsequent siblings)
  9 siblings, 0 replies; 16+ messages in thread
From: Boris Burkov @ 2020-10-27 21:08 UTC (permalink / raw)
  To: linux-btrfs, kernel-team; +Cc: Boris Burkov

To make the contents of /proc/mounts better match the actual state of
the file system, base the display of the space cache mount options off
the contents of the super block rather than the last mount options
passed in. Since there are many scenarios where the mount will ignore a
space cache option, simply showing the passed in option is misleading.

For example, if we mount with -o remount,space_cache=v2 on a read-write
file system without an existing free space tree, we won't build a free
space tree, but /proc/mounts will read space_cache=v2 (until we mount
again and it goes away)

cache_generation is set iff space_cache=v1, FREE_SPACE_TREE is set iff
space_cache=v2, and if neither is the case, we print nospace_cache.

Signed-off-by: Boris Burkov <boris@bur.io>
---
 fs/btrfs/super.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c
index 58e54cd3c2c6..6533758f882a 100644
--- a/fs/btrfs/super.c
+++ b/fs/btrfs/super.c
@@ -1427,9 +1427,9 @@ static int btrfs_show_options(struct seq_file *seq, struct dentry *dentry)
 		seq_puts(seq, ",discard=async");
 	if (!(info->sb->s_flags & SB_POSIXACL))
 		seq_puts(seq, ",noacl");
-	if (btrfs_test_opt(info, SPACE_CACHE))
+	if (btrfs_free_space_cache_v1_active(info))
 		seq_puts(seq, ",space_cache");
-	else if (btrfs_test_opt(info, FREE_SPACE_TREE))
+	else if (btrfs_fs_compat_ro(info, FREE_SPACE_TREE))
 		seq_puts(seq, ",space_cache=v2");
 	else
 		seq_puts(seq, ",nospace_cache");
-- 
2.24.1


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH v5 08/10] btrfs: warn when remount will not change the free space tree
  2020-10-27 21:07 Boris Burkov
                   ` (6 preceding siblings ...)
  2020-10-27 21:08 ` [PATCH v5 07/10] btrfs: use sb state to print space_cache mount option Boris Burkov
@ 2020-10-27 21:08 ` Boris Burkov
  2020-10-27 21:08 ` [PATCH v5 09/10] btrfs: remove free space items when disabling space cache v1 Boris Burkov
  2020-10-27 21:08 ` [PATCH v5 10/10] btrfs: skip space_cache v1 setup when not using it Boris Burkov
  9 siblings, 0 replies; 16+ messages in thread
From: Boris Burkov @ 2020-10-27 21:08 UTC (permalink / raw)
  To: linux-btrfs, kernel-team; +Cc: Boris Burkov

If the remount is ro->ro, rw->ro, or rw->rw, we will not create or
clear the free space tree. This can be surprising, so print a warning
to dmesg to make the failure more visible. It is also important to
ensure that the space cache options (SPACE_CACHE, FREE_SPACE_TREE) are
consistent, so ensure those are set to properly match the current on
disk state (which won't be changing).

Signed-off-by: Boris Burkov <boris@bur.io>
---
 fs/btrfs/super.c | 18 ++++++++++++++++++
 1 file changed, 18 insertions(+)

diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c
index 6533758f882a..ed28985a6762 100644
--- a/fs/btrfs/super.c
+++ b/fs/btrfs/super.c
@@ -1865,6 +1865,24 @@ static int btrfs_remount(struct super_block *sb, int *flags, char *data)
 	btrfs_resize_thread_pool(fs_info,
 		fs_info->thread_pool_size, old_thread_pool_size);
 
+	if (btrfs_test_opt(fs_info, FREE_SPACE_TREE) !=
+	    btrfs_fs_compat_ro(fs_info, FREE_SPACE_TREE) &&
+	    ((!sb_rdonly(sb) || *flags & SB_RDONLY))) {
+		btrfs_warn(fs_info,
+	   "remount supports changing free space tree only from ro to rw");
+		/*
+		 * Make sure free space cache options match the state on disk
+		 */
+		if (btrfs_fs_compat_ro(fs_info, FREE_SPACE_TREE)) {
+			btrfs_set_opt(fs_info->mount_opt, FREE_SPACE_TREE);
+			btrfs_clear_opt(fs_info->mount_opt, SPACE_CACHE);
+		}
+		if (btrfs_free_space_cache_v1_active(fs_info)) {
+			btrfs_clear_opt(fs_info->mount_opt, FREE_SPACE_TREE);
+			btrfs_set_opt(fs_info->mount_opt, SPACE_CACHE);
+		}
+	}
+
 	if ((bool)(*flags & SB_RDONLY) == sb_rdonly(sb))
 		goto out;
 
-- 
2.24.1


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH v5 09/10] btrfs: remove free space items when disabling space cache v1
  2020-10-27 21:07 Boris Burkov
                   ` (7 preceding siblings ...)
  2020-10-27 21:08 ` [PATCH v5 08/10] btrfs: warn when remount will not change the free space tree Boris Burkov
@ 2020-10-27 21:08 ` Boris Burkov
  2020-10-29 11:07   ` Wang Yugui
  2020-10-27 21:08 ` [PATCH v5 10/10] btrfs: skip space_cache v1 setup when not using it Boris Burkov
  9 siblings, 1 reply; 16+ messages in thread
From: Boris Burkov @ 2020-10-27 21:08 UTC (permalink / raw)
  To: linux-btrfs, kernel-team; +Cc: Boris Burkov

When the file system transitions from space cache v1 to v2 or to
nospace_cache, it removes the old cached data, but does not remove
the FREE_SPACE items nor the free space inodes they point to. This
doesn't cause any issues besides being a bit inefficient, since these
items no longer do anything useful.

To fix it, when we are mounting, and plan to disable the space cache,
destroy each block group's free space item and free space inode.
The code to remove the items is lifted from the existing use case of
removing the block group, with a light adaptation to handle whether or
not we have already looked up the free space inode.

References: https://github.com/btrfs/btrfs-todo/issues/5
Signed-off-by: Boris Burkov <boris@bur.io>
---
 fs/btrfs/block-group.c      |  39 +-------------
 fs/btrfs/free-space-cache.c | 101 ++++++++++++++++++++++++++++++++++--
 fs/btrfs/free-space-cache.h |   3 ++
 3 files changed, 102 insertions(+), 41 deletions(-)

diff --git a/fs/btrfs/block-group.c b/fs/btrfs/block-group.c
index c0f1d6818df7..8938b11a3339 100644
--- a/fs/btrfs/block-group.c
+++ b/fs/btrfs/block-group.c
@@ -892,8 +892,6 @@ int btrfs_remove_block_group(struct btrfs_trans_handle *trans,
 	struct btrfs_path *path;
 	struct btrfs_block_group *block_group;
 	struct btrfs_free_cluster *cluster;
-	struct btrfs_root *tree_root = fs_info->tree_root;
-	struct btrfs_key key;
 	struct inode *inode;
 	struct kobject *kobj = NULL;
 	int ret;
@@ -971,42 +969,9 @@ int btrfs_remove_block_group(struct btrfs_trans_handle *trans,
 	spin_unlock(&trans->transaction->dirty_bgs_lock);
 	mutex_unlock(&trans->transaction->cache_write_mutex);
 
-	if (!IS_ERR(inode)) {
-		ret = btrfs_orphan_add(trans, BTRFS_I(inode));
-		if (ret) {
-			btrfs_add_delayed_iput(inode);
-			goto out;
-		}
-		clear_nlink(inode);
-		/* One for the block groups ref */
-		spin_lock(&block_group->lock);
-		if (block_group->iref) {
-			block_group->iref = 0;
-			block_group->inode = NULL;
-			spin_unlock(&block_group->lock);
-			iput(inode);
-		} else {
-			spin_unlock(&block_group->lock);
-		}
-		/* One for our lookup ref */
-		btrfs_add_delayed_iput(inode);
-	}
-
-	key.objectid = BTRFS_FREE_SPACE_OBJECTID;
-	key.type = 0;
-	key.offset = block_group->start;
-
-	ret = btrfs_search_slot(trans, tree_root, &key, path, -1, 1);
-	if (ret < 0)
+	ret = btrfs_remove_free_space_inode(trans, inode, block_group);
+	if (ret)
 		goto out;
-	if (ret > 0)
-		btrfs_release_path(path);
-	if (ret == 0) {
-		ret = btrfs_del_item(trans, tree_root, path);
-		if (ret)
-			goto out;
-		btrfs_release_path(path);
-	}
 
 	spin_lock(&fs_info->block_group_cache_lock);
 	rb_erase(&block_group->cache_node,
diff --git a/fs/btrfs/free-space-cache.c b/fs/btrfs/free-space-cache.c
index 3acf935536ea..5d357786c9da 100644
--- a/fs/btrfs/free-space-cache.c
+++ b/fs/btrfs/free-space-cache.c
@@ -207,6 +207,65 @@ int create_free_space_inode(struct btrfs_trans_handle *trans,
 					 ino, block_group->start);
 }
 
+/*
+ * inode is an optional sink: if it is NULL, btrfs_remove_free_space_inode
+ * handles lookup, otherwise it takes ownership and iputs the inode.
+ * Don't reuse an inode pointer after passing it into this function.
+ */
+int btrfs_remove_free_space_inode(struct btrfs_trans_handle *trans,
+				  struct inode *inode,
+				  struct btrfs_block_group *block_group)
+{
+	struct btrfs_path *path;
+	struct btrfs_key key;
+	int ret = 0;
+
+	path = btrfs_alloc_path();
+	if (!path)
+		return -ENOMEM;
+
+	if (!inode)
+		inode = lookup_free_space_inode(block_group, path);
+	if (IS_ERR(inode)) {
+		if (PTR_ERR(inode) != -ENOENT)
+			ret = PTR_ERR(inode);
+		goto out;
+	}
+	ret = btrfs_orphan_add(trans, BTRFS_I(inode));
+	if (ret) {
+		btrfs_add_delayed_iput(inode);
+		goto out;
+	}
+	clear_nlink(inode);
+	/* One for the block groups ref */
+	spin_lock(&block_group->lock);
+	if (block_group->iref) {
+		block_group->iref = 0;
+		block_group->inode = NULL;
+		spin_unlock(&block_group->lock);
+		iput(inode);
+	} else {
+		spin_unlock(&block_group->lock);
+	}
+	/* One for the lookup ref */
+	btrfs_add_delayed_iput(inode);
+
+	key.objectid = BTRFS_FREE_SPACE_OBJECTID;
+	key.type = 0;
+	key.offset = block_group->start;
+	ret = btrfs_search_slot(trans, trans->fs_info->tree_root, &key, path,
+				-1, 1);
+	if (ret) {
+		if (ret > 0)
+			ret = 0;
+		goto out;
+	}
+	ret = btrfs_del_item(trans, trans->fs_info->tree_root, path);
+out:
+	btrfs_free_path(path);
+	return ret;
+}
+
 int btrfs_check_trunc_cache_free_space(struct btrfs_fs_info *fs_info,
 				       struct btrfs_block_rsv *rsv)
 {
@@ -3997,6 +4056,28 @@ bool btrfs_free_space_cache_v1_active(struct btrfs_fs_info *fs_info)
 	return btrfs_super_cache_generation(fs_info->super_copy);
 }
 
+static int cleanup_free_space_cache_v1(struct btrfs_fs_info *fs_info,
+				       struct btrfs_trans_handle *trans)
+{
+	struct btrfs_block_group *block_group;
+	struct rb_node *node;
+	int ret;
+
+	btrfs_info(fs_info, "cleaning free space cache v1");
+
+	node = rb_first(&fs_info->block_group_cache_tree);
+	while (node) {
+		block_group = rb_entry(node, struct btrfs_block_group,
+				       cache_node);
+		ret = btrfs_remove_free_space_inode(trans, NULL, block_group);
+		if (ret)
+			goto out;
+		node = rb_next(node);
+	}
+out:
+	return ret;
+}
+
 int btrfs_set_free_space_cache_v1_active(struct btrfs_fs_info *fs_info,
 					 bool active)
 {
@@ -4004,18 +4085,30 @@ int btrfs_set_free_space_cache_v1_active(struct btrfs_fs_info *fs_info,
 	int ret;
 
 	/*
-	 * update_super_roots will appropriately set
-	 * fs_info->super_copy->cache_generation based on the SPACE_CACHE
-	 * option, so all we have to do is trigger a transaction commit.
+	 * update_super_roots will appropriately set or unset
+	 * fs_info->super_copy->cache_generation based on SPACE_CACHE and
+	 * BTRFS_FS_CLEANUP_SPACE_CACHE_V1. For this reason, we need a
+	 * transaction commit whether we are enabling space cache v1 and don't
+	 * have any other work to do, or are disabling it and removing free
+	 * space inodes.
 	 */
 	trans = btrfs_start_transaction(fs_info->tree_root, 0);
 	if (IS_ERR(trans))
 		return PTR_ERR(trans);
 
-	if (!active)
+	if (!active) {
 		set_bit(BTRFS_FS_CLEANUP_SPACE_CACHE_V1, &fs_info->flags);
+		ret = cleanup_free_space_cache_v1(fs_info, trans);
+		if (ret)
+			goto abort;
+	}
 
 	ret = btrfs_commit_transaction(trans);
+	goto out;
+abort:
+	btrfs_abort_transaction(trans, ret);
+	btrfs_end_transaction(trans);
+out:
 	clear_bit(BTRFS_FS_CLEANUP_SPACE_CACHE_V1, &fs_info->flags);
 	return ret;
 }
diff --git a/fs/btrfs/free-space-cache.h b/fs/btrfs/free-space-cache.h
index 5c546898ded9..8df4b2925eca 100644
--- a/fs/btrfs/free-space-cache.h
+++ b/fs/btrfs/free-space-cache.h
@@ -84,6 +84,9 @@ struct inode *lookup_free_space_inode(struct btrfs_block_group *block_group,
 int create_free_space_inode(struct btrfs_trans_handle *trans,
 			    struct btrfs_block_group *block_group,
 			    struct btrfs_path *path);
+int btrfs_remove_free_space_inode(struct btrfs_trans_handle *trans,
+				  struct inode *inode,
+				  struct btrfs_block_group *block_group);
 
 int btrfs_check_trunc_cache_free_space(struct btrfs_fs_info *fs_info,
 				       struct btrfs_block_rsv *rsv);
-- 
2.24.1


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH v5 10/10] btrfs: skip space_cache v1 setup when not using it
  2020-10-27 21:07 Boris Burkov
                   ` (8 preceding siblings ...)
  2020-10-27 21:08 ` [PATCH v5 09/10] btrfs: remove free space items when disabling space cache v1 Boris Burkov
@ 2020-10-27 21:08 ` Boris Burkov
  9 siblings, 0 replies; 16+ messages in thread
From: Boris Burkov @ 2020-10-27 21:08 UTC (permalink / raw)
  To: linux-btrfs, kernel-team; +Cc: Boris Burkov

If we are not using space cache v1, we should not create the free space
object or free space inodes. This comes up when we delete the existing
free space objects/inodes when migrating to v2, only to see them get
recreated for every dirtied block group.

Signed-off-by: Boris Burkov <boris@bur.io>
---
 fs/btrfs/block-group.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/fs/btrfs/block-group.c b/fs/btrfs/block-group.c
index 8938b11a3339..59a130fdcd5c 100644
--- a/fs/btrfs/block-group.c
+++ b/fs/btrfs/block-group.c
@@ -2325,6 +2325,9 @@ static int cache_save_setup(struct btrfs_block_group *block_group,
 	int retries = 0;
 	int ret = 0;
 
+	if (!btrfs_test_opt(fs_info, SPACE_CACHE))
+		return 0;
+
 	/*
 	 * If this block group is smaller than 100 megs don't bother caching the
 	 * block group.
-- 
2.24.1


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [PATCH v5 02/10] btrfs: cleanup all orphan inodes on ro->rw remount
  2020-10-27 21:07 ` [PATCH v5 02/10] btrfs: cleanup all orphan inodes on ro->rw remount Boris Burkov
@ 2020-10-28 13:52   ` Josef Bacik
  0 siblings, 0 replies; 16+ messages in thread
From: Josef Bacik @ 2020-10-28 13:52 UTC (permalink / raw)
  To: Boris Burkov, linux-btrfs, kernel-team

On 10/27/20 5:07 PM, Boris Burkov wrote:
> When we mount a rw file system, we clean the orphan inodes in the
> filesystem trees, and also on the tree_root and fs_root. However, when
> we remount a ro file system rw, we only clean the former. Move the calls
> to btrfs_orphan_cleanup() on tree_root and fs_root to the shared rw
> mount routine to effectively add them on ro->rw remount.
> 
> Signed-off-by: Boris Burkov <boris@bur.io>
> ---
>   fs/btrfs/disk-io.c | 17 ++++++++---------
>   1 file changed, 8 insertions(+), 9 deletions(-)
> 
> diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
> index bff7a3a7be18..95b9cc5db397 100644
> --- a/fs/btrfs/disk-io.c
> +++ b/fs/btrfs/disk-io.c
> @@ -2826,6 +2826,14 @@ int btrfs_mount_rw(struct btrfs_fs_info *fs_info)
>   	if (ret)
>   		goto out;
>   
> +	down_read(&fs_info->cleanup_work_sem);
> +	if ((ret = btrfs_orphan_cleanup(fs_info->fs_root)) ||
> +	    (ret = btrfs_orphan_cleanup(fs_info->tree_root))) {
> +		up_read(&fs_info->cleanup_work_sem);
> +		goto out;
> +	}
> +	up_read(&fs_info->cleanup_work_sem);

This triggers

ERROR:ASSIGN_IN_IF: do not use assignment in if condition
#10: FILE: fs/btrfs/disk-io.c:2910:
+       if ((ret = btrfs_orphan_cleanup(fs_info->fs_root)) ||

total: 1 errors, 0 warnings, 29 lines checked

I'm not sure I agree with this check, Dave what's your opinion?  Thanks,

Josef

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v5 04/10] btrfs: clear oneshot options on mount and remount
  2020-10-27 21:07 ` [PATCH v5 04/10] btrfs: clear oneshot options on mount and remount Boris Burkov
@ 2020-10-28 13:58   ` Josef Bacik
  0 siblings, 0 replies; 16+ messages in thread
From: Josef Bacik @ 2020-10-28 13:58 UTC (permalink / raw)
  To: Boris Burkov, linux-btrfs, kernel-team

On 10/27/20 5:07 PM, Boris Burkov wrote:
> Some options only apply during mount time and are cleared at the end
> of mount. For now, the example is USEBACKUPROOT, but CLEAR_CACHE also
> fits the bill, and this is a preparation patch for also clearing that
> option.
> 
> One subtlety is that the current code only resets USEBACKUPROOT on rw
> mounts, but the option is meaningfully "consumed" by a ro mount, so it
> feels appropriate to clear in that case as well. A subsequent read-write
> remount would not go through open_ctree, which is the only place that
> checks the option, so the change should be benign.
> 
> Signed-off-by: Boris Burkov <boris@bur.io>

This one fails to apply cleanly on misc-next.  Thanks,

Josef

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v5 01/10] btrfs: lift rw mount setup from mount and remount
  2020-10-27 21:07 ` [PATCH v5 01/10] btrfs: lift rw mount setup from mount and remount Boris Burkov
@ 2020-10-28 14:08   ` Josef Bacik
  0 siblings, 0 replies; 16+ messages in thread
From: Josef Bacik @ 2020-10-28 14:08 UTC (permalink / raw)
  To: Boris Burkov, linux-btrfs, kernel-team

On 10/27/20 5:07 PM, Boris Burkov wrote:
> Mounting rw and remounting from ro to rw naturally share invariants and
> functionality which result in a correctly setup rw filesystem. Luckily,
> there is even a strong unity in the code which implements them. In
> mount's open_ctree, these operations mostly happen after an early return
> for ro file systems, and in remount, they happen in a section devoted to
> remounting ro->rw, after some remount specific validation passes.
> 
> However, there are unfortunately a few differences. There are small
> deviations in the order of some of the operations, remount does not
> cleanup orphan inodes in root_tree or fs_tree, remount does not create
> the free space tree, and remount does not handle "one-shot" mount
> options like clear_cache and uuid tree rescan.
> 
> Since we want to add building the free space tree to remount, and since
> it is possible to leak orphans on a filesystem mounted as ro then
> remounted rw (common for the root filesystem when booting), we would
> benefit from unifying the logic between the two codepaths.
> 
> This patch only lifts the existing common functionality, and leaves a
> natural path for fixing the discrepancies.
> 
> Signed-off-by: Boris Burkov <boris@bur.io>
> ---
>   fs/btrfs/disk-io.c | 93 ++++++++++++++++++++++++++--------------------
>   fs/btrfs/disk-io.h |  1 +
>   fs/btrfs/super.c   | 37 +++---------------
>   3 files changed, 60 insertions(+), 71 deletions(-)
> 
> diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
> index 3d39f5d47ad3..bff7a3a7be18 100644
> --- a/fs/btrfs/disk-io.c
> +++ b/fs/btrfs/disk-io.c
> @@ -2814,6 +2814,53 @@ static int btrfs_check_uuid_tree(struct btrfs_fs_info *fs_info)
>   	return 0;
>   }
>   
> +/*
> + * Mounting logic specific to read-write file systems. Shared by open_ctree
> + * and btrfs_remount when remounting from read-only to read-write.
> + */
> +int btrfs_mount_rw(struct btrfs_fs_info *fs_info)
> +{
> +	int ret;
> +
> +	ret = btrfs_cleanup_fs_roots(fs_info);
> +	if (ret)
> +		goto out;
> +
> +	mutex_lock(&fs_info->cleaner_mutex);
> +	ret = btrfs_recover_relocation(fs_info->tree_root);
> +	mutex_unlock(&fs_info->cleaner_mutex);
> +	if (ret < 0) {
> +		btrfs_warn(fs_info, "failed to recover relocation: %d", ret);
> +		goto out;
> +	}
> +
> +	ret = btrfs_resume_balance_async(fs_info);
> +	if (ret)
> +		goto out;
> +
> +	ret = btrfs_resume_dev_replace_async(fs_info);
> +	if (ret) {
> +		btrfs_warn(fs_info, "failed to resume dev_replace");
> +		goto out;
> +	}
> +
> +	btrfs_qgroup_rescan_resume(fs_info);
> +
> +	if (!fs_info->uuid_root) {
> +		btrfs_info(fs_info, "creating UUID tree");
> +		ret = btrfs_create_uuid_tree(fs_info);
> +		if (ret) {
> +			btrfs_warn(fs_info,
> +				   "failed to create the UUID tree %d",
> +				   ret);
> +			goto out;
> +		}
> +	}
> +
> +out:
> +	return ret;
> +}
> +
>   int __cold open_ctree(struct super_block *sb, struct btrfs_fs_devices *fs_devices,
>   		      char *options)
>   {
> @@ -3218,22 +3265,6 @@ int __cold open_ctree(struct super_block *sb, struct btrfs_fs_devices *fs_device
>   	if (ret)
>   		goto fail_qgroup;
>   
> -	if (!sb_rdonly(sb)) {
> -		ret = btrfs_cleanup_fs_roots(fs_info);
> -		if (ret)
> -			goto fail_qgroup;
> -
> -		mutex_lock(&fs_info->cleaner_mutex);
> -		ret = btrfs_recover_relocation(tree_root);
> -		mutex_unlock(&fs_info->cleaner_mutex);
> -		if (ret < 0) {
> -			btrfs_warn(fs_info, "failed to recover relocation: %d",
> -					ret);
> -			err = -EINVAL;
> -			goto fail_qgroup;
> -		}
> -	}
> -
>   	fs_info->fs_root = btrfs_get_fs_root(fs_info, BTRFS_FS_TREE_OBJECTID, true);
>   	if (IS_ERR(fs_info->fs_root)) {
>   		err = PTR_ERR(fs_info->fs_root);
> @@ -3286,35 +3317,17 @@ int __cold open_ctree(struct super_block *sb, struct btrfs_fs_devices *fs_device
>   	}
>   	up_read(&fs_info->cleanup_work_sem);
>   
> -	ret = btrfs_resume_balance_async(fs_info);
> -	if (ret) {
> -		btrfs_warn(fs_info, "failed to resume balance: %d", ret);
> -		close_ctree(fs_info);
> -		return ret;
> -	}
> -
> -	ret = btrfs_resume_dev_replace_async(fs_info);
> +	btrfs_discard_resume(fs_info);
> +	ret = btrfs_mount_rw(fs_info);

You've swapped the order of discard_resume and the mount_rw stuff, which 
can be problematic because the tree log blocks will be marked as free, 
so we could discard them while replaying the log.  The discard_resume 
needs to be moved.  I'd even argue it needs to not be resumed unless 
we're rw, but I haven't looked at the code that closely.  Thanks,

Josef

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v5 09/10] btrfs: remove free space items when disabling space cache v1
  2020-10-27 21:08 ` [PATCH v5 09/10] btrfs: remove free space items when disabling space cache v1 Boris Burkov
@ 2020-10-29 11:07   ` Wang Yugui
  2020-10-29 12:25     ` Wang Yugui
  0 siblings, 1 reply; 16+ messages in thread
From: Wang Yugui @ 2020-10-29 11:07 UTC (permalink / raw)
  To: Boris Burkov; +Cc: linux-btrfs, kernel-team

Hi,

'mkfs.btrfs -R free-space-tree'  still create 'space cache v1'?

#mkfs.btrfs-5.9 -R free-space-tree /dev/sdb1 -f

1st time mount
#mount /dev/sdb1 /btrfs/
#dmesg
BTRFS info (device sdb1): cleaning free space cache v1
#umount

2nd time mount
#mount /dev/sdb1 /btrfs/
#dmesg
NO BTRFS info (device sdb1): cleaning free space cache v1
#umount

Best Regards
Wang Yugui (wangyugui@e16-tech.com)
2020/10/29

> When the file system transitions from space cache v1 to v2 or to
> nospace_cache, it removes the old cached data, but does not remove
> the FREE_SPACE items nor the free space inodes they point to. This
> doesn't cause any issues besides being a bit inefficient, since these
> items no longer do anything useful.
> 
> To fix it, when we are mounting, and plan to disable the space cache,
> destroy each block group's free space item and free space inode.
> The code to remove the items is lifted from the existing use case of
> removing the block group, with a light adaptation to handle whether or
> not we have already looked up the free space inode.
> 
> References: https://github.com/btrfs/btrfs-todo/issues/5
> Signed-off-by: Boris Burkov <boris@bur.io>
> ---
>  fs/btrfs/block-group.c      |  39 +-------------
>  fs/btrfs/free-space-cache.c | 101 ++++++++++++++++++++++++++++++++++--
>  fs/btrfs/free-space-cache.h |   3 ++
>  3 files changed, 102 insertions(+), 41 deletions(-)
> 
> diff --git a/fs/btrfs/block-group.c b/fs/btrfs/block-group.c
> index c0f1d6818df7..8938b11a3339 100644
> --- a/fs/btrfs/block-group.c
> +++ b/fs/btrfs/block-group.c
> @@ -892,8 +892,6 @@ int btrfs_remove_block_group(struct btrfs_trans_handle *trans,
>  	struct btrfs_path *path;
>  	struct btrfs_block_group *block_group;
>  	struct btrfs_free_cluster *cluster;
> -	struct btrfs_root *tree_root = fs_info->tree_root;
> -	struct btrfs_key key;
>  	struct inode *inode;
>  	struct kobject *kobj = NULL;
>  	int ret;
> @@ -971,42 +969,9 @@ int btrfs_remove_block_group(struct btrfs_trans_handle *trans,
>  	spin_unlock(&trans->transaction->dirty_bgs_lock);
>  	mutex_unlock(&trans->transaction->cache_write_mutex);
>  
> -	if (!IS_ERR(inode)) {
> -		ret = btrfs_orphan_add(trans, BTRFS_I(inode));
> -		if (ret) {
> -			btrfs_add_delayed_iput(inode);
> -			goto out;
> -		}
> -		clear_nlink(inode);
> -		/* One for the block groups ref */
> -		spin_lock(&block_group->lock);
> -		if (block_group->iref) {
> -			block_group->iref = 0;
> -			block_group->inode = NULL;
> -			spin_unlock(&block_group->lock);
> -			iput(inode);
> -		} else {
> -			spin_unlock(&block_group->lock);
> -		}
> -		/* One for our lookup ref */
> -		btrfs_add_delayed_iput(inode);
> -	}
> -
> -	key.objectid = BTRFS_FREE_SPACE_OBJECTID;
> -	key.type = 0;
> -	key.offset = block_group->start;
> -
> -	ret = btrfs_search_slot(trans, tree_root, &key, path, -1, 1);
> -	if (ret < 0)
> +	ret = btrfs_remove_free_space_inode(trans, inode, block_group);
> +	if (ret)
>  		goto out;
> -	if (ret > 0)
> -		btrfs_release_path(path);
> -	if (ret == 0) {
> -		ret = btrfs_del_item(trans, tree_root, path);
> -		if (ret)
> -			goto out;
> -		btrfs_release_path(path);
> -	}
>  
>  	spin_lock(&fs_info->block_group_cache_lock);
>  	rb_erase(&block_group->cache_node,
> diff --git a/fs/btrfs/free-space-cache.c b/fs/btrfs/free-space-cache.c
> index 3acf935536ea..5d357786c9da 100644
> --- a/fs/btrfs/free-space-cache.c
> +++ b/fs/btrfs/free-space-cache.c
> @@ -207,6 +207,65 @@ int create_free_space_inode(struct btrfs_trans_handle *trans,
>  					 ino, block_group->start);
>  }
>  
> +/*
> + * inode is an optional sink: if it is NULL, btrfs_remove_free_space_inode
> + * handles lookup, otherwise it takes ownership and iputs the inode.
> + * Don't reuse an inode pointer after passing it into this function.
> + */
> +int btrfs_remove_free_space_inode(struct btrfs_trans_handle *trans,
> +				  struct inode *inode,
> +				  struct btrfs_block_group *block_group)
> +{
> +	struct btrfs_path *path;
> +	struct btrfs_key key;
> +	int ret = 0;
> +
> +	path = btrfs_alloc_path();
> +	if (!path)
> +		return -ENOMEM;
> +
> +	if (!inode)
> +		inode = lookup_free_space_inode(block_group, path);
> +	if (IS_ERR(inode)) {
> +		if (PTR_ERR(inode) != -ENOENT)
> +			ret = PTR_ERR(inode);
> +		goto out;
> +	}
> +	ret = btrfs_orphan_add(trans, BTRFS_I(inode));
> +	if (ret) {
> +		btrfs_add_delayed_iput(inode);
> +		goto out;
> +	}
> +	clear_nlink(inode);
> +	/* One for the block groups ref */
> +	spin_lock(&block_group->lock);
> +	if (block_group->iref) {
> +		block_group->iref = 0;
> +		block_group->inode = NULL;
> +		spin_unlock(&block_group->lock);
> +		iput(inode);
> +	} else {
> +		spin_unlock(&block_group->lock);
> +	}
> +	/* One for the lookup ref */
> +	btrfs_add_delayed_iput(inode);
> +
> +	key.objectid = BTRFS_FREE_SPACE_OBJECTID;
> +	key.type = 0;
> +	key.offset = block_group->start;
> +	ret = btrfs_search_slot(trans, trans->fs_info->tree_root, &key, path,
> +				-1, 1);
> +	if (ret) {
> +		if (ret > 0)
> +			ret = 0;
> +		goto out;
> +	}
> +	ret = btrfs_del_item(trans, trans->fs_info->tree_root, path);
> +out:
> +	btrfs_free_path(path);
> +	return ret;
> +}
> +
>  int btrfs_check_trunc_cache_free_space(struct btrfs_fs_info *fs_info,
>  				       struct btrfs_block_rsv *rsv)
>  {
> @@ -3997,6 +4056,28 @@ bool btrfs_free_space_cache_v1_active(struct btrfs_fs_info *fs_info)
>  	return btrfs_super_cache_generation(fs_info->super_copy);
>  }
>  
> +static int cleanup_free_space_cache_v1(struct btrfs_fs_info *fs_info,
> +				       struct btrfs_trans_handle *trans)
> +{
> +	struct btrfs_block_group *block_group;
> +	struct rb_node *node;
> +	int ret;
> +
> +	btrfs_info(fs_info, "cleaning free space cache v1");
> +
> +	node = rb_first(&fs_info->block_group_cache_tree);
> +	while (node) {
> +		block_group = rb_entry(node, struct btrfs_block_group,
> +				       cache_node);
> +		ret = btrfs_remove_free_space_inode(trans, NULL, block_group);
> +		if (ret)
> +			goto out;
> +		node = rb_next(node);
> +	}
> +out:
> +	return ret;
> +}
> +
>  int btrfs_set_free_space_cache_v1_active(struct btrfs_fs_info *fs_info,
>  					 bool active)
>  {
> @@ -4004,18 +4085,30 @@ int btrfs_set_free_space_cache_v1_active(struct btrfs_fs_info *fs_info,
>  	int ret;
>  
>  	/*
> -	 * update_super_roots will appropriately set
> -	 * fs_info->super_copy->cache_generation based on the SPACE_CACHE
> -	 * option, so all we have to do is trigger a transaction commit.
> +	 * update_super_roots will appropriately set or unset
> +	 * fs_info->super_copy->cache_generation based on SPACE_CACHE and
> +	 * BTRFS_FS_CLEANUP_SPACE_CACHE_V1. For this reason, we need a
> +	 * transaction commit whether we are enabling space cache v1 and don't
> +	 * have any other work to do, or are disabling it and removing free
> +	 * space inodes.
>  	 */
>  	trans = btrfs_start_transaction(fs_info->tree_root, 0);
>  	if (IS_ERR(trans))
>  		return PTR_ERR(trans);
>  
> -	if (!active)
> +	if (!active) {
>  		set_bit(BTRFS_FS_CLEANUP_SPACE_CACHE_V1, &fs_info->flags);
> +		ret = cleanup_free_space_cache_v1(fs_info, trans);
> +		if (ret)
> +			goto abort;
> +	}
>  
>  	ret = btrfs_commit_transaction(trans);
> +	goto out;
> +abort:
> +	btrfs_abort_transaction(trans, ret);
> +	btrfs_end_transaction(trans);
> +out:
>  	clear_bit(BTRFS_FS_CLEANUP_SPACE_CACHE_V1, &fs_info->flags);
>  	return ret;
>  }
> diff --git a/fs/btrfs/free-space-cache.h b/fs/btrfs/free-space-cache.h
> index 5c546898ded9..8df4b2925eca 100644
> --- a/fs/btrfs/free-space-cache.h
> +++ b/fs/btrfs/free-space-cache.h
> @@ -84,6 +84,9 @@ struct inode *lookup_free_space_inode(struct btrfs_block_group *block_group,
>  int create_free_space_inode(struct btrfs_trans_handle *trans,
>  			    struct btrfs_block_group *block_group,
>  			    struct btrfs_path *path);
> +int btrfs_remove_free_space_inode(struct btrfs_trans_handle *trans,
> +				  struct inode *inode,
> +				  struct btrfs_block_group *block_group);
>  
>  int btrfs_check_trunc_cache_free_space(struct btrfs_fs_info *fs_info,
>  				       struct btrfs_block_rsv *rsv);
> -- 
> 2.24.1



^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v5 09/10] btrfs: remove free space items when disabling space cache v1
  2020-10-29 11:07   ` Wang Yugui
@ 2020-10-29 12:25     ` Wang Yugui
  0 siblings, 0 replies; 16+ messages in thread
From: Wang Yugui @ 2020-10-29 12:25 UTC (permalink / raw)
  To: Boris Burkov; +Cc: linux-btrfs, kernel-team

Hi,

After mkfs.btrfs with/without -R free-space-tree, 
the result of 'btrfs inspect-internal dump-super' is

cache_generation        18446744073709551615	=> -1LL
uuid_tree_generation    0

No needs of 'cleaning free space cache v1'  when 
cache_generation is -1LL?

Best Regards
Wang Yugui (wangyugui@e16-tech.com)
2020/10/29

> Hi,
> 
> 'mkfs.btrfs -R free-space-tree'  still create 'space cache v1'?
> 
> #mkfs.btrfs-5.9 -R free-space-tree /dev/sdb1 -f
> 
> 1st time mount
> #mount /dev/sdb1 /btrfs/
> #dmesg
> BTRFS info (device sdb1): cleaning free space cache v1
> #umount
> 
> 2nd time mount
> #mount /dev/sdb1 /btrfs/
> #dmesg
> NO BTRFS info (device sdb1): cleaning free space cache v1
> #umount
> 
> Best Regards
> Wang Yugui (wangyugui@e16-tech.com)
> 2020/10/29
> 
> > When the file system transitions from space cache v1 to v2 or to
> > nospace_cache, it removes the old cached data, but does not remove
> > the FREE_SPACE items nor the free space inodes they point to. This
> > doesn't cause any issues besides being a bit inefficient, since these
> > items no longer do anything useful.
> > 
> > To fix it, when we are mounting, and plan to disable the space cache,
> > destroy each block group's free space item and free space inode.
> > The code to remove the items is lifted from the existing use case of
> > removing the block group, with a light adaptation to handle whether or
> > not we have already looked up the free space inode.
> > 
> > References: https://github.com/btrfs/btrfs-todo/issues/5
> > Signed-off-by: Boris Burkov <boris@bur.io>
> > ---
> >  fs/btrfs/block-group.c      |  39 +-------------
> >  fs/btrfs/free-space-cache.c | 101 ++++++++++++++++++++++++++++++++++--
> >  fs/btrfs/free-space-cache.h |   3 ++
> >  3 files changed, 102 insertions(+), 41 deletions(-)
> > 
> > diff --git a/fs/btrfs/block-group.c b/fs/btrfs/block-group.c
> > index c0f1d6818df7..8938b11a3339 100644
> > --- a/fs/btrfs/block-group.c
> > +++ b/fs/btrfs/block-group.c
> > @@ -892,8 +892,6 @@ int btrfs_remove_block_group(struct btrfs_trans_handle *trans,
> >  	struct btrfs_path *path;
> >  	struct btrfs_block_group *block_group;
> >  	struct btrfs_free_cluster *cluster;
> > -	struct btrfs_root *tree_root = fs_info->tree_root;
> > -	struct btrfs_key key;
> >  	struct inode *inode;
> >  	struct kobject *kobj = NULL;
> >  	int ret;
> > @@ -971,42 +969,9 @@ int btrfs_remove_block_group(struct btrfs_trans_handle *trans,
> >  	spin_unlock(&trans->transaction->dirty_bgs_lock);
> >  	mutex_unlock(&trans->transaction->cache_write_mutex);
> >  
> > -	if (!IS_ERR(inode)) {
> > -		ret = btrfs_orphan_add(trans, BTRFS_I(inode));
> > -		if (ret) {
> > -			btrfs_add_delayed_iput(inode);
> > -			goto out;
> > -		}
> > -		clear_nlink(inode);
> > -		/* One for the block groups ref */
> > -		spin_lock(&block_group->lock);
> > -		if (block_group->iref) {
> > -			block_group->iref = 0;
> > -			block_group->inode = NULL;
> > -			spin_unlock(&block_group->lock);
> > -			iput(inode);
> > -		} else {
> > -			spin_unlock(&block_group->lock);
> > -		}
> > -		/* One for our lookup ref */
> > -		btrfs_add_delayed_iput(inode);
> > -	}
> > -
> > -	key.objectid = BTRFS_FREE_SPACE_OBJECTID;
> > -	key.type = 0;
> > -	key.offset = block_group->start;
> > -
> > -	ret = btrfs_search_slot(trans, tree_root, &key, path, -1, 1);
> > -	if (ret < 0)
> > +	ret = btrfs_remove_free_space_inode(trans, inode, block_group);
> > +	if (ret)
> >  		goto out;
> > -	if (ret > 0)
> > -		btrfs_release_path(path);
> > -	if (ret == 0) {
> > -		ret = btrfs_del_item(trans, tree_root, path);
> > -		if (ret)
> > -			goto out;
> > -		btrfs_release_path(path);
> > -	}
> >  
> >  	spin_lock(&fs_info->block_group_cache_lock);
> >  	rb_erase(&block_group->cache_node,
> > diff --git a/fs/btrfs/free-space-cache.c b/fs/btrfs/free-space-cache.c
> > index 3acf935536ea..5d357786c9da 100644
> > --- a/fs/btrfs/free-space-cache.c
> > +++ b/fs/btrfs/free-space-cache.c
> > @@ -207,6 +207,65 @@ int create_free_space_inode(struct btrfs_trans_handle *trans,
> >  					 ino, block_group->start);
> >  }
> >  
> > +/*
> > + * inode is an optional sink: if it is NULL, btrfs_remove_free_space_inode
> > + * handles lookup, otherwise it takes ownership and iputs the inode.
> > + * Don't reuse an inode pointer after passing it into this function.
> > + */
> > +int btrfs_remove_free_space_inode(struct btrfs_trans_handle *trans,
> > +				  struct inode *inode,
> > +				  struct btrfs_block_group *block_group)
> > +{
> > +	struct btrfs_path *path;
> > +	struct btrfs_key key;
> > +	int ret = 0;
> > +
> > +	path = btrfs_alloc_path();
> > +	if (!path)
> > +		return -ENOMEM;
> > +
> > +	if (!inode)
> > +		inode = lookup_free_space_inode(block_group, path);
> > +	if (IS_ERR(inode)) {
> > +		if (PTR_ERR(inode) != -ENOENT)
> > +			ret = PTR_ERR(inode);
> > +		goto out;
> > +	}
> > +	ret = btrfs_orphan_add(trans, BTRFS_I(inode));
> > +	if (ret) {
> > +		btrfs_add_delayed_iput(inode);
> > +		goto out;
> > +	}
> > +	clear_nlink(inode);
> > +	/* One for the block groups ref */
> > +	spin_lock(&block_group->lock);
> > +	if (block_group->iref) {
> > +		block_group->iref = 0;
> > +		block_group->inode = NULL;
> > +		spin_unlock(&block_group->lock);
> > +		iput(inode);
> > +	} else {
> > +		spin_unlock(&block_group->lock);
> > +	}
> > +	/* One for the lookup ref */
> > +	btrfs_add_delayed_iput(inode);
> > +
> > +	key.objectid = BTRFS_FREE_SPACE_OBJECTID;
> > +	key.type = 0;
> > +	key.offset = block_group->start;
> > +	ret = btrfs_search_slot(trans, trans->fs_info->tree_root, &key, path,
> > +				-1, 1);
> > +	if (ret) {
> > +		if (ret > 0)
> > +			ret = 0;
> > +		goto out;
> > +	}
> > +	ret = btrfs_del_item(trans, trans->fs_info->tree_root, path);
> > +out:
> > +	btrfs_free_path(path);
> > +	return ret;
> > +}
> > +
> >  int btrfs_check_trunc_cache_free_space(struct btrfs_fs_info *fs_info,
> >  				       struct btrfs_block_rsv *rsv)
> >  {
> > @@ -3997,6 +4056,28 @@ bool btrfs_free_space_cache_v1_active(struct btrfs_fs_info *fs_info)
> >  	return btrfs_super_cache_generation(fs_info->super_copy);
> >  }
> >  
> > +static int cleanup_free_space_cache_v1(struct btrfs_fs_info *fs_info,
> > +				       struct btrfs_trans_handle *trans)
> > +{
> > +	struct btrfs_block_group *block_group;
> > +	struct rb_node *node;
> > +	int ret;
> > +
> > +	btrfs_info(fs_info, "cleaning free space cache v1");
> > +
> > +	node = rb_first(&fs_info->block_group_cache_tree);
> > +	while (node) {
> > +		block_group = rb_entry(node, struct btrfs_block_group,
> > +				       cache_node);
> > +		ret = btrfs_remove_free_space_inode(trans, NULL, block_group);
> > +		if (ret)
> > +			goto out;
> > +		node = rb_next(node);
> > +	}
> > +out:
> > +	return ret;
> > +}
> > +
> >  int btrfs_set_free_space_cache_v1_active(struct btrfs_fs_info *fs_info,
> >  					 bool active)
> >  {
> > @@ -4004,18 +4085,30 @@ int btrfs_set_free_space_cache_v1_active(struct btrfs_fs_info *fs_info,
> >  	int ret;
> >  
> >  	/*
> > -	 * update_super_roots will appropriately set
> > -	 * fs_info->super_copy->cache_generation based on the SPACE_CACHE
> > -	 * option, so all we have to do is trigger a transaction commit.
> > +	 * update_super_roots will appropriately set or unset
> > +	 * fs_info->super_copy->cache_generation based on SPACE_CACHE and
> > +	 * BTRFS_FS_CLEANUP_SPACE_CACHE_V1. For this reason, we need a
> > +	 * transaction commit whether we are enabling space cache v1 and don't
> > +	 * have any other work to do, or are disabling it and removing free
> > +	 * space inodes.
> >  	 */
> >  	trans = btrfs_start_transaction(fs_info->tree_root, 0);
> >  	if (IS_ERR(trans))
> >  		return PTR_ERR(trans);
> >  
> > -	if (!active)
> > +	if (!active) {
> >  		set_bit(BTRFS_FS_CLEANUP_SPACE_CACHE_V1, &fs_info->flags);
> > +		ret = cleanup_free_space_cache_v1(fs_info, trans);
> > +		if (ret)
> > +			goto abort;
> > +	}
> >  
> >  	ret = btrfs_commit_transaction(trans);
> > +	goto out;
> > +abort:
> > +	btrfs_abort_transaction(trans, ret);
> > +	btrfs_end_transaction(trans);
> > +out:
> >  	clear_bit(BTRFS_FS_CLEANUP_SPACE_CACHE_V1, &fs_info->flags);
> >  	return ret;
> >  }
> > diff --git a/fs/btrfs/free-space-cache.h b/fs/btrfs/free-space-cache.h
> > index 5c546898ded9..8df4b2925eca 100644
> > --- a/fs/btrfs/free-space-cache.h
> > +++ b/fs/btrfs/free-space-cache.h
> > @@ -84,6 +84,9 @@ struct inode *lookup_free_space_inode(struct btrfs_block_group *block_group,
> >  int create_free_space_inode(struct btrfs_trans_handle *trans,
> >  			    struct btrfs_block_group *block_group,
> >  			    struct btrfs_path *path);
> > +int btrfs_remove_free_space_inode(struct btrfs_trans_handle *trans,
> > +				  struct inode *inode,
> > +				  struct btrfs_block_group *block_group);
> >  
> >  int btrfs_check_trunc_cache_free_space(struct btrfs_fs_info *fs_info,
> >  				       struct btrfs_block_rsv *rsv);
> > -- 
> > 2.24.1
> 



^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2020-10-29 12:25 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-10-27 21:07 Boris Burkov
2020-10-27 21:07 ` [PATCH v5 01/10] btrfs: lift rw mount setup from mount and remount Boris Burkov
2020-10-28 14:08   ` Josef Bacik
2020-10-27 21:07 ` [PATCH v5 02/10] btrfs: cleanup all orphan inodes on ro->rw remount Boris Burkov
2020-10-28 13:52   ` Josef Bacik
2020-10-27 21:07 ` [PATCH v5 03/10] btrfs: create free space tree " Boris Burkov
2020-10-27 21:07 ` [PATCH v5 04/10] btrfs: clear oneshot options on mount and remount Boris Burkov
2020-10-28 13:58   ` Josef Bacik
2020-10-27 21:07 ` [PATCH v5 05/10] btrfs: clear free space tree on ro->rw remount Boris Burkov
2020-10-27 21:08 ` [PATCH v5 06/10] btrfs: keep sb cache_generation consistent with space_cache Boris Burkov
2020-10-27 21:08 ` [PATCH v5 07/10] btrfs: use sb state to print space_cache mount option Boris Burkov
2020-10-27 21:08 ` [PATCH v5 08/10] btrfs: warn when remount will not change the free space tree Boris Burkov
2020-10-27 21:08 ` [PATCH v5 09/10] btrfs: remove free space items when disabling space cache v1 Boris Burkov
2020-10-29 11:07   ` Wang Yugui
2020-10-29 12:25     ` Wang Yugui
2020-10-27 21:08 ` [PATCH v5 10/10] btrfs: skip space_cache v1 setup when not using it Boris Burkov

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.