linux-btrfs.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v6 00/10] btrfs: free space tree mounting fixes
@ 2020-10-29 23:57 Boris Burkov
  2020-10-29 23:57 ` [PATCH v6 01/10] btrfs: lift rw mount setup from mount and remount Boris Burkov
                   ` (10 more replies)
  0 siblings, 11 replies; 12+ messages in thread
From: Boris Burkov @ 2020-10-29 23:57 UTC (permalink / raw)
  To: linux-btrfs, kernel-team

This patch set cleans up issues surrounding enabling and disabling various
free space cache features and their reporting in /proc/mounts.  Because the
improvements became somewhat complex, the series starts by lifting rw mount
logic into a single place.

The first patch is a setup patch that unifies very similar logic between a
normal rw mount and a ro->rw remount. This is a useful setup step for adding
more functionality to ro->rw remounts.

The second patch fixes the omission of orphan inode cleanup on a few trees
during ro->rw remount.

The third patch adds enabling the free space tree to ro->rw remount.

The fourth patch adds a method for clearing oneshot mount options after mount.

The fifth patch adds support for clearing the free space tree on ro->rw remount.

The sixth patch sets up for more accurate /proc/mounts by ensuring that
cache_generation > 0 iff space_cache is enabled.

The seventh patch is the more accurate /proc/mounts logic.

The eighth patch is a convenience kernel message that complains when we skip
changing the free space tree on remount.

The ninth patch removes the space cache v1 free space item and free space
inodes when space cache v1 is disabled (nospace_cache or space_cache=v2).

The tenth patch stops re-creating the free space objects when we are not
using space_cache=v1

changes for v6:
Rebase all patches on newer misc-next
Patch 1/10: Fix unnecessary re-ordering of discard resume and most of mount_rw.
Patch 2/10: No change.
Patch 3/10: No change.
Patch 4/10: Handle unclean rebase.
Patch 5/10: No change.
Patch 6/10: No change.
Patch 7/10: No change.
Patch 8/10: No change.
Patch 9/10: No change.
Patch 10/10: No change.

changes for v5:
Patch 1/10: No change.
Patch 2/10: No change.
Patch 3/10: No change.
Patch 4/10: New.
Patch 5/10: New.
Patch 6/10: Handles remounts for no<->v1. Was Patch 4.
Patch 7/10: No change. Was Patch 5
Patch 8/10: Warns/undoes flags on skipped clear as well as skipped create.
            Was Patch 6.
Patch 9/10: No change. Was Patch 7.
Patch 10/10: No change. Was Patch 8.

changes for v4:
Patch 1/8: New
Patch 2/8: New
Patch 3/8: (was Patch 1) Made much simpler by lifting of Patch 1. Simply add
           free space tree creation to lifted rw mount logic.
Patch 4/8: Split out from old Patch 2. Add an fs_info flag to avoid surprises
           when a different transaction updates the cache_generation.
Patch 5/8: Split out from old Patch 2, but no change logically.
Patch 6/8: Split out from old Patch 1. Formatting fixes.
Patch 7/8: (was Patch 3) Rely on delayed_iput running after orphan cleanup
           (setup in patch 2) to pull iputs out of mount path, per review.
Patch 8/8: No change.

changes for v3:
Patch 1/4: Change failure to warning logging.
Patch 2/4: New; fixes mount option printing.
Patch 3/4: Fix orphan inode vs. delayed iput bug, change remove function
           to take inode as a sink.
Patch 4/4: No changes.

changes for v2:
Patch 1/3: made remount _only_ work in ro->rw case, added comment.
Patch 2/3: added btrfs_ prefix to non-static function, removed bad
           whitespace.


Boris Burkov (10):
  btrfs: lift rw mount setup from mount and remount
  btrfs: cleanup all orphan inodes on ro->rw remount
  btrfs: create free space tree on ro->rw remount
  btrfs: clear oneshot options on mount and remount
  btrfs: clear free space tree on ro->rw remount
  btrfs: keep sb cache_generation consistent with space_cache
  btrfs: use sb state to print space_cache mount option
  btrfs: warn when remount will not change the free space tree
  btrfs: remove free space items when disabling space cache v1
  btrfs: skip space_cache v1 setup when not using it

 fs/btrfs/block-group.c      |  42 +-------
 fs/btrfs/ctree.h            |   3 +
 fs/btrfs/disk-io.c          | 195 +++++++++++++++++++++---------------
 fs/btrfs/disk-io.h          |   2 +
 fs/btrfs/free-space-cache.c | 121 ++++++++++++++++++++++
 fs/btrfs/free-space-cache.h |   6 ++
 fs/btrfs/super.c            |  70 +++++++------
 fs/btrfs/transaction.c      |   2 +
 8 files changed, 286 insertions(+), 155 deletions(-)

-- 
2.24.1


^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH v6 01/10] btrfs: lift rw mount setup from mount and remount
  2020-10-29 23:57 [PATCH v6 00/10] btrfs: free space tree mounting fixes Boris Burkov
@ 2020-10-29 23:57 ` Boris Burkov
  2020-10-29 23:57 ` [PATCH v6 02/10] btrfs: cleanup all orphan inodes on ro->rw remount Boris Burkov
                   ` (9 subsequent siblings)
  10 siblings, 0 replies; 12+ messages in thread
From: Boris Burkov @ 2020-10-29 23:57 UTC (permalink / raw)
  To: linux-btrfs, kernel-team

Mounting rw and remounting from ro to rw naturally share invariants and
functionality which result in a correctly setup rw filesystem. Luckily,
there is even a strong unity in the code which implements them. In
mount's open_ctree, these operations mostly happen after an early return
for ro file systems, and in remount, they happen in a section devoted to
remounting ro->rw, after some remount specific validation passes.

However, there are unfortunately a few differences. There are small
deviations in the order of some of the operations, remount does not
cleanup orphan inodes in root_tree or fs_tree, remount does not create
the free space tree, and remount does not handle "one-shot" mount
options like clear_cache and uuid tree rescan.

Since we want to add building the free space tree to remount, and since
it is possible to leak orphans on a filesystem mounted as ro then
remounted rw (common for the root filesystem when booting), we would
benefit from unifying the logic between the two codepaths.

This patch only lifts the existing common functionality, and leaves a
natural path for fixing the discrepancies.

Signed-off-by: Boris Burkov <boris@bur.io>
---
 fs/btrfs/disk-io.c | 91 ++++++++++++++++++++++++++--------------------
 fs/btrfs/disk-io.h |  1 +
 fs/btrfs/super.c   | 37 +++----------------
 3 files changed, 59 insertions(+), 70 deletions(-)

diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index 601a7ab2adb4..b741a711bad5 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -2894,6 +2894,53 @@ static int btrfs_check_uuid_tree(struct btrfs_fs_info *fs_info)
 	return 0;
 }
 
+/*
+ * Mounting logic specific to read-write file systems. Shared by open_ctree
+ * and btrfs_remount when remounting from read-only to read-write.
+ */
+int btrfs_mount_rw(struct btrfs_fs_info *fs_info)
+{
+	int ret;
+
+	ret = btrfs_cleanup_fs_roots(fs_info);
+	if (ret)
+		goto out;
+
+	mutex_lock(&fs_info->cleaner_mutex);
+	ret = btrfs_recover_relocation(fs_info->tree_root);
+	mutex_unlock(&fs_info->cleaner_mutex);
+	if (ret < 0) {
+		btrfs_warn(fs_info, "failed to recover relocation: %d", ret);
+		goto out;
+	}
+
+	ret = btrfs_resume_balance_async(fs_info);
+	if (ret)
+		goto out;
+
+	ret = btrfs_resume_dev_replace_async(fs_info);
+	if (ret) {
+		btrfs_warn(fs_info, "failed to resume dev_replace");
+		goto out;
+	}
+
+	btrfs_qgroup_rescan_resume(fs_info);
+
+	if (!fs_info->uuid_root) {
+		btrfs_info(fs_info, "creating UUID tree");
+		ret = btrfs_create_uuid_tree(fs_info);
+		if (ret) {
+			btrfs_warn(fs_info,
+				   "failed to create the UUID tree %d",
+				   ret);
+			goto out;
+		}
+	}
+
+out:
+	return ret;
+}
+
 int __cold open_ctree(struct super_block *sb, struct btrfs_fs_devices *fs_devices,
 		      char *options)
 {
@@ -3298,22 +3345,6 @@ int __cold open_ctree(struct super_block *sb, struct btrfs_fs_devices *fs_device
 	if (ret)
 		goto fail_qgroup;
 
-	if (!sb_rdonly(sb)) {
-		ret = btrfs_cleanup_fs_roots(fs_info);
-		if (ret)
-			goto fail_qgroup;
-
-		mutex_lock(&fs_info->cleaner_mutex);
-		ret = btrfs_recover_relocation(tree_root);
-		mutex_unlock(&fs_info->cleaner_mutex);
-		if (ret < 0) {
-			btrfs_warn(fs_info, "failed to recover relocation: %d",
-					ret);
-			err = -EINVAL;
-			goto fail_qgroup;
-		}
-	}
-
 	fs_info->fs_root = btrfs_get_fs_root(fs_info, BTRFS_FS_TREE_OBJECTID, true);
 	if (IS_ERR(fs_info->fs_root)) {
 		err = PTR_ERR(fs_info->fs_root);
@@ -3366,35 +3397,17 @@ int __cold open_ctree(struct super_block *sb, struct btrfs_fs_devices *fs_device
 	}
 	up_read(&fs_info->cleanup_work_sem);
 
-	ret = btrfs_resume_balance_async(fs_info);
-	if (ret) {
-		btrfs_warn(fs_info, "failed to resume balance: %d", ret);
-		close_ctree(fs_info);
-		return ret;
-	}
-
-	ret = btrfs_resume_dev_replace_async(fs_info);
+	ret = btrfs_mount_rw(fs_info);
 	if (ret) {
-		btrfs_warn(fs_info, "failed to resume device replace: %d", ret);
 		close_ctree(fs_info);
 		return ret;
 	}
-
-	btrfs_qgroup_rescan_resume(fs_info);
 	btrfs_discard_resume(fs_info);
 
-	if (!fs_info->uuid_root) {
-		btrfs_info(fs_info, "creating UUID tree");
-		ret = btrfs_create_uuid_tree(fs_info);
-		if (ret) {
-			btrfs_warn(fs_info,
-				"failed to create the UUID tree: %d", ret);
-			close_ctree(fs_info);
-			return ret;
-		}
-	} else if (btrfs_test_opt(fs_info, RESCAN_UUID_TREE) ||
-		   fs_info->generation !=
-				btrfs_super_uuid_tree_generation(disk_super)) {
+	if (fs_info->uuid_root &&
+	    (btrfs_test_opt(fs_info, RESCAN_UUID_TREE) ||
+	     fs_info->generation !=
+			btrfs_super_uuid_tree_generation(disk_super))) {
 		btrfs_info(fs_info, "checking UUID tree");
 		ret = btrfs_check_uuid_tree(fs_info);
 		if (ret) {
diff --git a/fs/btrfs/disk-io.h b/fs/btrfs/disk-io.h
index 182540bdcea0..a4e6cdbe3a48 100644
--- a/fs/btrfs/disk-io.h
+++ b/fs/btrfs/disk-io.h
@@ -50,6 +50,7 @@ struct extent_buffer *btrfs_find_create_tree_block(
 						struct btrfs_fs_info *fs_info,
 						u64 bytenr);
 void btrfs_clean_tree_block(struct extent_buffer *buf);
+int btrfs_mount_rw(struct btrfs_fs_info *fs_info);
 int __cold open_ctree(struct super_block *sb,
 	       struct btrfs_fs_devices *fs_devices,
 	       char *options);
diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c
index 1ffa50bae1dd..22299fc13b6d 100644
--- a/fs/btrfs/super.c
+++ b/fs/btrfs/super.c
@@ -1880,7 +1880,6 @@ static inline void btrfs_remount_cleanup(struct btrfs_fs_info *fs_info,
 static int btrfs_remount(struct super_block *sb, int *flags, char *data)
 {
 	struct btrfs_fs_info *fs_info = btrfs_sb(sb);
-	struct btrfs_root *root = fs_info->tree_root;
 	unsigned old_flags = sb->s_flags;
 	unsigned long old_opts = fs_info->mount_opt;
 	unsigned long old_compress_type = fs_info->compress_type;
@@ -1973,39 +1972,15 @@ static int btrfs_remount(struct super_block *sb, int *flags, char *data)
 			goto restore;
 		}
 
-		ret = btrfs_cleanup_fs_roots(fs_info);
-		if (ret)
-			goto restore;
-
-		/* recover relocation */
-		mutex_lock(&fs_info->cleaner_mutex);
-		ret = btrfs_recover_relocation(root);
-		mutex_unlock(&fs_info->cleaner_mutex);
-		if (ret)
-			goto restore;
-
-		ret = btrfs_resume_balance_async(fs_info);
+		/*
+		 * NOTE: when remounting with a change that does writes, don't
+		 * put it anywhere above this point, as we are not sure to be
+		 * safe to write until we pass the above checks.
+		 */
+		ret = btrfs_mount_rw(fs_info);
 		if (ret)
 			goto restore;
 
-		ret = btrfs_resume_dev_replace_async(fs_info);
-		if (ret) {
-			btrfs_warn(fs_info, "failed to resume dev_replace");
-			goto restore;
-		}
-
-		btrfs_qgroup_rescan_resume(fs_info);
-
-		if (!fs_info->uuid_root) {
-			btrfs_info(fs_info, "creating UUID tree");
-			ret = btrfs_create_uuid_tree(fs_info);
-			if (ret) {
-				btrfs_warn(fs_info,
-					   "failed to create the UUID tree %d",
-					   ret);
-				goto restore;
-			}
-		}
 		sb->s_flags &= ~SB_RDONLY;
 
 		set_bit(BTRFS_FS_OPEN, &fs_info->flags);
-- 
2.24.1


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH v6 02/10] btrfs: cleanup all orphan inodes on ro->rw remount
  2020-10-29 23:57 [PATCH v6 00/10] btrfs: free space tree mounting fixes Boris Burkov
  2020-10-29 23:57 ` [PATCH v6 01/10] btrfs: lift rw mount setup from mount and remount Boris Burkov
@ 2020-10-29 23:57 ` Boris Burkov
  2020-10-29 23:57 ` [PATCH v6 03/10] btrfs: create free space tree " Boris Burkov
                   ` (8 subsequent siblings)
  10 siblings, 0 replies; 12+ messages in thread
From: Boris Burkov @ 2020-10-29 23:57 UTC (permalink / raw)
  To: linux-btrfs, kernel-team

When we mount a rw file system, we clean the orphan inodes in the
filesystem trees, and also on the tree_root and fs_root. However, when
we remount a ro file system rw, we only clean the former. Move the calls
to btrfs_orphan_cleanup() on tree_root and fs_root to the shared rw
mount routine to effectively add them on ro->rw remount.

Signed-off-by: Boris Burkov <boris@bur.io>
---
 fs/btrfs/disk-io.c | 17 ++++++++---------
 1 file changed, 8 insertions(+), 9 deletions(-)

diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index b741a711bad5..e42548287161 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -2906,6 +2906,14 @@ int btrfs_mount_rw(struct btrfs_fs_info *fs_info)
 	if (ret)
 		goto out;
 
+	down_read(&fs_info->cleanup_work_sem);
+	if ((ret = btrfs_orphan_cleanup(fs_info->fs_root)) ||
+	    (ret = btrfs_orphan_cleanup(fs_info->tree_root))) {
+		up_read(&fs_info->cleanup_work_sem);
+		goto out;
+	}
+	up_read(&fs_info->cleanup_work_sem);
+
 	mutex_lock(&fs_info->cleaner_mutex);
 	ret = btrfs_recover_relocation(fs_info->tree_root);
 	mutex_unlock(&fs_info->cleaner_mutex);
@@ -3388,15 +3396,6 @@ int __cold open_ctree(struct super_block *sb, struct btrfs_fs_devices *fs_device
 		}
 	}
 
-	down_read(&fs_info->cleanup_work_sem);
-	if ((ret = btrfs_orphan_cleanup(fs_info->fs_root)) ||
-	    (ret = btrfs_orphan_cleanup(fs_info->tree_root))) {
-		up_read(&fs_info->cleanup_work_sem);
-		close_ctree(fs_info);
-		return ret;
-	}
-	up_read(&fs_info->cleanup_work_sem);
-
 	ret = btrfs_mount_rw(fs_info);
 	if (ret) {
 		close_ctree(fs_info);
-- 
2.24.1


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH v6 03/10] btrfs: create free space tree on ro->rw remount
  2020-10-29 23:57 [PATCH v6 00/10] btrfs: free space tree mounting fixes Boris Burkov
  2020-10-29 23:57 ` [PATCH v6 01/10] btrfs: lift rw mount setup from mount and remount Boris Burkov
  2020-10-29 23:57 ` [PATCH v6 02/10] btrfs: cleanup all orphan inodes on ro->rw remount Boris Burkov
@ 2020-10-29 23:57 ` Boris Burkov
  2020-10-29 23:57 ` [PATCH v6 04/10] btrfs: clear oneshot options on mount and remount Boris Burkov
                   ` (7 subsequent siblings)
  10 siblings, 0 replies; 12+ messages in thread
From: Boris Burkov @ 2020-10-29 23:57 UTC (permalink / raw)
  To: linux-btrfs, kernel-team

When a user attempts to remount a btrfs filesystem with
'mount -o remount,space_cache=v2', that operation silently succeeds.
Unfortunately, this is misleading, because the remount does not create
the free space tree. /proc/mounts will incorrectly show space_cache=v2,
but on the next mount, the file system will revert to the old
space_cache.

For now, we handle only the easier case, where the existing mount is
read-only and the new mount is read-write. In that case, we can create
the free space tree without contending with the block groups changing
as we go.

Signed-off-by: Boris Burkov <boris@bur.io>
---
 fs/btrfs/disk-io.c | 23 +++++++++++------------
 1 file changed, 11 insertions(+), 12 deletions(-)

diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index e42548287161..69a878d8cc79 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -2922,6 +2922,17 @@ int btrfs_mount_rw(struct btrfs_fs_info *fs_info)
 		goto out;
 	}
 
+	if (btrfs_test_opt(fs_info, FREE_SPACE_TREE) &&
+	    !btrfs_fs_compat_ro(fs_info, FREE_SPACE_TREE)) {
+		btrfs_info(fs_info, "creating free space tree");
+		ret = btrfs_create_free_space_tree(fs_info);
+		if (ret) {
+			btrfs_warn(fs_info,
+				"failed to create free space tree: %d", ret);
+			goto out;
+		}
+	}
+
 	ret = btrfs_resume_balance_async(fs_info);
 	if (ret)
 		goto out;
@@ -3384,18 +3395,6 @@ int __cold open_ctree(struct super_block *sb, struct btrfs_fs_devices *fs_device
 		}
 	}
 
-	if (btrfs_test_opt(fs_info, FREE_SPACE_TREE) &&
-	    !btrfs_fs_compat_ro(fs_info, FREE_SPACE_TREE)) {
-		btrfs_info(fs_info, "creating free space tree");
-		ret = btrfs_create_free_space_tree(fs_info);
-		if (ret) {
-			btrfs_warn(fs_info,
-				"failed to create free space tree: %d", ret);
-			close_ctree(fs_info);
-			return ret;
-		}
-	}
-
 	ret = btrfs_mount_rw(fs_info);
 	if (ret) {
 		close_ctree(fs_info);
-- 
2.24.1


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH v6 04/10] btrfs: clear oneshot options on mount and remount
  2020-10-29 23:57 [PATCH v6 00/10] btrfs: free space tree mounting fixes Boris Burkov
                   ` (2 preceding siblings ...)
  2020-10-29 23:57 ` [PATCH v6 03/10] btrfs: create free space tree " Boris Burkov
@ 2020-10-29 23:57 ` Boris Burkov
  2020-10-29 23:57 ` [PATCH v6 05/10] btrfs: clear free space tree on ro->rw remount Boris Burkov
                   ` (6 subsequent siblings)
  10 siblings, 0 replies; 12+ messages in thread
From: Boris Burkov @ 2020-10-29 23:57 UTC (permalink / raw)
  To: linux-btrfs, kernel-team

Some options only apply during mount time and are cleared at the end
of mount. For now, the example is USEBACKUPROOT, but CLEAR_CACHE also
fits the bill, and this is a preparation patch for also clearing that
option.

One subtlety is that the current code only resets USEBACKUPROOT on rw
mounts, but the option is meaningfully "consumed" by a ro mount, so it
feels appropriate to clear in that case as well. A subsequent read-write
remount would not go through open_ctree, which is the only place that
checks the option, so the change should be benign.

Signed-off-by: Boris Burkov <boris@bur.io>
---
 fs/btrfs/disk-io.c | 14 +++++++++++++-
 fs/btrfs/disk-io.h |  1 +
 fs/btrfs/super.c   |  1 +
 3 files changed, 15 insertions(+), 1 deletion(-)

diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index 69a878d8cc79..63f1e37f0e22 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -2894,6 +2894,16 @@ static int btrfs_check_uuid_tree(struct btrfs_fs_info *fs_info)
 	return 0;
 }
 
+/*
+ * Some options only have meaning at mount time and shouldn't persist across
+ * remounts, or be displayed. Clear these at the end of mount and remount
+ * code paths.
+ */
+void btrfs_clear_oneshot_options(struct btrfs_fs_info *fs_info)
+{
+	btrfs_clear_opt(fs_info->mount_opt, USEBACKUPROOT);
+}
+
 /*
  * Mounting logic specific to read-write file systems. Shared by open_ctree
  * and btrfs_remount when remounting from read-only to read-write.
@@ -3373,7 +3383,7 @@ int __cold open_ctree(struct super_block *sb, struct btrfs_fs_devices *fs_device
 	}
 
 	if (sb_rdonly(sb))
-		return 0;
+		goto clear_oneshot;
 
 	if (btrfs_test_opt(fs_info, CLEAR_CACHE) &&
 	    btrfs_fs_compat_ro(fs_info, FREE_SPACE_TREE)) {
@@ -3417,6 +3427,8 @@ int __cold open_ctree(struct super_block *sb, struct btrfs_fs_devices *fs_device
 	}
 	set_bit(BTRFS_FS_OPEN, &fs_info->flags);
 
+clear_oneshot:
+	btrfs_clear_oneshot_options(fs_info);
 	return 0;
 
 fail_qgroup:
diff --git a/fs/btrfs/disk-io.h b/fs/btrfs/disk-io.h
index a4e6cdbe3a48..22df6a570aaf 100644
--- a/fs/btrfs/disk-io.h
+++ b/fs/btrfs/disk-io.h
@@ -50,6 +50,7 @@ struct extent_buffer *btrfs_find_create_tree_block(
 						struct btrfs_fs_info *fs_info,
 						u64 bytenr);
 void btrfs_clean_tree_block(struct extent_buffer *buf);
+void btrfs_clear_oneshot_options(struct btrfs_fs_info *fs_info);
 int btrfs_mount_rw(struct btrfs_fs_info *fs_info);
 int __cold open_ctree(struct super_block *sb,
 	       struct btrfs_fs_devices *fs_devices,
diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c
index 22299fc13b6d..3d56b98e152d 100644
--- a/fs/btrfs/super.c
+++ b/fs/btrfs/super.c
@@ -1994,6 +1994,7 @@ static int btrfs_remount(struct super_block *sb, int *flags, char *data)
 
 	wake_up_process(fs_info->transaction_kthread);
 	btrfs_remount_cleanup(fs_info, old_opts);
+	btrfs_clear_oneshot_options(fs_info);
 	clear_bit(BTRFS_FS_STATE_REMOUNTING, &fs_info->fs_state);
 
 	return 0;
-- 
2.24.1


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH v6 05/10] btrfs: clear free space tree on ro->rw remount
  2020-10-29 23:57 [PATCH v6 00/10] btrfs: free space tree mounting fixes Boris Burkov
                   ` (3 preceding siblings ...)
  2020-10-29 23:57 ` [PATCH v6 04/10] btrfs: clear oneshot options on mount and remount Boris Burkov
@ 2020-10-29 23:57 ` Boris Burkov
  2020-10-29 23:57 ` [PATCH v6 06/10] btrfs: keep sb cache_generation consistent with space_cache Boris Burkov
                   ` (5 subsequent siblings)
  10 siblings, 0 replies; 12+ messages in thread
From: Boris Burkov @ 2020-10-29 23:57 UTC (permalink / raw)
  To: linux-btrfs, kernel-team

A user might want to revert to v1 or nospace_cache on a root filesystem,
and much like turning on the free space tree, that can only be done
remounting from ro->rw. Support clearing the free space tree on such
mounts by moving it into the shared remount logic.

Since the CLEAR_CACHE option sticks around across remounts, this change
would result in clearing the tree for ever on every remount, which is
not desirable. To fix that, add CLEAR_CACHE to the oneshot options we
clear at mount end, which has the other bonus of not cluttering the
/proc/mounts output with clear_cache.

Signed-off-by: Boris Burkov <boris@bur.io>
---
 fs/btrfs/disk-io.c | 43 ++++++++++++++++++++++---------------------
 1 file changed, 22 insertions(+), 21 deletions(-)

diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index 63f1e37f0e22..d6b73701563c 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -2911,6 +2911,28 @@ void btrfs_clear_oneshot_options(struct btrfs_fs_info *fs_info)
 int btrfs_mount_rw(struct btrfs_fs_info *fs_info)
 {
 	int ret;
+	bool cache_opt = btrfs_test_opt(fs_info, SPACE_CACHE);
+	bool clear_free_space_tree = false;
+
+	if (btrfs_test_opt(fs_info, CLEAR_CACHE) &&
+	    btrfs_fs_compat_ro(fs_info, FREE_SPACE_TREE)) {
+		clear_free_space_tree = true;
+	} else if (btrfs_fs_compat_ro(fs_info, FREE_SPACE_TREE) &&
+		   !btrfs_fs_compat_ro(fs_info, FREE_SPACE_TREE_VALID)) {
+		btrfs_warn(fs_info, "free space tree is invalid");
+		clear_free_space_tree = true;
+	}
+
+	if (clear_free_space_tree) {
+		btrfs_info(fs_info, "clearing free space tree");
+		ret = btrfs_clear_free_space_tree(fs_info);
+		if (ret) {
+			btrfs_warn(fs_info,
+				   "failed to clear free space tree: %d", ret);
+			close_ctree(fs_info);
+			return ret;
+		}
+	}
 
 	ret = btrfs_cleanup_fs_roots(fs_info);
 	if (ret)
@@ -2985,7 +3007,6 @@ int __cold open_ctree(struct super_block *sb, struct btrfs_fs_devices *fs_device
 	struct btrfs_root *chunk_root;
 	int ret;
 	int err = -EINVAL;
-	int clear_free_space_tree = 0;
 	int level;
 
 	ret = init_mount_fs_info(fs_info, sb);
@@ -3385,26 +3406,6 @@ int __cold open_ctree(struct super_block *sb, struct btrfs_fs_devices *fs_device
 	if (sb_rdonly(sb))
 		goto clear_oneshot;
 
-	if (btrfs_test_opt(fs_info, CLEAR_CACHE) &&
-	    btrfs_fs_compat_ro(fs_info, FREE_SPACE_TREE)) {
-		clear_free_space_tree = 1;
-	} else if (btrfs_fs_compat_ro(fs_info, FREE_SPACE_TREE) &&
-		   !btrfs_fs_compat_ro(fs_info, FREE_SPACE_TREE_VALID)) {
-		btrfs_warn(fs_info, "free space tree is invalid");
-		clear_free_space_tree = 1;
-	}
-
-	if (clear_free_space_tree) {
-		btrfs_info(fs_info, "clearing free space tree");
-		ret = btrfs_clear_free_space_tree(fs_info);
-		if (ret) {
-			btrfs_warn(fs_info,
-				   "failed to clear free space tree: %d", ret);
-			close_ctree(fs_info);
-			return ret;
-		}
-	}
-
 	ret = btrfs_mount_rw(fs_info);
 	if (ret) {
 		close_ctree(fs_info);
-- 
2.24.1


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH v6 06/10] btrfs: keep sb cache_generation consistent with space_cache
  2020-10-29 23:57 [PATCH v6 00/10] btrfs: free space tree mounting fixes Boris Burkov
                   ` (4 preceding siblings ...)
  2020-10-29 23:57 ` [PATCH v6 05/10] btrfs: clear free space tree on ro->rw remount Boris Burkov
@ 2020-10-29 23:57 ` Boris Burkov
  2020-10-29 23:57 ` [PATCH v6 07/10] btrfs: use sb state to print space_cache mount option Boris Burkov
                   ` (4 subsequent siblings)
  10 siblings, 0 replies; 12+ messages in thread
From: Boris Burkov @ 2020-10-29 23:57 UTC (permalink / raw)
  To: linux-btrfs, kernel-team

When mounting, btrfs uses the cache_generation in the super block to
determine if space cache v1 is in use. However, by mounting with
nospace_cache or space_cache=v2, it is possible to disable space cache
v1, which does not result in un-setting cache_generation back to 0.

In order to base some logic, like mount option printing in /proc/mounts,
on the current state of the space cache rather than just the values of
the mount option, keep the value of cache_generation consistent with the
status of space cache v1.

We ensure that cache_generation > 0 iff the file system is using
space_cache v1. This requires committing a transaction on any mount
which changes whether we are using v1. (v1->nospace_cache, v1->v2,
nospace_cache->v1, v2->v1).

Since the mechanism for writing out the cache generation is transaction
commit, but we want some finer grained control over when we un-set it,
we can't just rely on the SPACE_CACHE mount option, and introduce an
fs_info flag that mount can use when it wants to unset the generation.

Signed-off-by: Boris Burkov <boris@bur.io>
---
 fs/btrfs/ctree.h            |  3 +++
 fs/btrfs/disk-io.c          |  7 +++++++
 fs/btrfs/free-space-cache.c | 28 ++++++++++++++++++++++++++++
 fs/btrfs/free-space-cache.h |  3 +++
 fs/btrfs/super.c            | 10 +++++++---
 fs/btrfs/transaction.c      |  2 ++
 6 files changed, 50 insertions(+), 3 deletions(-)

diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
index 8a83bce3225c..a99918b7a2af 100644
--- a/fs/btrfs/ctree.h
+++ b/fs/btrfs/ctree.h
@@ -559,6 +559,9 @@ enum {
 
 	/* Indicate that the discard workqueue can service discards. */
 	BTRFS_FS_DISCARD_RUNNING,
+
+	/* Indicate that we need to cleanup space cache v1 */
+	BTRFS_FS_CLEANUP_SPACE_CACHE_V1,
 };
 
 /*
diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index d6b73701563c..a256b1a064a8 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -2965,6 +2965,12 @@ int btrfs_mount_rw(struct btrfs_fs_info *fs_info)
 		}
 	}
 
+	if (cache_opt != btrfs_free_space_cache_v1_active(fs_info)) {
+		ret = btrfs_set_free_space_cache_v1_active(fs_info, cache_opt);
+		if (ret)
+			goto out;
+	}
+
 	ret = btrfs_resume_balance_async(fs_info);
 	if (ret)
 		goto out;
@@ -3426,6 +3432,7 @@ int __cold open_ctree(struct super_block *sb, struct btrfs_fs_devices *fs_device
 			return ret;
 		}
 	}
+
 	set_bit(BTRFS_FS_OPEN, &fs_info->flags);
 
 clear_oneshot:
diff --git a/fs/btrfs/free-space-cache.c b/fs/btrfs/free-space-cache.c
index 5ea36a06e514..5185e798cc57 100644
--- a/fs/btrfs/free-space-cache.c
+++ b/fs/btrfs/free-space-cache.c
@@ -3992,6 +3992,34 @@ int btrfs_write_out_ino_cache(struct btrfs_root *root,
 	return ret;
 }
 
+bool btrfs_free_space_cache_v1_active(struct btrfs_fs_info *fs_info)
+{
+	return btrfs_super_cache_generation(fs_info->super_copy);
+}
+
+int btrfs_set_free_space_cache_v1_active(struct btrfs_fs_info *fs_info,
+					 bool active)
+{
+	struct btrfs_trans_handle *trans;
+	int ret;
+
+	/*
+	 * update_super_roots will appropriately set
+	 * fs_info->super_copy->cache_generation based on the SPACE_CACHE
+	 * option, so all we have to do is trigger a transaction commit.
+	 */
+	trans = btrfs_start_transaction(fs_info->tree_root, 0);
+	if (IS_ERR(trans))
+		return PTR_ERR(trans);
+
+	if (!active)
+		set_bit(BTRFS_FS_CLEANUP_SPACE_CACHE_V1, &fs_info->flags);
+
+	ret = btrfs_commit_transaction(trans);
+	clear_bit(BTRFS_FS_CLEANUP_SPACE_CACHE_V1, &fs_info->flags);
+	return ret;
+}
+
 #ifdef CONFIG_BTRFS_FS_RUN_SANITY_TESTS
 /*
  * Use this if you need to make a bitmap or extent entry specifically, it
diff --git a/fs/btrfs/free-space-cache.h b/fs/btrfs/free-space-cache.h
index e3d5e0ad8f8e..5c546898ded9 100644
--- a/fs/btrfs/free-space-cache.h
+++ b/fs/btrfs/free-space-cache.h
@@ -148,6 +148,9 @@ int btrfs_trim_block_group_bitmaps(struct btrfs_block_group *block_group,
 				   u64 *trimmed, u64 start, u64 end, u64 minlen,
 				   u64 maxlen, bool async);
 
+bool btrfs_free_space_cache_v1_active(struct btrfs_fs_info *fs_info);
+int btrfs_set_free_space_cache_v1_active(struct btrfs_fs_info *fs_info,
+					 bool active);
 /* Support functions for running our sanity tests */
 #ifdef CONFIG_BTRFS_FS_RUN_SANITY_TESTS
 int test_add_free_space_entry(struct btrfs_block_group *cache,
diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c
index 3d56b98e152d..95465c5a3166 100644
--- a/fs/btrfs/super.c
+++ b/fs/btrfs/super.c
@@ -547,7 +547,6 @@ int btrfs_parse_options(struct btrfs_fs_info *info, char *options,
 {
 	substring_t args[MAX_OPT_ARGS];
 	char *p, *num;
-	u64 cache_gen;
 	int intarg;
 	int ret = 0;
 	char *compress_type;
@@ -557,10 +556,9 @@ int btrfs_parse_options(struct btrfs_fs_info *info, char *options,
 	bool saved_compress_force;
 	int no_compress = 0;
 
-	cache_gen = btrfs_super_cache_generation(info->super_copy);
 	if (btrfs_fs_compat_ro(info, FREE_SPACE_TREE))
 		btrfs_set_opt(info->mount_opt, FREE_SPACE_TREE);
-	else if (cache_gen)
+	else if (btrfs_free_space_cache_v1_active(info))
 		btrfs_set_opt(info->mount_opt, SPACE_CACHE);
 
 	/*
@@ -1859,6 +1857,8 @@ static inline void btrfs_remount_begin(struct btrfs_fs_info *fs_info,
 static inline void btrfs_remount_cleanup(struct btrfs_fs_info *fs_info,
 					 unsigned long old_opts)
 {
+	bool cache_opt = btrfs_test_opt(fs_info, SPACE_CACHE);
+
 	/*
 	 * We need to cleanup all defragable inodes if the autodefragment is
 	 * close or the filesystem is read only.
@@ -1875,6 +1875,10 @@ static inline void btrfs_remount_cleanup(struct btrfs_fs_info *fs_info,
 	else if (btrfs_raw_test_opt(old_opts, DISCARD_ASYNC) &&
 		 !btrfs_test_opt(fs_info, DISCARD_ASYNC))
 		btrfs_discard_cleanup(fs_info);
+
+	/* If we toggled space cache */
+	if (cache_opt != btrfs_free_space_cache_v1_active(fs_info))
+		btrfs_set_free_space_cache_v1_active(fs_info, cache_opt);
 }
 
 static int btrfs_remount(struct super_block *sb, int *flags, char *data)
diff --git a/fs/btrfs/transaction.c b/fs/btrfs/transaction.c
index 8f70d7135497..0e85482b6713 100644
--- a/fs/btrfs/transaction.c
+++ b/fs/btrfs/transaction.c
@@ -1761,6 +1761,8 @@ static void update_super_roots(struct btrfs_fs_info *fs_info)
 	super->root_level = root_item->level;
 	if (btrfs_test_opt(fs_info, SPACE_CACHE))
 		super->cache_generation = root_item->generation;
+	else if (test_bit(BTRFS_FS_CLEANUP_SPACE_CACHE_V1, &fs_info->flags))
+		super->cache_generation = 0;
 	if (test_bit(BTRFS_FS_UPDATE_UUID_TREE_GEN, &fs_info->flags))
 		super->uuid_tree_generation = root_item->generation;
 }
-- 
2.24.1


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH v6 07/10] btrfs: use sb state to print space_cache mount option
  2020-10-29 23:57 [PATCH v6 00/10] btrfs: free space tree mounting fixes Boris Burkov
                   ` (5 preceding siblings ...)
  2020-10-29 23:57 ` [PATCH v6 06/10] btrfs: keep sb cache_generation consistent with space_cache Boris Burkov
@ 2020-10-29 23:57 ` Boris Burkov
  2020-10-29 23:57 ` [PATCH v6 08/10] btrfs: warn when remount will not change the free space tree Boris Burkov
                   ` (3 subsequent siblings)
  10 siblings, 0 replies; 12+ messages in thread
From: Boris Burkov @ 2020-10-29 23:57 UTC (permalink / raw)
  To: linux-btrfs, kernel-team

To make the contents of /proc/mounts better match the actual state of
the file system, base the display of the space cache mount options off
the contents of the super block rather than the last mount options
passed in. Since there are many scenarios where the mount will ignore a
space cache option, simply showing the passed in option is misleading.

For example, if we mount with -o remount,space_cache=v2 on a read-write
file system without an existing free space tree, we won't build a free
space tree, but /proc/mounts will read space_cache=v2 (until we mount
again and it goes away)

cache_generation is set iff space_cache=v1, FREE_SPACE_TREE is set iff
space_cache=v2, and if neither is the case, we print nospace_cache.

Signed-off-by: Boris Burkov <boris@bur.io>
---
 fs/btrfs/super.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c
index 95465c5a3166..527ab305ac68 100644
--- a/fs/btrfs/super.c
+++ b/fs/btrfs/super.c
@@ -1476,9 +1476,9 @@ static int btrfs_show_options(struct seq_file *seq, struct dentry *dentry)
 		seq_puts(seq, ",discard=async");
 	if (!(info->sb->s_flags & SB_POSIXACL))
 		seq_puts(seq, ",noacl");
-	if (btrfs_test_opt(info, SPACE_CACHE))
+	if (btrfs_free_space_cache_v1_active(info))
 		seq_puts(seq, ",space_cache");
-	else if (btrfs_test_opt(info, FREE_SPACE_TREE))
+	else if (btrfs_fs_compat_ro(info, FREE_SPACE_TREE))
 		seq_puts(seq, ",space_cache=v2");
 	else
 		seq_puts(seq, ",nospace_cache");
-- 
2.24.1


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH v6 08/10] btrfs: warn when remount will not change the free space tree
  2020-10-29 23:57 [PATCH v6 00/10] btrfs: free space tree mounting fixes Boris Burkov
                   ` (6 preceding siblings ...)
  2020-10-29 23:57 ` [PATCH v6 07/10] btrfs: use sb state to print space_cache mount option Boris Burkov
@ 2020-10-29 23:57 ` Boris Burkov
  2020-10-29 23:57 ` [PATCH v6 09/10] btrfs: remove free space items when disabling space cache v1 Boris Burkov
                   ` (2 subsequent siblings)
  10 siblings, 0 replies; 12+ messages in thread
From: Boris Burkov @ 2020-10-29 23:57 UTC (permalink / raw)
  To: linux-btrfs, kernel-team

If the remount is ro->ro, rw->ro, or rw->rw, we will not create or
clear the free space tree. This can be surprising, so print a warning
to dmesg to make the failure more visible. It is also important to
ensure that the space cache options (SPACE_CACHE, FREE_SPACE_TREE) are
consistent, so ensure those are set to properly match the current on
disk state (which won't be changing).

Signed-off-by: Boris Burkov <boris@bur.io>
---
 fs/btrfs/super.c | 18 ++++++++++++++++++
 1 file changed, 18 insertions(+)

diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c
index 527ab305ac68..ed5c80f92f78 100644
--- a/fs/btrfs/super.c
+++ b/fs/btrfs/super.c
@@ -1914,6 +1914,24 @@ static int btrfs_remount(struct super_block *sb, int *flags, char *data)
 	btrfs_resize_thread_pool(fs_info,
 		fs_info->thread_pool_size, old_thread_pool_size);
 
+	if (btrfs_test_opt(fs_info, FREE_SPACE_TREE) !=
+	    btrfs_fs_compat_ro(fs_info, FREE_SPACE_TREE) &&
+	    ((!sb_rdonly(sb) || *flags & SB_RDONLY))) {
+		btrfs_warn(fs_info,
+	   "remount supports changing free space tree only from ro to rw");
+		/*
+		 * Make sure free space cache options match the state on disk
+		 */
+		if (btrfs_fs_compat_ro(fs_info, FREE_SPACE_TREE)) {
+			btrfs_set_opt(fs_info->mount_opt, FREE_SPACE_TREE);
+			btrfs_clear_opt(fs_info->mount_opt, SPACE_CACHE);
+		}
+		if (btrfs_free_space_cache_v1_active(fs_info)) {
+			btrfs_clear_opt(fs_info->mount_opt, FREE_SPACE_TREE);
+			btrfs_set_opt(fs_info->mount_opt, SPACE_CACHE);
+		}
+	}
+
 	if ((bool)(*flags & SB_RDONLY) == sb_rdonly(sb))
 		goto out;
 
-- 
2.24.1


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH v6 09/10] btrfs: remove free space items when disabling space cache v1
  2020-10-29 23:57 [PATCH v6 00/10] btrfs: free space tree mounting fixes Boris Burkov
                   ` (7 preceding siblings ...)
  2020-10-29 23:57 ` [PATCH v6 08/10] btrfs: warn when remount will not change the free space tree Boris Burkov
@ 2020-10-29 23:57 ` Boris Burkov
  2020-10-29 23:57 ` [PATCH v6 10/10] btrfs: skip space_cache v1 setup when not using it Boris Burkov
  2020-11-06 16:39 ` [PATCH v6 00/10] btrfs: free space tree mounting fixes Josef Bacik
  10 siblings, 0 replies; 12+ messages in thread
From: Boris Burkov @ 2020-10-29 23:57 UTC (permalink / raw)
  To: linux-btrfs, kernel-team

When the file system transitions from space cache v1 to v2 or to
nospace_cache, it removes the old cached data, but does not remove
the FREE_SPACE items nor the free space inodes they point to. This
doesn't cause any issues besides being a bit inefficient, since these
items no longer do anything useful.

To fix it, when we are mounting, and plan to disable the space cache,
destroy each block group's free space item and free space inode.
The code to remove the items is lifted from the existing use case of
removing the block group, with a light adaptation to handle whether or
not we have already looked up the free space inode.

References: https://github.com/btrfs/btrfs-todo/issues/5
Signed-off-by: Boris Burkov <boris@bur.io>
---
 fs/btrfs/block-group.c      |  39 +-------------
 fs/btrfs/free-space-cache.c | 101 ++++++++++++++++++++++++++++++++++--
 fs/btrfs/free-space-cache.h |   3 ++
 3 files changed, 102 insertions(+), 41 deletions(-)

diff --git a/fs/btrfs/block-group.c b/fs/btrfs/block-group.c
index bb6685711824..2dde5c6f1da8 100644
--- a/fs/btrfs/block-group.c
+++ b/fs/btrfs/block-group.c
@@ -892,8 +892,6 @@ int btrfs_remove_block_group(struct btrfs_trans_handle *trans,
 	struct btrfs_path *path;
 	struct btrfs_block_group *block_group;
 	struct btrfs_free_cluster *cluster;
-	struct btrfs_root *tree_root = fs_info->tree_root;
-	struct btrfs_key key;
 	struct inode *inode;
 	struct kobject *kobj = NULL;
 	int ret;
@@ -971,42 +969,9 @@ int btrfs_remove_block_group(struct btrfs_trans_handle *trans,
 	spin_unlock(&trans->transaction->dirty_bgs_lock);
 	mutex_unlock(&trans->transaction->cache_write_mutex);
 
-	if (!IS_ERR(inode)) {
-		ret = btrfs_orphan_add(trans, BTRFS_I(inode));
-		if (ret) {
-			btrfs_add_delayed_iput(inode);
-			goto out;
-		}
-		clear_nlink(inode);
-		/* One for the block groups ref */
-		spin_lock(&block_group->lock);
-		if (block_group->iref) {
-			block_group->iref = 0;
-			block_group->inode = NULL;
-			spin_unlock(&block_group->lock);
-			iput(inode);
-		} else {
-			spin_unlock(&block_group->lock);
-		}
-		/* One for our lookup ref */
-		btrfs_add_delayed_iput(inode);
-	}
-
-	key.objectid = BTRFS_FREE_SPACE_OBJECTID;
-	key.type = 0;
-	key.offset = block_group->start;
-
-	ret = btrfs_search_slot(trans, tree_root, &key, path, -1, 1);
-	if (ret < 0)
+	ret = btrfs_remove_free_space_inode(trans, inode, block_group);
+	if (ret)
 		goto out;
-	if (ret > 0)
-		btrfs_release_path(path);
-	if (ret == 0) {
-		ret = btrfs_del_item(trans, tree_root, path);
-		if (ret)
-			goto out;
-		btrfs_release_path(path);
-	}
 
 	spin_lock(&fs_info->block_group_cache_lock);
 	rb_erase(&block_group->cache_node,
diff --git a/fs/btrfs/free-space-cache.c b/fs/btrfs/free-space-cache.c
index 5185e798cc57..660cc722ef3f 100644
--- a/fs/btrfs/free-space-cache.c
+++ b/fs/btrfs/free-space-cache.c
@@ -207,6 +207,65 @@ int create_free_space_inode(struct btrfs_trans_handle *trans,
 					 ino, block_group->start);
 }
 
+/*
+ * inode is an optional sink: if it is NULL, btrfs_remove_free_space_inode
+ * handles lookup, otherwise it takes ownership and iputs the inode.
+ * Don't reuse an inode pointer after passing it into this function.
+ */
+int btrfs_remove_free_space_inode(struct btrfs_trans_handle *trans,
+				  struct inode *inode,
+				  struct btrfs_block_group *block_group)
+{
+	struct btrfs_path *path;
+	struct btrfs_key key;
+	int ret = 0;
+
+	path = btrfs_alloc_path();
+	if (!path)
+		return -ENOMEM;
+
+	if (!inode)
+		inode = lookup_free_space_inode(block_group, path);
+	if (IS_ERR(inode)) {
+		if (PTR_ERR(inode) != -ENOENT)
+			ret = PTR_ERR(inode);
+		goto out;
+	}
+	ret = btrfs_orphan_add(trans, BTRFS_I(inode));
+	if (ret) {
+		btrfs_add_delayed_iput(inode);
+		goto out;
+	}
+	clear_nlink(inode);
+	/* One for the block groups ref */
+	spin_lock(&block_group->lock);
+	if (block_group->iref) {
+		block_group->iref = 0;
+		block_group->inode = NULL;
+		spin_unlock(&block_group->lock);
+		iput(inode);
+	} else {
+		spin_unlock(&block_group->lock);
+	}
+	/* One for the lookup ref */
+	btrfs_add_delayed_iput(inode);
+
+	key.objectid = BTRFS_FREE_SPACE_OBJECTID;
+	key.type = 0;
+	key.offset = block_group->start;
+	ret = btrfs_search_slot(trans, trans->fs_info->tree_root, &key, path,
+				-1, 1);
+	if (ret) {
+		if (ret > 0)
+			ret = 0;
+		goto out;
+	}
+	ret = btrfs_del_item(trans, trans->fs_info->tree_root, path);
+out:
+	btrfs_free_path(path);
+	return ret;
+}
+
 int btrfs_check_trunc_cache_free_space(struct btrfs_fs_info *fs_info,
 				       struct btrfs_block_rsv *rsv)
 {
@@ -3997,6 +4056,28 @@ bool btrfs_free_space_cache_v1_active(struct btrfs_fs_info *fs_info)
 	return btrfs_super_cache_generation(fs_info->super_copy);
 }
 
+static int cleanup_free_space_cache_v1(struct btrfs_fs_info *fs_info,
+				       struct btrfs_trans_handle *trans)
+{
+	struct btrfs_block_group *block_group;
+	struct rb_node *node;
+	int ret;
+
+	btrfs_info(fs_info, "cleaning free space cache v1");
+
+	node = rb_first(&fs_info->block_group_cache_tree);
+	while (node) {
+		block_group = rb_entry(node, struct btrfs_block_group,
+				       cache_node);
+		ret = btrfs_remove_free_space_inode(trans, NULL, block_group);
+		if (ret)
+			goto out;
+		node = rb_next(node);
+	}
+out:
+	return ret;
+}
+
 int btrfs_set_free_space_cache_v1_active(struct btrfs_fs_info *fs_info,
 					 bool active)
 {
@@ -4004,18 +4085,30 @@ int btrfs_set_free_space_cache_v1_active(struct btrfs_fs_info *fs_info,
 	int ret;
 
 	/*
-	 * update_super_roots will appropriately set
-	 * fs_info->super_copy->cache_generation based on the SPACE_CACHE
-	 * option, so all we have to do is trigger a transaction commit.
+	 * update_super_roots will appropriately set or unset
+	 * fs_info->super_copy->cache_generation based on SPACE_CACHE and
+	 * BTRFS_FS_CLEANUP_SPACE_CACHE_V1. For this reason, we need a
+	 * transaction commit whether we are enabling space cache v1 and don't
+	 * have any other work to do, or are disabling it and removing free
+	 * space inodes.
 	 */
 	trans = btrfs_start_transaction(fs_info->tree_root, 0);
 	if (IS_ERR(trans))
 		return PTR_ERR(trans);
 
-	if (!active)
+	if (!active) {
 		set_bit(BTRFS_FS_CLEANUP_SPACE_CACHE_V1, &fs_info->flags);
+		ret = cleanup_free_space_cache_v1(fs_info, trans);
+		if (ret)
+			goto abort;
+	}
 
 	ret = btrfs_commit_transaction(trans);
+	goto out;
+abort:
+	btrfs_abort_transaction(trans, ret);
+	btrfs_end_transaction(trans);
+out:
 	clear_bit(BTRFS_FS_CLEANUP_SPACE_CACHE_V1, &fs_info->flags);
 	return ret;
 }
diff --git a/fs/btrfs/free-space-cache.h b/fs/btrfs/free-space-cache.h
index 5c546898ded9..8df4b2925eca 100644
--- a/fs/btrfs/free-space-cache.h
+++ b/fs/btrfs/free-space-cache.h
@@ -84,6 +84,9 @@ struct inode *lookup_free_space_inode(struct btrfs_block_group *block_group,
 int create_free_space_inode(struct btrfs_trans_handle *trans,
 			    struct btrfs_block_group *block_group,
 			    struct btrfs_path *path);
+int btrfs_remove_free_space_inode(struct btrfs_trans_handle *trans,
+				  struct inode *inode,
+				  struct btrfs_block_group *block_group);
 
 int btrfs_check_trunc_cache_free_space(struct btrfs_fs_info *fs_info,
 				       struct btrfs_block_rsv *rsv);
-- 
2.24.1


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH v6 10/10] btrfs: skip space_cache v1 setup when not using it
  2020-10-29 23:57 [PATCH v6 00/10] btrfs: free space tree mounting fixes Boris Burkov
                   ` (8 preceding siblings ...)
  2020-10-29 23:57 ` [PATCH v6 09/10] btrfs: remove free space items when disabling space cache v1 Boris Burkov
@ 2020-10-29 23:57 ` Boris Burkov
  2020-11-06 16:39 ` [PATCH v6 00/10] btrfs: free space tree mounting fixes Josef Bacik
  10 siblings, 0 replies; 12+ messages in thread
From: Boris Burkov @ 2020-10-29 23:57 UTC (permalink / raw)
  To: linux-btrfs, kernel-team

If we are not using space cache v1, we should not create the free space
object or free space inodes. This comes up when we delete the existing
free space objects/inodes when migrating to v2, only to see them get
recreated for every dirtied block group.

Signed-off-by: Boris Burkov <boris@bur.io>
---
 fs/btrfs/block-group.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/fs/btrfs/block-group.c b/fs/btrfs/block-group.c
index 2dde5c6f1da8..13a16158f4c7 100644
--- a/fs/btrfs/block-group.c
+++ b/fs/btrfs/block-group.c
@@ -2374,6 +2374,9 @@ static int cache_save_setup(struct btrfs_block_group *block_group,
 	int retries = 0;
 	int ret = 0;
 
+	if (!btrfs_test_opt(fs_info, SPACE_CACHE))
+		return 0;
+
 	/*
 	 * If this block group is smaller than 100 megs don't bother caching the
 	 * block group.
-- 
2.24.1


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [PATCH v6 00/10] btrfs: free space tree mounting fixes
  2020-10-29 23:57 [PATCH v6 00/10] btrfs: free space tree mounting fixes Boris Burkov
                   ` (9 preceding siblings ...)
  2020-10-29 23:57 ` [PATCH v6 10/10] btrfs: skip space_cache v1 setup when not using it Boris Burkov
@ 2020-11-06 16:39 ` Josef Bacik
  10 siblings, 0 replies; 12+ messages in thread
From: Josef Bacik @ 2020-11-06 16:39 UTC (permalink / raw)
  To: Boris Burkov, linux-btrfs, kernel-team

On 10/29/20 7:57 PM, Boris Burkov wrote:
> This patch set cleans up issues surrounding enabling and disabling various
> free space cache features and their reporting in /proc/mounts.  Because the
> improvements became somewhat complex, the series starts by lifting rw mount
> logic into a single place.
> 
> The first patch is a setup patch that unifies very similar logic between a
> normal rw mount and a ro->rw remount. This is a useful setup step for adding
> more functionality to ro->rw remounts.
> 
> The second patch fixes the omission of orphan inode cleanup on a few trees
> during ro->rw remount.
> 
> The third patch adds enabling the free space tree to ro->rw remount.
> 
> The fourth patch adds a method for clearing oneshot mount options after mount.
> 
> The fifth patch adds support for clearing the free space tree on ro->rw remount.
> 
> The sixth patch sets up for more accurate /proc/mounts by ensuring that
> cache_generation > 0 iff space_cache is enabled.
> 
> The seventh patch is the more accurate /proc/mounts logic.
> 
> The eighth patch is a convenience kernel message that complains when we skip
> changing the free space tree on remount.
> 
> The ninth patch removes the space cache v1 free space item and free space
> inodes when space cache v1 is disabled (nospace_cache or space_cache=v2).
> 
> The tenth patch stops re-creating the free space objects when we are not
> using space_cache=v1
> 

Looks good, you can add

Reviewed-by: Josef Bacik <josef@toxicpanda.com>

To the series, thanks,

Josef

^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2020-11-06 16:39 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-10-29 23:57 [PATCH v6 00/10] btrfs: free space tree mounting fixes Boris Burkov
2020-10-29 23:57 ` [PATCH v6 01/10] btrfs: lift rw mount setup from mount and remount Boris Burkov
2020-10-29 23:57 ` [PATCH v6 02/10] btrfs: cleanup all orphan inodes on ro->rw remount Boris Burkov
2020-10-29 23:57 ` [PATCH v6 03/10] btrfs: create free space tree " Boris Burkov
2020-10-29 23:57 ` [PATCH v6 04/10] btrfs: clear oneshot options on mount and remount Boris Burkov
2020-10-29 23:57 ` [PATCH v6 05/10] btrfs: clear free space tree on ro->rw remount Boris Burkov
2020-10-29 23:57 ` [PATCH v6 06/10] btrfs: keep sb cache_generation consistent with space_cache Boris Burkov
2020-10-29 23:57 ` [PATCH v6 07/10] btrfs: use sb state to print space_cache mount option Boris Burkov
2020-10-29 23:57 ` [PATCH v6 08/10] btrfs: warn when remount will not change the free space tree Boris Burkov
2020-10-29 23:57 ` [PATCH v6 09/10] btrfs: remove free space items when disabling space cache v1 Boris Burkov
2020-10-29 23:57 ` [PATCH v6 10/10] btrfs: skip space_cache v1 setup when not using it Boris Burkov
2020-11-06 16:39 ` [PATCH v6 00/10] btrfs: free space tree mounting fixes Josef Bacik

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).