* [PATCH 0/5] btrfs: qgroup: address the performance penalty for subvolume dropping
@ 2021-08-31  9:48 Qu Wenruo
  2021-08-31  9:48 ` [PATCH 1/5] btrfs: sysfs: introduce qgroup global attribute groups Qu Wenruo
                   ` (5 more replies)
  0 siblings, 6 replies; 11+ messages in thread
From: Qu Wenruo @ 2021-08-31  9:48 UTC (permalink / raw)
  To: linux-btrfs

Btrfs qgroup has a long history of bringing a huge performance penalty,
from subvolume dropping to balance.

Although we have solved the problem for balance, the subvolume dropping
problem is still unresolved, as we really need to do the costly backref
walk for all the involved subtrees, or qgroup numbers will become
inconsistent.

But the performance penalty is sometimes so big that it's better to
just disable qgroup, do the drop, then do a rescan.

This patchset addresses the problem by introducing a user-configurable
sysfs interface, allowing drops of sufficiently high subtrees to mark
qgroup inconsistent and skip the whole accounting.

The following things are needed for this objective:

- New qgroups attributes

  Instead of plain qgroup kobjects, we need extra attributes like
  drop_subtree_threshold.

  This patchset will introduce two new attributes to the existing
  qgroups kobject:
  * qgroup_flags
    To indicate the qgroup status flags like ON, RESCAN, INCONSISTENT.

  * drop_subtree_threshold
    To show the subtree dropping level threshold.
    The default value is BTRFS_MAX_LEVEL (8), which means every subtree
    drop will go through qgroup accounting; while costly, this keeps
    qgroup numbers as consistent as possible.

    Users can specify values like 3, meaning dropping any subtree at
    level 3 or higher will mark qgroup inconsistent and skip all the
    costly accounting.

    This only affects subvolume dropping.

- Skip qgroup accounting when the numbers are already inconsistent

  But this still keeps the qgroup relationships correct, so users can
  keep their qgroup organization and do a rescan later.


This sysfs interface needs user space tools to monitor and set the
values for each btrfs filesystem.

Currently the target user space tool is snapper, which by default
utilizes qgroups for its space-aware snapshots reclaim mechanism.

Qu Wenruo (5):
  btrfs: sysfs: introduce qgroup global attribute groups
  btrfs: introduce BTRFS_QGROUP_STATUS_FLAGS_MASK for later expansion
  btrfs: introduce BTRFS_QGROUP_RUNTIME_FLAG_CANCEL_RESCAN
  btrfs: introduce BTRFS_QGROUP_RUNTIME_FLAG_NO_ACCOUNTING to skip
    qgroup accounting
  btrfs: skip subtree scan if it's too high to avoid low stall in
    btrfs_commit_transaction()

 fs/btrfs/ctree.h                |   1 +
 fs/btrfs/disk-io.c              |   1 +
 fs/btrfs/qgroup.c               |  87 +++++++++++++++++++-------
 fs/btrfs/qgroup.h               |   3 +
 fs/btrfs/sysfs.c                | 106 ++++++++++++++++++++++++++++++--
 include/uapi/linux/btrfs_tree.h |   4 ++
 6 files changed, 176 insertions(+), 26 deletions(-)

-- 
2.33.0


^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCH 1/5] btrfs: sysfs: introduce qgroup global attribute groups
  2021-08-31  9:48 [PATCH 0/5] btrfs: qgroup: address the performance penalty for subvolume dropping Qu Wenruo
@ 2021-08-31  9:48 ` Qu Wenruo
  2021-09-02 16:28   ` David Sterba
  2021-08-31  9:49 ` [PATCH 2/5] btrfs: introduce BTRFS_QGROUP_STATUS_FLAGS_MASK for later expansion Qu Wenruo
                   ` (4 subsequent siblings)
  5 siblings, 1 reply; 11+ messages in thread
From: Qu Wenruo @ 2021-08-31  9:48 UTC (permalink / raw)
  To: linux-btrfs

Although we already have an info kobject for each qgroup, we don't have
global qgroup info attributes to show things like the qgroup status flags.

Add a qgroups attribute group, whose first member is qgroup_flags, a
read-only attribute showing the qgroup flags in human readable form.

The path is:
 /sys/fs/btrfs/<uuid>/qgroups/qgroup_flags

The output would look like:
 ON | INCONSISTENT

Signed-off-by: Qu Wenruo <wqu@suse.com>
---
 fs/btrfs/sysfs.c | 67 +++++++++++++++++++++++++++++++++++++++++++++---
 1 file changed, 63 insertions(+), 4 deletions(-)

diff --git a/fs/btrfs/sysfs.c b/fs/btrfs/sysfs.c
index 25a6f587852b..72edc6011d01 100644
--- a/fs/btrfs/sysfs.c
+++ b/fs/btrfs/sysfs.c
@@ -1825,6 +1825,62 @@ int btrfs_sysfs_add_mounted(struct btrfs_fs_info *fs_info)
 	return error;
 }
 
+static ssize_t qgroup_flags_show(struct kobject *qgroups_kobj,
+				 struct kobj_attribute *a,
+				 char *buf)
+{
+	struct btrfs_fs_info *fs_info = to_fs_info(qgroups_kobj->parent);
+	u64 qgroup_flags;
+	bool first = true;
+	int ret = 0;
+
+	spin_lock(&fs_info->qgroup_lock);
+	qgroup_flags = fs_info->qgroup_flags & BTRFS_QGROUP_STATUS_FLAGS_MASK;
+	spin_unlock(&fs_info->qgroup_lock);
+
+	if (qgroup_flags & BTRFS_QGROUP_STATUS_FLAG_ON) {
+		ret += scnprintf(buf + ret, PAGE_SIZE - ret, "%s%s",
+				(first ? "" : " | "), "ON");
+		first = false;
+	}
+	if (qgroup_flags & BTRFS_QGROUP_STATUS_FLAG_RESCAN) {
+		ret += scnprintf(buf + ret, PAGE_SIZE - ret, "%s%s",
+				(first ? "" : " | "), "RESCAN");
+		first = false;
+	}
+	if (qgroup_flags & BTRFS_QGROUP_STATUS_FLAG_INCONSISTENT) {
+		ret += scnprintf(buf + ret, PAGE_SIZE - ret, "%s%s",
+				(first ? "" : " | "), "INCONSISTENT");
+		first = false;
+	}
+	ret += scnprintf(buf + ret, PAGE_SIZE - ret, "\n");
+	return ret;
+}
+
+BTRFS_ATTR(qgroups, qgroup_flags, qgroup_flags_show);
+
+/*
+ * Qgroups global info
+ *
+ * Path: /sys/fs/btrfs/<uuid>/qgroups/
+ */
+static struct attribute *qgroups_attrs[] = {
+	BTRFS_ATTR_PTR(qgroups, qgroup_flags),
+	NULL
+};
+ATTRIBUTE_GROUPS(qgroups);
+
+static void qgroups_release(struct kobject *kobj)
+{
+	kfree(kobj);
+}
+
+static struct kobj_type qgroups_ktype = {
+	.sysfs_ops = &kobj_sysfs_ops,
+	.default_groups = qgroups_groups,
+	.release = qgroups_release,
+};
+
 static inline struct btrfs_fs_info *qgroup_kobj_to_fs_info(struct kobject *kobj)
 {
 	return to_fs_info(kobj->parent->parent);
@@ -1950,11 +2006,14 @@ int btrfs_sysfs_add_qgroups(struct btrfs_fs_info *fs_info)
 	if (fs_info->qgroups_kobj)
 		return 0;
 
-	fs_info->qgroups_kobj = kobject_create_and_add("qgroups", fsid_kobj);
-	if (!fs_info->qgroups_kobj) {
-		ret = -ENOMEM;
+	fs_info->qgroups_kobj = kmalloc(sizeof(struct kobject), GFP_KERNEL);
+	if (!fs_info->qgroups_kobj)
+		return -ENOMEM;
+
+	ret = kobject_init_and_add(fs_info->qgroups_kobj, &qgroups_ktype,
+				   fsid_kobj, "qgroups");
+	if (ret < 0)
 		goto out;
-	}
 	rbtree_postorder_for_each_entry_safe(qgroup, next,
 					     &fs_info->qgroup_tree, node) {
 		ret = btrfs_sysfs_add_one_qgroup(fs_info, qgroup);
-- 
2.33.0


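For reference, the output format produced by qgroup_flags_show() above
can be modeled in user space roughly as follows (illustrative Python,
not part of the patch; the bit values match btrfs_tree.h):

```python
# Model of qgroup_flags_show(): print the names of the set status flags,
# joined by " | " (e.g. "ON | INCONSISTENT").
BTRFS_QGROUP_STATUS_FLAG_ON = 1 << 0
BTRFS_QGROUP_STATUS_FLAG_RESCAN = 1 << 1
BTRFS_QGROUP_STATUS_FLAG_INCONSISTENT = 1 << 2

def format_qgroup_flags(flags):
    names = [
        (BTRFS_QGROUP_STATUS_FLAG_ON, "ON"),
        (BTRFS_QGROUP_STATUS_FLAG_RESCAN, "RESCAN"),
        (BTRFS_QGROUP_STATUS_FLAG_INCONSISTENT, "INCONSISTENT"),
    ]
    return " | ".join(name for bit, name in names if flags & bit)
```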

* [PATCH 2/5] btrfs: introduce BTRFS_QGROUP_STATUS_FLAGS_MASK for later expansion
  2021-08-31  9:48 [PATCH 0/5] btrfs: qgroup: address the performance penalty for subvolume dropping Qu Wenruo
  2021-08-31  9:48 ` [PATCH 1/5] btrfs: sysfs: introduce qgroup global attribute groups Qu Wenruo
@ 2021-08-31  9:49 ` Qu Wenruo
  2021-08-31  9:49 ` [PATCH 3/5] btrfs: introduce BTRFS_QGROUP_RUNTIME_FLAG_CANCEL_RESCAN Qu Wenruo
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 11+ messages in thread
From: Qu Wenruo @ 2021-08-31  9:49 UTC (permalink / raw)
  To: linux-btrfs

Currently we only have 3 qgroup flags:
- BTRFS_QGROUP_STATUS_FLAG_ON
- BTRFS_QGROUP_STATUS_FLAG_RESCAN
- BTRFS_QGROUP_STATUS_FLAG_INCONSISTENT

These flags match the on-disk flags used in the qgroup status item.

But we're going to introduce extra runtime flags which will never be
written to disk.

So introduce a new mask, BTRFS_QGROUP_STATUS_FLAGS_MASK, to make sure
only the on-disk flags can reach disk.

Signed-off-by: Qu Wenruo <wqu@suse.com>
---
 fs/btrfs/qgroup.c               | 6 ++++--
 include/uapi/linux/btrfs_tree.h | 4 ++++
 2 files changed, 8 insertions(+), 2 deletions(-)

diff --git a/fs/btrfs/qgroup.c b/fs/btrfs/qgroup.c
index db680f5be745..9babf925bd59 100644
--- a/fs/btrfs/qgroup.c
+++ b/fs/btrfs/qgroup.c
@@ -867,7 +867,8 @@ static int update_qgroup_status_item(struct btrfs_trans_handle *trans)
 	l = path->nodes[0];
 	slot = path->slots[0];
 	ptr = btrfs_item_ptr(l, slot, struct btrfs_qgroup_status_item);
-	btrfs_set_qgroup_status_flags(l, ptr, fs_info->qgroup_flags);
+	btrfs_set_qgroup_status_flags(l, ptr, fs_info->qgroup_flags &
+				      BTRFS_QGROUP_STATUS_FLAGS_MASK);
 	btrfs_set_qgroup_status_generation(l, ptr, trans->transid);
 	btrfs_set_qgroup_status_rescan(l, ptr,
 				fs_info->qgroup_rescan_progress.objectid);
@@ -1027,7 +1028,8 @@ int btrfs_quota_enable(struct btrfs_fs_info *fs_info)
 	btrfs_set_qgroup_status_version(leaf, ptr, BTRFS_QGROUP_STATUS_VERSION);
 	fs_info->qgroup_flags = BTRFS_QGROUP_STATUS_FLAG_ON |
 				BTRFS_QGROUP_STATUS_FLAG_INCONSISTENT;
-	btrfs_set_qgroup_status_flags(leaf, ptr, fs_info->qgroup_flags);
+	btrfs_set_qgroup_status_flags(leaf, ptr, fs_info->qgroup_flags &
+				      BTRFS_QGROUP_STATUS_FLAGS_MASK);
 	btrfs_set_qgroup_status_rescan(leaf, ptr, 0);
 
 	btrfs_mark_buffer_dirty(leaf);
diff --git a/include/uapi/linux/btrfs_tree.h b/include/uapi/linux/btrfs_tree.h
index e1c4c732aaba..86dbd13a88f3 100644
--- a/include/uapi/linux/btrfs_tree.h
+++ b/include/uapi/linux/btrfs_tree.h
@@ -973,6 +973,10 @@ static inline __u16 btrfs_qgroup_level(__u64 qgroupid)
  */
 #define BTRFS_QGROUP_STATUS_FLAG_INCONSISTENT	(1ULL << 2)
 
+#define BTRFS_QGROUP_STATUS_FLAGS_MASK	(BTRFS_QGROUP_STATUS_FLAG_ON |\
+					 BTRFS_QGROUP_STATUS_FLAG_RESCAN |\
+					 BTRFS_QGROUP_STATUS_FLAG_INCONSISTENT)
+
 #define BTRFS_QGROUP_STATUS_VERSION        1
 
 struct btrfs_qgroup_status_item {
-- 
2.33.0


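The effect of the new mask can be illustrated with a short sketch
(Python, not kernel code; RUNTIME_FLAG stands for any future
runtime-only bit):

```python
# Only bits inside BTRFS_QGROUP_STATUS_FLAGS_MASK survive a write of the
# qgroup status item, so runtime-only flags are filtered out automatically.
FLAG_ON = 1 << 0
FLAG_RESCAN = 1 << 1
FLAG_INCONSISTENT = 1 << 2
STATUS_FLAGS_MASK = FLAG_ON | FLAG_RESCAN | FLAG_INCONSISTENT
RUNTIME_FLAG = 1 << 3  # hypothetical runtime-only bit

def flags_written_to_disk(in_memory_flags):
    return in_memory_flags & STATUS_FLAGS_MASK
```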

* [PATCH 3/5] btrfs: introduce BTRFS_QGROUP_RUNTIME_FLAG_CANCEL_RESCAN
  2021-08-31  9:48 [PATCH 0/5] btrfs: qgroup: address the performance penalty for subvolume dropping Qu Wenruo
  2021-08-31  9:48 ` [PATCH 1/5] btrfs: sysfs: introduce qgroup global attribute groups Qu Wenruo
  2021-08-31  9:49 ` [PATCH 2/5] btrfs: introduce BTRFS_QGROUP_STATUS_FLAGS_MASK for later expansion Qu Wenruo
@ 2021-08-31  9:49 ` Qu Wenruo
  2021-08-31  9:49 ` [PATCH 4/5] btrfs: introduce BTRFS_QGROUP_RUNTIME_FLAG_NO_ACCOUNTING to skip qgroup accounting Qu Wenruo
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 11+ messages in thread
From: Qu Wenruo @ 2021-08-31  9:49 UTC (permalink / raw)
  To: linux-btrfs

Here we introduce a new runtime flag,
BTRFS_QGROUP_RUNTIME_FLAG_CANCEL_RESCAN, which informs a running qgroup
rescan to cancel its work asynchronously.

This is to address the window where an operation makes qgroup numbers
inconsistent (like qgroup inheriting) while a qgroup rescan is running.

In that case, the qgroup inconsistent flag would be cleared when the
qgroup rescan finishes.
But we have changed the ownership of some extents, which means the
rescan result is already meaningless, and the qgroup inconsistent flag
should not be cleared.

With the new flag, each time we set the INCONSISTENT flag we also set
this new flag, to inform any running qgroup rescan to exit immediately
and leave the INCONSISTENT flag in place.

The new runtime flag can only be cleared when a new rescan is started.

Signed-off-by: Qu Wenruo <wqu@suse.com>
---
 fs/btrfs/qgroup.c | 49 ++++++++++++++++++++++++++++++-----------------
 fs/btrfs/qgroup.h |  2 ++
 2 files changed, 33 insertions(+), 18 deletions(-)

diff --git a/fs/btrfs/qgroup.c b/fs/btrfs/qgroup.c
index 9babf925bd59..91657ba1ebd1 100644
--- a/fs/btrfs/qgroup.c
+++ b/fs/btrfs/qgroup.c
@@ -322,6 +322,14 @@ int btrfs_verify_qgroup_counts(struct btrfs_fs_info *fs_info, u64 qgroupid,
 }
 #endif
 
+static void qgroup_mark_inconsistent(struct btrfs_fs_info *fs_info)
+{
+	BUILD_BUG_ON(BTRFS_QGROUP_RUNTIME_FLAG_CANCEL_RESCAN &
+		     BTRFS_QGROUP_STATUS_FLAGS_MASK);
+	fs_info->qgroup_flags |= (BTRFS_QGROUP_STATUS_FLAG_INCONSISTENT |
+				  BTRFS_QGROUP_RUNTIME_FLAG_CANCEL_RESCAN);
+}
+
 /*
  * The full config is read in one go, only called from open_ctree()
  * It doesn't use any locking, as at this point we're still single-threaded
@@ -390,7 +398,7 @@ int btrfs_read_qgroup_config(struct btrfs_fs_info *fs_info)
 			}
 			if (btrfs_qgroup_status_generation(l, ptr) !=
 			    fs_info->generation) {
-				flags |= BTRFS_QGROUP_STATUS_FLAG_INCONSISTENT;
+				qgroup_mark_inconsistent(fs_info);
 				btrfs_err(fs_info,
 					"qgroup generation mismatch, marked as inconsistent");
 			}
@@ -408,7 +416,7 @@ int btrfs_read_qgroup_config(struct btrfs_fs_info *fs_info)
 		if ((qgroup && found_key.type == BTRFS_QGROUP_INFO_KEY) ||
 		    (!qgroup && found_key.type == BTRFS_QGROUP_LIMIT_KEY)) {
 			btrfs_err(fs_info, "inconsistent qgroup config");
-			flags |= BTRFS_QGROUP_STATUS_FLAG_INCONSISTENT;
+			qgroup_mark_inconsistent(fs_info);
 		}
 		if (!qgroup) {
 			qgroup = add_qgroup_rb(fs_info, found_key.offset);
@@ -1661,7 +1669,7 @@ int btrfs_limit_qgroup(struct btrfs_trans_handle *trans, u64 qgroupid,
 
 	ret = update_qgroup_limit_item(trans, qgroup);
 	if (ret) {
-		fs_info->qgroup_flags |= BTRFS_QGROUP_STATUS_FLAG_INCONSISTENT;
+		qgroup_mark_inconsistent(fs_info);
 		btrfs_info(fs_info, "unable to update quota limit for %llu",
 		       qgroupid);
 	}
@@ -1737,7 +1745,7 @@ int btrfs_qgroup_trace_extent_post(struct btrfs_trans_handle *trans,
 	ret = btrfs_find_all_roots(NULL, trans->fs_info, bytenr, 0, &old_root,
 				   true);
 	if (ret < 0) {
-		trans->fs_info->qgroup_flags |= BTRFS_QGROUP_STATUS_FLAG_INCONSISTENT;
+		qgroup_mark_inconsistent(trans->fs_info);
 		btrfs_warn(trans->fs_info,
 "error accounting new delayed refs extent (err code: %d), quota inconsistent",
 			ret);
@@ -2213,7 +2221,7 @@ static int qgroup_trace_subtree_swap(struct btrfs_trans_handle *trans,
 out:
 	btrfs_free_path(dst_path);
 	if (ret < 0)
-		fs_info->qgroup_flags |= BTRFS_QGROUP_STATUS_FLAG_INCONSISTENT;
+		qgroup_mark_inconsistent(fs_info);
 	return ret;
 }
 
@@ -2717,12 +2725,10 @@ int btrfs_run_qgroups(struct btrfs_trans_handle *trans)
 		spin_unlock(&fs_info->qgroup_lock);
 		ret = update_qgroup_info_item(trans, qgroup);
 		if (ret)
-			fs_info->qgroup_flags |=
-					BTRFS_QGROUP_STATUS_FLAG_INCONSISTENT;
+			qgroup_mark_inconsistent(fs_info);
 		ret = update_qgroup_limit_item(trans, qgroup);
 		if (ret)
-			fs_info->qgroup_flags |=
-					BTRFS_QGROUP_STATUS_FLAG_INCONSISTENT;
+			qgroup_mark_inconsistent(fs_info);
 		spin_lock(&fs_info->qgroup_lock);
 	}
 	if (test_bit(BTRFS_FS_QUOTA_ENABLED, &fs_info->flags))
@@ -2733,7 +2739,7 @@ int btrfs_run_qgroups(struct btrfs_trans_handle *trans)
 
 	ret = update_qgroup_status_item(trans);
 	if (ret)
-		fs_info->qgroup_flags |= BTRFS_QGROUP_STATUS_FLAG_INCONSISTENT;
+		qgroup_mark_inconsistent(fs_info);
 
 	return ret;
 }
@@ -2851,7 +2857,7 @@ int btrfs_qgroup_inherit(struct btrfs_trans_handle *trans, u64 srcid,
 
 		ret = update_qgroup_limit_item(trans, dstgroup);
 		if (ret) {
-			fs_info->qgroup_flags |= BTRFS_QGROUP_STATUS_FLAG_INCONSISTENT;
+			qgroup_mark_inconsistent(fs_info);
 			btrfs_info(fs_info,
 				   "unable to update quota limit for %llu",
 				   dstgroup->qgroupid);
@@ -2957,7 +2963,7 @@ int btrfs_qgroup_inherit(struct btrfs_trans_handle *trans, u64 srcid,
 	if (!committing)
 		mutex_unlock(&fs_info->qgroup_ioctl_lock);
 	if (need_rescan)
-		fs_info->qgroup_flags |= BTRFS_QGROUP_STATUS_FLAG_INCONSISTENT;
+		qgroup_mark_inconsistent(fs_info);
 	return ret;
 }
 
@@ -3226,7 +3232,8 @@ static int qgroup_rescan_leaf(struct btrfs_trans_handle *trans,
 static bool rescan_should_stop(struct btrfs_fs_info *fs_info)
 {
 	return btrfs_fs_closing(fs_info) ||
-		test_bit(BTRFS_FS_STATE_REMOUNTING, &fs_info->fs_state);
+		test_bit(BTRFS_FS_STATE_REMOUNTING, &fs_info->fs_state) ||
+		fs_info->qgroup_flags & BTRFS_QGROUP_RUNTIME_FLAG_CANCEL_RESCAN;
 }
 
 static void btrfs_qgroup_rescan_worker(struct btrfs_work *work)
@@ -3274,7 +3281,8 @@ static void btrfs_qgroup_rescan_worker(struct btrfs_work *work)
 	if (err > 0 &&
 	    fs_info->qgroup_flags & BTRFS_QGROUP_STATUS_FLAG_INCONSISTENT) {
 		fs_info->qgroup_flags &= ~BTRFS_QGROUP_STATUS_FLAG_INCONSISTENT;
-	} else if (err < 0) {
+	} else if (err < 0 ||
+		   fs_info->qgroup_flags & BTRFS_QGROUP_RUNTIME_FLAG_CANCEL_RESCAN) {
 		fs_info->qgroup_flags |= BTRFS_QGROUP_STATUS_FLAG_INCONSISTENT;
 	}
 	mutex_unlock(&fs_info->qgroup_rescan_lock);
@@ -3293,7 +3301,8 @@ static void btrfs_qgroup_rescan_worker(struct btrfs_work *work)
 	}
 
 	mutex_lock(&fs_info->qgroup_rescan_lock);
-	if (!stopped)
+	if (!stopped ||
+	    fs_info->qgroup_flags & BTRFS_QGROUP_RUNTIME_FLAG_CANCEL_RESCAN)
 		fs_info->qgroup_flags &= ~BTRFS_QGROUP_STATUS_FLAG_RESCAN;
 	if (trans) {
 		ret = update_qgroup_status_item(trans);
@@ -3304,6 +3313,7 @@ static void btrfs_qgroup_rescan_worker(struct btrfs_work *work)
 		}
 	}
 	fs_info->qgroup_rescan_running = false;
+	fs_info->qgroup_flags &= ~BTRFS_QGROUP_RUNTIME_FLAG_CANCEL_RESCAN;
 	complete_all(&fs_info->qgroup_rescan_completion);
 	mutex_unlock(&fs_info->qgroup_rescan_lock);
 
@@ -3314,6 +3324,9 @@ static void btrfs_qgroup_rescan_worker(struct btrfs_work *work)
 
 	if (stopped) {
 		btrfs_info(fs_info, "qgroup scan paused");
+	} else if (fs_info->qgroup_flags &
+		   BTRFS_QGROUP_RUNTIME_FLAG_CANCEL_RESCAN) {
+		btrfs_info(fs_info, "qgroup scan cancelled");
 	} else if (err >= 0) {
 		btrfs_info(fs_info, "qgroup scan completed%s",
 			err > 0 ? " (inconsistency flag cleared)" : "");
@@ -3373,6 +3386,7 @@ qgroup_rescan_init(struct btrfs_fs_info *fs_info, u64 progress_objectid,
 
 	memset(&fs_info->qgroup_rescan_progress, 0,
 		sizeof(fs_info->qgroup_rescan_progress));
+	fs_info->qgroup_flags &= ~BTRFS_QGROUP_RUNTIME_FLAG_CANCEL_RESCAN;
 	fs_info->qgroup_rescan_progress.objectid = progress_objectid;
 	init_completion(&fs_info->qgroup_rescan_completion);
 	mutex_unlock(&fs_info->qgroup_rescan_lock);
@@ -4169,8 +4183,7 @@ int btrfs_qgroup_add_swapped_blocks(struct btrfs_trans_handle *trans,
 	spin_unlock(&blocks->lock);
 out:
 	if (ret < 0)
-		fs_info->qgroup_flags |=
-			BTRFS_QGROUP_STATUS_FLAG_INCONSISTENT;
+		qgroup_mark_inconsistent(fs_info);
 	return ret;
 }
 
@@ -4257,7 +4270,7 @@ int btrfs_qgroup_trace_subtree_after_cow(struct btrfs_trans_handle *trans,
 		btrfs_err_rl(fs_info,
 			     "failed to account subtree at bytenr %llu: %d",
 			     subvol_eb->start, ret);
-		fs_info->qgroup_flags |= BTRFS_QGROUP_STATUS_FLAG_INCONSISTENT;
+		qgroup_mark_inconsistent(fs_info);
 	}
 	return ret;
 }
diff --git a/fs/btrfs/qgroup.h b/fs/btrfs/qgroup.h
index 880e9df0dac1..f92e635bfb69 100644
--- a/fs/btrfs/qgroup.h
+++ b/fs/btrfs/qgroup.h
@@ -100,6 +100,8 @@
  *     subtree rescan for them.
  */
 
+#define BTRFS_QGROUP_RUNTIME_FLAG_CANCEL_RESCAN		(1UL << 3)
+
 /*
  * Record a dirty extent, and info qgroup to update quota on it
  * TODO: Use kmem cache to alloc it.
-- 
2.33.0


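The flag lifecycle described in the changelog can be summarized with a
small model (illustrative Python; the real rescan_should_stop() also
considers fs closing and remounting, omitted here):

```python
FLAG_INCONSISTENT = 1 << 2
RUNTIME_FLAG_CANCEL_RESCAN = 1 << 3

def mark_inconsistent(flags):
    # Setting INCONSISTENT also asks any running rescan to cancel.
    return flags | FLAG_INCONSISTENT | RUNTIME_FLAG_CANCEL_RESCAN

def rescan_should_stop(flags):
    return bool(flags & RUNTIME_FLAG_CANCEL_RESCAN)

def rescan_init(flags):
    # Only a newly started rescan clears the cancellation request.
    return flags & ~RUNTIME_FLAG_CANCEL_RESCAN
```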

* [PATCH 4/5] btrfs: introduce BTRFS_QGROUP_RUNTIME_FLAG_NO_ACCOUNTING to skip qgroup accounting
  2021-08-31  9:48 [PATCH 0/5] btrfs: qgroup: address the performance penalty for subvolume dropping Qu Wenruo
                   ` (2 preceding siblings ...)
  2021-08-31  9:49 ` [PATCH 3/5] btrfs: introduce BTRFS_QGROUP_RUNTIME_FLAG_CANCEL_RESCAN Qu Wenruo
@ 2021-08-31  9:49 ` Qu Wenruo
  2021-08-31  9:49 ` [PATCH 5/5] btrfs: skip subtree scan if it's too high to avoid low stall in btrfs_commit_transaction() Qu Wenruo
  2021-09-02 16:25 ` [PATCH 0/5] btrfs: qgroup: address the performance penalty for subvolume dropping David Sterba
  5 siblings, 0 replies; 11+ messages in thread
From: Qu Wenruo @ 2021-08-31  9:49 UTC (permalink / raw)
  To: linux-btrfs

The new flag makes btrfs qgroup skip all of its time-consuming qgroup
accounting.

Its lifespan is the same as BTRFS_QGROUP_RUNTIME_FLAG_CANCEL_RESCAN:
it only gets cleared after a new rescan.

Signed-off-by: Qu Wenruo <wqu@suse.com>
---
 fs/btrfs/qgroup.c | 18 ++++++++++++++----
 fs/btrfs/qgroup.h |  1 +
 2 files changed, 15 insertions(+), 4 deletions(-)

diff --git a/fs/btrfs/qgroup.c b/fs/btrfs/qgroup.c
index 91657ba1ebd1..291c404e8718 100644
--- a/fs/btrfs/qgroup.c
+++ b/fs/btrfs/qgroup.c
@@ -326,8 +326,11 @@ static void qgroup_mark_inconsistent(struct btrfs_fs_info *fs_info)
 {
 	BUILD_BUG_ON(BTRFS_QGROUP_RUNTIME_FLAG_CANCEL_RESCAN &
 		     BTRFS_QGROUP_STATUS_FLAGS_MASK);
+	BUILD_BUG_ON(BTRFS_QGROUP_RUNTIME_FLAG_NO_ACCOUNTING &
+		     BTRFS_QGROUP_STATUS_FLAGS_MASK);
 	fs_info->qgroup_flags |= (BTRFS_QGROUP_STATUS_FLAG_INCONSISTENT |
-				  BTRFS_QGROUP_RUNTIME_FLAG_CANCEL_RESCAN);
+				  BTRFS_QGROUP_RUNTIME_FLAG_CANCEL_RESCAN |
+				  BTRFS_QGROUP_RUNTIME_FLAG_NO_ACCOUNTING);
 }
 
 /*
@@ -1742,6 +1745,10 @@ int btrfs_qgroup_trace_extent_post(struct btrfs_trans_handle *trans,
 	 */
 	ASSERT(trans != NULL);
 
+	if (trans->fs_info->qgroup_flags &
+	    BTRFS_QGROUP_RUNTIME_FLAG_NO_ACCOUNTING)
+		return 0;
+
 	ret = btrfs_find_all_roots(NULL, trans->fs_info, bytenr, 0, &old_root,
 				   true);
 	if (ret < 0) {
@@ -2556,7 +2563,8 @@ int btrfs_qgroup_account_extent(struct btrfs_trans_handle *trans, u64 bytenr,
 	 * If quotas get disabled meanwhile, the resources need to be freed and
 	 * we can't just exit here.
 	 */
-	if (!test_bit(BTRFS_FS_QUOTA_ENABLED, &fs_info->flags))
+	if (!test_bit(BTRFS_FS_QUOTA_ENABLED, &fs_info->flags) ||
+	    fs_info->qgroup_flags & BTRFS_QGROUP_RUNTIME_FLAG_NO_ACCOUNTING)
 		goto out_free;
 
 	if (new_roots) {
@@ -2652,7 +2660,8 @@ int btrfs_qgroup_account_extents(struct btrfs_trans_handle *trans)
 		num_dirty_extents++;
 		trace_btrfs_qgroup_account_extents(fs_info, record);
 
-		if (!ret) {
+		if (!ret && !(fs_info->qgroup_flags &
+			      BTRFS_QGROUP_RUNTIME_FLAG_NO_ACCOUNTING)) {
 			/*
 			 * Old roots should be searched when inserting qgroup
 			 * extent record
@@ -3386,7 +3395,8 @@ qgroup_rescan_init(struct btrfs_fs_info *fs_info, u64 progress_objectid,
 
 	memset(&fs_info->qgroup_rescan_progress, 0,
 		sizeof(fs_info->qgroup_rescan_progress));
-	fs_info->qgroup_flags &= ~BTRFS_QGROUP_RUNTIME_FLAG_CANCEL_RESCAN;
+	fs_info->qgroup_flags &= ~(BTRFS_QGROUP_RUNTIME_FLAG_CANCEL_RESCAN |
+				   BTRFS_QGROUP_RUNTIME_FLAG_NO_ACCOUNTING);
 	fs_info->qgroup_rescan_progress.objectid = progress_objectid;
 	init_completion(&fs_info->qgroup_rescan_completion);
 	mutex_unlock(&fs_info->qgroup_rescan_lock);
diff --git a/fs/btrfs/qgroup.h b/fs/btrfs/qgroup.h
index f92e635bfb69..fe42975dcea7 100644
--- a/fs/btrfs/qgroup.h
+++ b/fs/btrfs/qgroup.h
@@ -101,6 +101,7 @@
  */
 
 #define BTRFS_QGROUP_RUNTIME_FLAG_CANCEL_RESCAN		(1UL << 3)
+#define BTRFS_QGROUP_RUNTIME_FLAG_NO_ACCOUNTING		(1UL << 4)
 
 /*
  * Record a dirty extent, and info qgroup to update quota on it
-- 
2.33.0


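The short-circuit added by this patch can be sketched as follows
(illustrative Python, not kernel code; the callback stands in for the
real per-extent accounting work):

```python
RUNTIME_FLAG_NO_ACCOUNTING = 1 << 4

def account_extent(flags, do_accounting):
    """Run the accounting callback unless NO_ACCOUNTING is set.

    Returns True if accounting actually ran."""
    if flags & RUNTIME_FLAG_NO_ACCOUNTING:
        return False  # numbers already inconsistent, skip the costly work
    do_accounting()
    return True
```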

* [PATCH 5/5] btrfs: skip subtree scan if it's too high to avoid low stall in btrfs_commit_transaction()
  2021-08-31  9:48 [PATCH 0/5] btrfs: qgroup: address the performance penalty for subvolume dropping Qu Wenruo
                   ` (3 preceding siblings ...)
  2021-08-31  9:49 ` [PATCH 4/5] btrfs: introduce BTRFS_QGROUP_RUNTIME_FLAG_NO_ACCOUNTING to skip qgroup accounting Qu Wenruo
@ 2021-08-31  9:49 ` Qu Wenruo
  2021-09-02 16:25 ` [PATCH 0/5] btrfs: qgroup: address the performance penalty for subvolume dropping David Sterba
  5 siblings, 0 replies; 11+ messages in thread
From: Qu Wenruo @ 2021-08-31  9:49 UTC (permalink / raw)
  To: linux-btrfs

Btrfs qgroup has a long history of bringing performance penalty in
btrfs_commit_transaction().

Although we have tried our best to mitigate such impact, there is still
an unsolved call site: btrfs_drop_snapshot().

This function finds the highest shared tree block and modifies its
extent ownership to drop a subvolume/snapshot.

Such a change affects the whole subtree, and can cause tons of dirty
qgroup extents and stall btrfs_commit_transaction().

To avoid this problem, introduce a new sysfs interface,
/sys/fs/btrfs/<uuid>/qgroups/drop_subtree_threshold, to determine
whether, and at which level, we should skip qgroup accounting for
subtree dropping.

The default value is BTRFS_MAX_LEVEL, thus every subtree drop will go
through qgroup accounting, to ensure qgroup numbers are kept as
consistent as possible.

Performance-sensitive users can change the value to something like 3,
so that dropping any subtree at or above level 3 will mark qgroup
inconsistent and skip the costly accounting.

The cost is obvious: the qgroup numbers are no longer consistent, but
at least performance is more reasonable, and users have the control.

Signed-off-by: Qu Wenruo <wqu@suse.com>
---
 fs/btrfs/ctree.h   |  1 +
 fs/btrfs/disk-io.c |  1 +
 fs/btrfs/qgroup.c  | 18 ++++++++++++++++++
 fs/btrfs/sysfs.c   | 39 +++++++++++++++++++++++++++++++++++++++
 4 files changed, 59 insertions(+)

diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
index 38870ae46cbb..158291993a97 100644
--- a/fs/btrfs/ctree.h
+++ b/fs/btrfs/ctree.h
@@ -938,6 +938,7 @@ struct btrfs_fs_info {
 	struct completion qgroup_rescan_completion;
 	struct btrfs_work qgroup_rescan_work;
 	bool qgroup_rescan_running;	/* protected by qgroup_rescan_lock */
+	u8 qgroup_drop_subtree_thres;
 
 	/* filesystem state */
 	unsigned long fs_state;
diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index 41ea50f48cfe..0f2861e7b696 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -2281,6 +2281,7 @@ static void btrfs_init_qgroup(struct btrfs_fs_info *fs_info)
 	fs_info->qgroup_seq = 1;
 	fs_info->qgroup_ulist = NULL;
 	fs_info->qgroup_rescan_running = false;
+	fs_info->qgroup_drop_subtree_thres = BTRFS_MAX_LEVEL;
 	mutex_init(&fs_info->qgroup_rescan_lock);
 }
 
diff --git a/fs/btrfs/qgroup.c b/fs/btrfs/qgroup.c
index 291c404e8718..8dd29db48f75 100644
--- a/fs/btrfs/qgroup.c
+++ b/fs/btrfs/qgroup.c
@@ -1211,6 +1211,7 @@ int btrfs_quota_disable(struct btrfs_fs_info *fs_info)
 	quota_root = fs_info->quota_root;
 	fs_info->quota_root = NULL;
 	fs_info->qgroup_flags &= ~BTRFS_QGROUP_STATUS_FLAG_ON;
+	fs_info->qgroup_drop_subtree_thres = BTRFS_MAX_LEVEL;
 	spin_unlock(&fs_info->qgroup_lock);
 
 	btrfs_free_qgroup_config(fs_info);
@@ -2239,6 +2240,7 @@ int btrfs_qgroup_trace_subtree(struct btrfs_trans_handle *trans,
 	struct btrfs_fs_info *fs_info = trans->fs_info;
 	int ret = 0;
 	int level;
+	u8 drop_subptree_thres;
 	struct extent_buffer *eb = root_eb;
 	struct btrfs_path *path = NULL;
 
@@ -2248,6 +2250,22 @@ int btrfs_qgroup_trace_subtree(struct btrfs_trans_handle *trans,
 	if (!test_bit(BTRFS_FS_QUOTA_ENABLED, &fs_info->flags))
 		return 0;
 
+	spin_lock(&fs_info->qgroup_lock);
+	drop_subptree_thres = fs_info->qgroup_drop_subtree_thres;
+	spin_unlock(&fs_info->qgroup_lock);
+	/*
+	 * This function only get called for snapshot drop, if we hit a high
+	 * node here, it means we are going to change ownership for quite a lot
+	 * of extents, which will greatly slow down btrfs_commit_transaction().
+	 *
+	 * So here if we find a high tree here, we just skip the accounting and
+	 * mark qgroup inconsistent.
+	 */
+	if (root_level >= drop_subptree_thres) {
+		qgroup_mark_inconsistent(fs_info);
+		return 0;
+	}
+
 	if (!extent_buffer_uptodate(root_eb)) {
 		ret = btrfs_read_buffer(root_eb, root_gen, root_level, NULL);
 		if (ret)
diff --git a/fs/btrfs/sysfs.c b/fs/btrfs/sysfs.c
index 72edc6011d01..9316e9411171 100644
--- a/fs/btrfs/sysfs.c
+++ b/fs/btrfs/sysfs.c
@@ -1859,6 +1859,44 @@ static ssize_t qgroup_flags_show(struct kobject *qgroups_kobj,
 
 BTRFS_ATTR(qgroups, qgroup_flags, qgroup_flags_show);
 
+static ssize_t qgroup_drop_subtree_thres_show(struct kobject *qgroups_kobj,
+					      struct kobj_attribute *a,
+					      char *buf)
+{
+	struct btrfs_fs_info *fs_info = to_fs_info(qgroups_kobj->parent);
+	u8 result;
+
+	spin_lock(&fs_info->qgroup_lock);
+	result = fs_info->qgroup_drop_subtree_thres;
+	spin_unlock(&fs_info->qgroup_lock);
+
+	return scnprintf(buf, PAGE_SIZE, "%d\n", result);
+}
+
+static ssize_t qgroup_drop_subtree_thres_store(struct kobject *qgroups_kobj,
+					       struct kobj_attribute *a,
+					       const char *buf, size_t len)
+{
+	struct btrfs_fs_info *fs_info = to_fs_info(qgroups_kobj->parent);
+	u8 new_thres;
+	int ret;
+
+	ret = kstrtou8(buf, 10, &new_thres);
+	if (ret)
+		return -EINVAL;
+
+	if (new_thres > BTRFS_MAX_LEVEL)
+		return -EINVAL;
+
+	spin_lock(&fs_info->qgroup_lock);
+	fs_info->qgroup_drop_subtree_thres = new_thres;
+	spin_unlock(&fs_info->qgroup_lock);
+	return len;
+}
+
+BTRFS_ATTR_RW(qgroups, drop_subtree_threshold, qgroup_drop_subtree_thres_show,
+	      qgroup_drop_subtree_thres_store);
+
 /*
  * Qgroups global info
  *
@@ -1866,6 +1904,7 @@ BTRFS_ATTR(qgroups, qgroup_flags, qgroup_flags_show);
  */
 static struct attribute *qgroups_attrs[] = {
 	BTRFS_ATTR_PTR(qgroups, qgroup_flags),
+	BTRFS_ATTR_PTR(qgroups, drop_subtree_threshold),
 	NULL
 };
 ATTRIBUTE_GROUPS(qgroups);
-- 
2.33.0


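The input validation performed by the store handler above can be
modeled roughly like this (illustrative Python; the kernel handler uses
kstrtou8() and returns -EINVAL for bad input):

```python
BTRFS_MAX_LEVEL = 8

def parse_drop_subtree_threshold(buf):
    """Return the new threshold, or None for invalid input."""
    try:
        value = int(buf.strip(), 10)
    except ValueError:
        return None
    if value < 0 or value > BTRFS_MAX_LEVEL:
        return None
    return value
```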

* Re: [PATCH 0/5] btrfs: qgroup: address the performance penalty for subvolume dropping
  2021-08-31  9:48 [PATCH 0/5] btrfs: qgroup: address the performance penalty for subvolume dropping Qu Wenruo
                   ` (4 preceding siblings ...)
  2021-08-31  9:49 ` [PATCH 5/5] btrfs: skip subtree scan if it's too high to avoid low stall in btrfs_commit_transaction() Qu Wenruo
@ 2021-09-02 16:25 ` David Sterba
  2021-09-02 22:28   ` Qu Wenruo
  5 siblings, 1 reply; 11+ messages in thread
From: David Sterba @ 2021-09-02 16:25 UTC (permalink / raw)
  To: Qu Wenruo; +Cc: linux-btrfs

On Tue, Aug 31, 2021 at 05:48:58PM +0800, Qu Wenruo wrote:
> Btrfs qgroup has a long history of bringing a huge performance penalty,
> from subvolume dropping to balance.
> 
> Although we have solved the problem for balance, the subvolume dropping
> problem is still unresolved, as we really need to do the costly backref
> walk for all the involved subtrees, or qgroup numbers will become
> inconsistent.
> 
> But the performance penalty is sometimes so big that it's better to
> just disable qgroup, do the drop, then do a rescan.
> 
> This patchset addresses the problem by introducing a user-configurable
> sysfs interface, allowing drops of sufficiently high subtrees to mark
> qgroup inconsistent and skip the whole accounting.
> 
> The following things are needed for this objective:
> 
> - New qgroups attributes
> 
>   Instead of plain qgroup kobjects, we need extra attributes like
>   drop_subtree_threshold.
> 
>   This patchset will introduce two new attributes to the existing
>   qgroups kobject:
>   * qgroup_flags
>     To indicate the qgroup status flags like ON, RESCAN, INCONSISTENT.
> 
>   * drop_subtree_threshold
>     To show the subtree dropping level threshold.
>     The default value is BTRFS_MAX_LEVEL (8), which means every subtree
>     drop will go through qgroup accounting; while costly, this keeps
>     qgroup numbers as consistent as possible.
> 
>     Users can specify values like 3, meaning dropping any subtree at
>     level 3 or higher will mark qgroup inconsistent and skip all the
>     costly accounting.
> 
>     This only affects subvolume dropping.
> 
> - Skip qgroup accounting when the numbers are already inconsistent
> 
>   But this still keeps the qgroup relationships correct, so users can
>   keep their qgroup organization and do a rescan later.
> 
> 
> This sysfs interface needs user space tools to monitor and set the
> values for each btrfs filesystem.
> 
> Currently the target user space tool is snapper, which by default
> utilizes qgroups for its space-aware snapshots reclaim mechanism.

This is an interesting approach, though I'm there are some usability
questions. First as a user, how do I know I need to use it?  The height
of the subvolume fs tree is not easily accessible.

The sysfs file is not protected in any way, so multiple tools can decide
to set it to different values. And whether a rescan is required or not
depends on the value being set.

It might be better to attach the level (or a bit) to the subvolume
deletion request, e.g. a "fast" mode that would internally use a maximum
height of 3 for the slow (accounted) deletion, deleting anything else
fast and leaving the qgroup numbers inconsistent.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 1/5] btrfs: sysfs: introduce qgroup global attribute groups
  2021-08-31  9:48 ` [PATCH 1/5] btrfs: sysfs: introduce qgroup global attribute groups Qu Wenruo
@ 2021-09-02 16:28   ` David Sterba
  2021-09-02 22:30     ` Qu Wenruo
  0 siblings, 1 reply; 11+ messages in thread
From: David Sterba @ 2021-09-02 16:28 UTC (permalink / raw)
  To: Qu Wenruo; +Cc: linux-btrfs

On Tue, Aug 31, 2021 at 05:48:59PM +0800, Qu Wenruo wrote:
> Although we already have an info kobject for each qgroup, we don't have
> global qgroup attributes to show things like the qgroup flags.
> 
> Add the qgroups attribute group; its first member is qgroup_flags,
> which is a read-only attribute to show human-readable qgroup flags.
> 
> The path is:
>  /sys/fs/btrfs/<uuid>/qgroups/qgroup_flags
> 
> The output would look like:
>  ON | INCONSISTENT

So that's another format of sysfs file; we should try to keep them
consistent or follow common practices. The recommended way for sysfs is
to have one file per value, and here that would follow the known states.

So eg

/sys/fs/.../qgroups/enabled		-> 0 or 1
/sys/fs/.../qgroups/inconsistent	-> 0 or 1
...

The files can also be writable, so that a rescan can be triggered from
sysfs, or quotas eventually disabled. To start with, exporting the state
would be sufficient though.

* Re: [PATCH 0/5] btrfs: qgroup: address the performance penalty for subvolume dropping
  2021-09-02 16:25 ` [PATCH 0/5] btrfs: qgroup: address the performance penalty for subvolume dropping David Sterba
@ 2021-09-02 22:28   ` Qu Wenruo
  0 siblings, 0 replies; 11+ messages in thread
From: Qu Wenruo @ 2021-09-02 22:28 UTC (permalink / raw)
  To: dsterba, Qu Wenruo, linux-btrfs



On 2021/9/3 12:25 AM, David Sterba wrote:
> On Tue, Aug 31, 2021 at 05:48:58PM +0800, Qu Wenruo wrote:
>> Btrfs qgroup has a long history of bringing huge performance penalty,
>> from subvolume dropping to balance.
>>
>> Although we solved the problem for balance, the subvolume dropping
>> problem is still unresolved, as we really need to do the costly backref
>> walks for all the involved subtrees, or the qgroup numbers will be
>> inconsistent.
>>
>> But the performance penalty is sometimes so big that it's better to
>> just disable qgroup, do the drop, then do a rescan.
>>
>> This patchset addresses the problem by introducing a user-configurable
>> sysfs interface, allowing the dropping of sufficiently high subtrees to
>> mark qgroup inconsistent and skip the whole accounting.
>>
>> The following things are needed for this objective:
>>
>> - New qgroups attributes
>>
>>    Instead of plain qgroup kobjects, we need extra attributes like
>>    drop_subtree_threshold.
>>
>>    This patchset will introduce two new attributes to the existing
>>    qgroups kobject:
>>    * qgroup_flags
>>      To indicate the qgroup status flags like ON, RESCAN, INCONSISTENT.
>>
>>    * drop_subtree_threshold
>>      To show the subtree dropping level threshold.
>>      The default value is BTRFS_MAX_LEVEL (8), which means all subtree
>>      dropping will go through the qgroup accounting; while costly, this
>>      keeps the qgroup numbers as consistent as possible.
>>
>>      Users can specify a value like 3, meaning any subtree at level 3
>>      or higher will mark qgroup inconsistent and skip all the costly
>>      accounting.
>>
>>      This only affects subvolume dropping.
>>
>> - Skip qgroup accounting when the numbers are already inconsistent
>>
>>    This still keeps the qgroup relationships correct, so users can keep
>>    their qgroup organization and do a rescan later.
>>
>>
>> This sysfs interface needs user space tools to monitor and set the
>> values for each btrfs filesystem.
>>
>> Currently the target user space tool is snapper, which by default
>> utilizes qgroups for its space-aware snapshot reclaim mechanism.
>
> This is an interesting approach, though there are some usability
> questions. First, as a user, how do I know I need to use it? The height
> of the subvolume fs tree is not easily accessible.

The general idea is: if you're using qgroup and you find btrfs-cleaner
taking all the CPU for a while, then that's the case.

>
> The sysfs file is not protected in any way, so multiple tools can decide
> to set it to different values. And whether a rescan is required or not
> depends on the value being set.

That's true, but shouldn't that be the users' problem?

>
> It might be better to attach the level (or a bit) to the subvolume
> deletion request, e.g. a "fast" mode that would internally use a maximum
> height of 3 for the slow (accounted) deletion, deleting anything else
> fast and leaving the qgroup numbers inconsistent.
>
The problem is, the qgroup part is completely optional, so I'm not sure
it's a good idea to add a new interface just for an optional feature.

Furthermore, when a subvolume is deleted it's only unlinked; the real
deletion can happen after a mount cycle, in which case the runtime
values would be lost.

If we really want consistent behavior, then we need a new on-disk
format, which looks like overkill to me.

Thanks,
Qu

* Re: [PATCH 1/5] btrfs: sysfs: introduce qgroup global attribute groups
  2021-09-02 16:28   ` David Sterba
@ 2021-09-02 22:30     ` Qu Wenruo
  2021-09-03 14:03       ` David Sterba
  0 siblings, 1 reply; 11+ messages in thread
From: Qu Wenruo @ 2021-09-02 22:30 UTC (permalink / raw)
  To: dsterba, Qu Wenruo, linux-btrfs



On 2021/9/3 12:28 AM, David Sterba wrote:
> On Tue, Aug 31, 2021 at 05:48:59PM +0800, Qu Wenruo wrote:
>> Although we already have an info kobject for each qgroup, we don't have
>> global qgroup attributes to show things like the qgroup flags.
>>
>> Add the qgroups attribute group; its first member is qgroup_flags,
>> which is a read-only attribute to show human-readable qgroup flags.
>>
>> The path is:
>>   /sys/fs/btrfs/<uuid>/qgroups/qgroup_flags
>>
>> The output would look like:
>>   ON | INCONSISTENT
>
> So that's another format of sysfs file; we should try to keep them
> consistent or follow common practices. The recommended way for sysfs is
> to have one file per value, and here that would follow the known states.
>
> So eg
>
> /sys/fs/.../qgroups/enabled		-> 0 or 1
> /sys/fs/.../qgroups/inconsistent	-> 0 or 1
> ...
>
> The files can also be writable, so that a rescan can be triggered from
> sysfs, or quotas eventually disabled. To start with, exporting the state
> would be sufficient though.
>
OK, that indeed sounds better than the current solution.

Although there may be a small window where reading the 3 attributes
could give an inconsistent view, as the read is no longer atomic.

Would that be a problem?

Thanks,
Qu

* Re: [PATCH 1/5] btrfs: sysfs: introduce qgroup global attribute groups
  2021-09-02 22:30     ` Qu Wenruo
@ 2021-09-03 14:03       ` David Sterba
  0 siblings, 0 replies; 11+ messages in thread
From: David Sterba @ 2021-09-03 14:03 UTC (permalink / raw)
  To: Qu Wenruo; +Cc: dsterba, Qu Wenruo, linux-btrfs

On Fri, Sep 03, 2021 at 06:30:36AM +0800, Qu Wenruo wrote:
> 
> 
> On 2021/9/3 12:28 AM, David Sterba wrote:
> > On Tue, Aug 31, 2021 at 05:48:59PM +0800, Qu Wenruo wrote:
> >> Although we already have an info kobject for each qgroup, we don't have
> >> global qgroup attributes to show things like the qgroup flags.
> >>
> >> Add the qgroups attribute group; its first member is qgroup_flags,
> >> which is a read-only attribute to show human-readable qgroup flags.
> >>
> >> The path is:
> >>   /sys/fs/btrfs/<uuid>/qgroups/qgroup_flags
> >>
> >> The output would look like:
> >>   ON | INCONSISTENT
> >
> > So that's another format of sysfs file; we should try to keep them
> > consistent or follow common practices. The recommended way for sysfs is
> > to have one file per value, and here that would follow the known states.
> >
> > So eg
> >
> > /sys/fs/.../qgroups/enabled		-> 0 or 1
> > /sys/fs/.../qgroups/inconsistent	-> 0 or 1
> > ...
> >
> > The files can also be writable, so that a rescan can be triggered from
> > sysfs, or quotas eventually disabled. To start with, exporting the state
> > would be sufficient though.
> >
> OK, that sounds indeed better than the current solution.
> 
> Although there may be a small window where reading the 3 attributes
> could give an inconsistent view, as the read is no longer atomic.
> 
> Would that be a problem?

I hope not, the status can change after reading the sysfs files anyway.

end of thread, other threads:[~2021-09-03 14:04 UTC | newest]

Thread overview: 11+ messages
2021-08-31  9:48 [PATCH 0/5] btrfs: qgroup: address the performance penalty for subvolume dropping Qu Wenruo
2021-08-31  9:48 ` [PATCH 1/5] btrfs: sysfs: introduce qgroup global attribute groups Qu Wenruo
2021-09-02 16:28   ` David Sterba
2021-09-02 22:30     ` Qu Wenruo
2021-09-03 14:03       ` David Sterba
2021-08-31  9:49 ` [PATCH 2/5] btrfs: introduce BTRFS_QGROUP_STATUS_FLAGS_MASK for later expansion Qu Wenruo
2021-08-31  9:49 ` [PATCH 3/5] btrfs: introduce BTRFS_QGROUP_RUNTIME_FLAG_CANCEL_RESCAN Qu Wenruo
2021-08-31  9:49 ` [PATCH 4/5] btrfs: introduce BTRFS_QGROUP_RUNTIME_FLAG_NO_ACCOUNTING to skip qgroup accounting Qu Wenruo
2021-08-31  9:49 ` [PATCH 5/5] btrfs: skip subtree scan if it's too high to avoid low stall in btrfs_commit_transaction() Qu Wenruo
2021-09-02 16:25 ` [PATCH 0/5] btrfs: qgroup: address the performance penalty for subvolume dropping David Sterba
2021-09-02 22:28   ` Qu Wenruo
