All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v2 0/3] Qgroup and inode_cache fix, with small cleanups
@ 2016-11-29  7:50 Qu Wenruo
  2016-11-29  7:50 ` [PATCH v2 1/3] btrfs: qgroup: Fix qgroup corruption caused by inode_cache mount option Qu Wenruo
                   ` (2 more replies)
  0 siblings, 3 replies; 4+ messages in thread
From: Qu Wenruo @ 2016-11-29  7:50 UTC (permalink / raw)
  To: linux-btrfs; +Cc: julia.lawall

This patchset fixes a qgroup bug when using with inode_cache mount option.

The bug reminds us that the design of separate
btrfs_qgroup_prepare_account_extents() is a bad practice especially the
safest timing is to call prepare just before btrfs_qgroup_account_extents().

So the patchset will cleanup btrfs_qgroup_prepare_account_extents() to
avoid further similar bugs.

And add a quick exit for non-fs extents for qgroup, mainly to reduce the
noise of qgroup trace events.

v2:
  Fix the missing mutex_unlock() at error path of 1st patch.

Qu Wenruo (3):
  btrfs: qgroup: Fix qgroup corruption caused by inode_cache mount
    option
  btrfs: qgroup: Add quick exit for non-fs extents
  btrfs: qgroup: Cleanup btrfs_qgroup_prepare_account_extents function

 fs/btrfs/qgroup.c      | 82 +++++++++++++++++++++++++++++---------------------
 fs/btrfs/qgroup.h      |  2 --
 fs/btrfs/transaction.c | 21 +++++++------
 3 files changed, 59 insertions(+), 46 deletions(-)

-- 
2.7.4




^ permalink raw reply	[flat|nested] 4+ messages in thread

* [PATCH v2 1/3] btrfs: qgroup: Fix qgroup corruption caused by inode_cache mount option
  2016-11-29  7:50 [PATCH v2 0/3] Qgroup and inode_cache fix, with small cleanups Qu Wenruo
@ 2016-11-29  7:50 ` Qu Wenruo
  2016-11-29  7:50 ` [PATCH v2 2/3] btrfs: qgroup: Add quick exit for non-fs extents Qu Wenruo
  2016-11-29  7:50 ` [PATCH v2 3/3] btrfs: qgroup: Cleanup btrfs_qgroup_prepare_account_extents function Qu Wenruo
  2 siblings, 0 replies; 4+ messages in thread
From: Qu Wenruo @ 2016-11-29  7:50 UTC (permalink / raw)
  To: linux-btrfs; +Cc: julia.lawall

[BUG]
The easist way to reproduce the bug is:
------
 # mkfs.btrfs -f $dev -n 16K
 # mount $dev $mnt -o inode_cache
 # btrfs quota enable $mnt
 # btrfs quota rescan -w $mnt
 # btrfs qgroup show $mnt
qgroupid         rfer         excl
--------         ----         ----
0/5          32.00KiB     32.00KiB
             ^^ Twice the correct value
------

And fstests/btrfs qgroup test group can easily detect them.
Although some of them are false alerts since old test cases are using
fixed golden output.
While new test cases will use "btrfs check" to detect qgroup mismatch.

[CAUSE]
Inode_cache mount option will make commit_fs_roots() to call
btrfs_save_ino_cache() to update fs/subvol trees, and generate new
delayed refs.

However we call btrfs_qgroup_prepare_account_extents() too early, before
commit_fs_roots().
This makes the "old_roots" for newly generated extents are always NULL.
For freeing extent case, this makes both new_roots and old_roots to be
empty, while correct old_roots should not be empty.
This causing qgroup numbers not decreased correctly.

[FIX]
Modify the timing of calling btrfs_qgroup_prepare_account_extents() to
just before btrfs_qgroup_account_extents(), and add needed delayed_refs
handler.
So qgroup can handle inode_map mount options correctly.

Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
---
v2:
  Fix a missing mutex_unlock() at error routine.
---
 fs/btrfs/transaction.c | 25 ++++++++++++++++++-------
 1 file changed, 18 insertions(+), 7 deletions(-)

diff --git a/fs/btrfs/transaction.c b/fs/btrfs/transaction.c
index 9517de0..11e8f62 100644
--- a/fs/btrfs/transaction.c
+++ b/fs/btrfs/transaction.c
@@ -2102,13 +2102,6 @@ int btrfs_commit_transaction(struct btrfs_trans_handle *trans,
 		goto scrub_continue;
 	}
 
-	/* Reocrd old roots for later qgroup accounting */
-	ret = btrfs_qgroup_prepare_account_extents(trans, root->fs_info);
-	if (ret) {
-		mutex_unlock(&root->fs_info->reloc_mutex);
-		goto scrub_continue;
-	}
-
 	/*
 	 * make sure none of the code above managed to slip in a
 	 * delayed item
@@ -2151,6 +2144,24 @@ int btrfs_commit_transaction(struct btrfs_trans_handle *trans,
 	btrfs_free_log_root_tree(trans, root->fs_info);
 
 	/*
+	 * commit_fs_roots() can call btrfs_save_ino_cache(), which generates
+	 * new delayed refs. Must handle them or qgroup can be wrong.
+	 */
+	ret = btrfs_run_delayed_refs(trans, root, (unsigned long)-1);
+	if (ret) {
+		mutex_unlock(&root->fs_info->tree_log_mutex);
+		mutex_unlock(&root->fs_info->reloc_mutex);
+		goto scrub_continue;
+	}
+
+	ret = btrfs_qgroup_prepare_account_extents(trans, root->fs_info);
+	if (ret) {
+		mutex_unlock(&root->fs_info->tree_log_mutex);
+		mutex_unlock(&root->fs_info->reloc_mutex);
+		goto scrub_continue;
+	}
+
+	/*
 	 * Since fs roots are all committed, we can get a quite accurate
 	 * new_roots. So let's do quota accounting.
 	 */
-- 
2.7.4




^ permalink raw reply related	[flat|nested] 4+ messages in thread

* [PATCH v2 2/3] btrfs: qgroup: Add quick exit for non-fs extents
  2016-11-29  7:50 [PATCH v2 0/3] Qgroup and inode_cache fix, with small cleanups Qu Wenruo
  2016-11-29  7:50 ` [PATCH v2 1/3] btrfs: qgroup: Fix qgroup corruption caused by inode_cache mount option Qu Wenruo
@ 2016-11-29  7:50 ` Qu Wenruo
  2016-11-29  7:50 ` [PATCH v2 3/3] btrfs: qgroup: Cleanup btrfs_qgroup_prepare_account_extents function Qu Wenruo
  2 siblings, 0 replies; 4+ messages in thread
From: Qu Wenruo @ 2016-11-29  7:50 UTC (permalink / raw)
  To: linux-btrfs; +Cc: julia.lawall

For btrfs_qgroup_account_extent(), modify make it exit quicker for
non-fs extents.

This will also reduce the noise in trace_btrfs_qgroup_account_extent
event.

Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
---
v2:
  None
---
 fs/btrfs/qgroup.c | 41 +++++++++++++++++++++++++++++++++++++++--
 1 file changed, 39 insertions(+), 2 deletions(-)

diff --git a/fs/btrfs/qgroup.c b/fs/btrfs/qgroup.c
index e97f304..b9a2fd1 100644
--- a/fs/btrfs/qgroup.c
+++ b/fs/btrfs/qgroup.c
@@ -1904,6 +1904,33 @@ static int qgroup_update_counters(struct btrfs_fs_info *fs_info,
 	return 0;
 }
 
+/*
+ * Helper to check if the @roots is a list of fs tree roots
+ * Return 0 for definitely not a fs/subvol tree roots ulist
+ * Return 1 for possible fs/subvol tree roots ulist(including empty)
+ */
+static int maybe_fs_roots(struct ulist *roots)
+{
+	struct ulist_node *unode;
+	struct ulist_iterator uiter;
+
+	/* Empty one, still possible for fs roots */
+	if (!roots || roots->nnodes == 0)
+		return 1;
+
+	ULIST_ITER_INIT(&uiter);
+	unode = ulist_next(roots, &uiter);
+	if (!unode)
+		return 1;
+
+	/*
+	 * If it contains fs tree roots, then it must belongs to fs/subvol
+	 * trees.
+	 * If it contains non-fs tree, it won't be shared to fs/subvol trees.
+	 */
+	return is_fstree(unode->val);
+}
+
 int
 btrfs_qgroup_account_extent(struct btrfs_trans_handle *trans,
 			    struct btrfs_fs_info *fs_info,
@@ -1917,10 +1944,20 @@ btrfs_qgroup_account_extent(struct btrfs_trans_handle *trans,
 	u64 nr_old_roots = 0;
 	int ret = 0;
 
-	if (new_roots)
+	if (new_roots) {
+		if (!maybe_fs_roots(new_roots))
+			goto out_free;
 		nr_new_roots = new_roots->nnodes;
-	if (old_roots)
+	}
+	if (old_roots) {
+		if (!maybe_fs_roots(old_roots))
+			goto out_free;
 		nr_old_roots = old_roots->nnodes;
+	}
+
+	/* Quick exit, either not fs tree roots, or won't affect any qgroup */
+	if (nr_old_roots == 0 && nr_new_roots == 0)
+		goto out_free;
 
 	if (!test_bit(BTRFS_FS_QUOTA_ENABLED, &fs_info->flags))
 		goto out_free;
-- 
2.7.4




^ permalink raw reply related	[flat|nested] 4+ messages in thread

* [PATCH v2 3/3] btrfs: qgroup: Cleanup btrfs_qgroup_prepare_account_extents function
  2016-11-29  7:50 [PATCH v2 0/3] Qgroup and inode_cache fix, with small cleanups Qu Wenruo
  2016-11-29  7:50 ` [PATCH v2 1/3] btrfs: qgroup: Fix qgroup corruption caused by inode_cache mount option Qu Wenruo
  2016-11-29  7:50 ` [PATCH v2 2/3] btrfs: qgroup: Add quick exit for non-fs extents Qu Wenruo
@ 2016-11-29  7:50 ` Qu Wenruo
  2 siblings, 0 replies; 4+ messages in thread
From: Qu Wenruo @ 2016-11-29  7:50 UTC (permalink / raw)
  To: linux-btrfs; +Cc: julia.lawall

Quite a lot of qgroup corruption happens due to wrong timing of calling
btrfs_qgroup_prepare_account_extents().

Since the safest timing is calling it just before
btrfs_qgroup_account_extents(), there is no need to separate these 2
function.

Merging them will make code cleaner and less bug prone.

Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
---
v2:
  None
---
 fs/btrfs/qgroup.c      | 41 +++++++++--------------------------------
 fs/btrfs/qgroup.h      |  2 --
 fs/btrfs/transaction.c | 10 ----------
 3 files changed, 9 insertions(+), 44 deletions(-)

diff --git a/fs/btrfs/qgroup.c b/fs/btrfs/qgroup.c
index b9a2fd1..00e3c16 100644
--- a/fs/btrfs/qgroup.c
+++ b/fs/btrfs/qgroup.c
@@ -1419,37 +1419,6 @@ int btrfs_limit_qgroup(struct btrfs_trans_handle *trans,
 	return ret;
 }
 
-int btrfs_qgroup_prepare_account_extents(struct btrfs_trans_handle *trans,
-					 struct btrfs_fs_info *fs_info)
-{
-	struct btrfs_qgroup_extent_record *record;
-	struct btrfs_delayed_ref_root *delayed_refs;
-	struct rb_node *node;
-	u64 qgroup_to_skip;
-	int ret = 0;
-
-	delayed_refs = &trans->transaction->delayed_refs;
-	qgroup_to_skip = delayed_refs->qgroup_to_skip;
-
-	/*
-	 * No need to do lock, since this function will only be called in
-	 * btrfs_commit_transaction().
-	 */
-	node = rb_first(&delayed_refs->dirty_extent_root);
-	while (node) {
-		record = rb_entry(node, struct btrfs_qgroup_extent_record,
-				  node);
-		ret = btrfs_find_all_roots(NULL, fs_info, record->bytenr, 0,
-					   &record->old_roots);
-		if (ret < 0)
-			break;
-		if (qgroup_to_skip)
-			ulist_del(record->old_roots, qgroup_to_skip, 0);
-		node = rb_next(node);
-	}
-	return ret;
-}
-
 int btrfs_qgroup_trace_extent_nolock(struct btrfs_fs_info *fs_info,
 				struct btrfs_delayed_ref_root *delayed_refs,
 				struct btrfs_qgroup_extent_record *record)
@@ -2038,6 +2007,11 @@ int btrfs_qgroup_account_extents(struct btrfs_trans_handle *trans,
 		trace_btrfs_qgroup_account_extents(fs_info, record);
 
 		if (!ret) {
+			/* Search commit root to find old_roots */
+			ret = btrfs_find_all_roots(NULL, fs_info,
+					record->bytenr, 0, &record->old_roots);
+			if (ret < 0)
+				goto cleanup;
 			/*
 			 * Use (u64)-1 as time_seq to do special search, which
 			 * doesn't lock tree or delayed_refs and search current
@@ -2047,8 +2021,11 @@ int btrfs_qgroup_account_extents(struct btrfs_trans_handle *trans,
 					record->bytenr, (u64)-1, &new_roots);
 			if (ret < 0)
 				goto cleanup;
-			if (qgroup_to_skip)
+			if (qgroup_to_skip) {
 				ulist_del(new_roots, qgroup_to_skip, 0);
+				ulist_del(record->old_roots, qgroup_to_skip,
+					  0);
+			}
 			ret = btrfs_qgroup_account_extent(trans, fs_info,
 					record->bytenr, record->num_bytes,
 					record->old_roots, new_roots);
diff --git a/fs/btrfs/qgroup.h b/fs/btrfs/qgroup.h
index 99c879d..2c7f701 100644
--- a/fs/btrfs/qgroup.h
+++ b/fs/btrfs/qgroup.h
@@ -90,8 +90,6 @@ int btrfs_limit_qgroup(struct btrfs_trans_handle *trans,
 int btrfs_read_qgroup_config(struct btrfs_fs_info *fs_info);
 void btrfs_free_qgroup_config(struct btrfs_fs_info *fs_info);
 struct btrfs_delayed_extent_op;
-int btrfs_qgroup_prepare_account_extents(struct btrfs_trans_handle *trans,
-					 struct btrfs_fs_info *fs_info);
 /*
  * Inform qgroup to trace one dirty extent, its info is recorded in @record.
  * So qgroup can account it at commit trans time.
diff --git a/fs/btrfs/transaction.c b/fs/btrfs/transaction.c
index 11e8f62..37adbf8 100644
--- a/fs/btrfs/transaction.c
+++ b/fs/btrfs/transaction.c
@@ -1346,9 +1346,6 @@ static int qgroup_account_snapshot(struct btrfs_trans_handle *trans,
 	ret = commit_fs_roots(trans, src);
 	if (ret)
 		goto out;
-	ret = btrfs_qgroup_prepare_account_extents(trans, fs_info);
-	if (ret < 0)
-		goto out;
 	ret = btrfs_qgroup_account_extents(trans, fs_info);
 	if (ret < 0)
 		goto out;
@@ -2154,13 +2151,6 @@ int btrfs_commit_transaction(struct btrfs_trans_handle *trans,
 		goto scrub_continue;
 	}
 
-	ret = btrfs_qgroup_prepare_account_extents(trans, root->fs_info);
-	if (ret) {
-		mutex_unlock(&root->fs_info->tree_log_mutex);
-		mutex_unlock(&root->fs_info->reloc_mutex);
-		goto scrub_continue;
-	}
-
 	/*
 	 * Since fs roots are all committed, we can get a quite accurate
 	 * new_roots. So let's do quota accounting.
-- 
2.7.4




^ permalink raw reply related	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2016-11-29  7:51 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-11-29  7:50 [PATCH v2 0/3] Qgroup and inode_cache fix, with small cleanups Qu Wenruo
2016-11-29  7:50 ` [PATCH v2 1/3] btrfs: qgroup: Fix qgroup corruption caused by inode_cache mount option Qu Wenruo
2016-11-29  7:50 ` [PATCH v2 2/3] btrfs: qgroup: Add quick exit for non-fs extents Qu Wenruo
2016-11-29  7:50 ` [PATCH v2 3/3] btrfs: qgroup: Cleanup btrfs_qgroup_prepare_account_extents function Qu Wenruo

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.