All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/4] Btrfs: qgroup: part-2: bug fixes for qgroup reservation.
@ 2015-02-10 10:23 Dongsheng Yang
  2015-02-10 10:23 ` [PATCH 1/4] Btrfs: qgroup: free reserved in exceeding quota Dongsheng Yang
                   ` (4 more replies)
  0 siblings, 5 replies; 6+ messages in thread
From: Dongsheng Yang @ 2015-02-10 10:23 UTC (permalink / raw)
  To: linux-btrfs; +Cc: Dongsheng Yang

Hi all,
	This is part2 for qgroup. It is based on 
	[PATCH 0/9] Btrfs: qgroup: part-1: bug fixes.

[1/4] is reviewed-by Josef.

[2/4] - [4/4] are trying to solve a problem in qgroup reservation.
It would be weired I introduce a may_use in 2/4 and delete it in 4/4.

Yes, I am glad to rebase and squash them, here I am keeping this history
of the problem for reviewing. I have explained the problem and sent the
[2/4] out. Josef give me a looks good at that time. But in the later, I
found another problem and solved it with [4/4]. So they are trying to
solve different problems at different timing. When the all patches are
reviewed by you guys, I am okey to squash them if you request.

Thanx.

Dongsheng Yang (4):
  Btrfs: qgroup: free reserved in exceeding quota.
  Btrfs: qgroup: Introduce a may_use to account
    space_info->bytes_may_use.
  Btrfs: qgroup, Account data space in more proper timings.
  btrfs: qgroup: do a reservation in a higher level.

 fs/btrfs/ctree.h       |  2 +-
 fs/btrfs/extent-tree.c | 25 +++++++++++--------------
 fs/btrfs/file.c        | 13 ++-----------
 fs/btrfs/inode.c       |  3 ++-
 fs/btrfs/qgroup.c      | 10 ++++------
 fs/btrfs/relocation.c  |  2 +-
 6 files changed, 21 insertions(+), 34 deletions(-)

-- 
1.8.4.2


^ permalink raw reply	[flat|nested] 6+ messages in thread

* [PATCH 1/4] Btrfs: qgroup: free reserved in exceeding quota.
  2015-02-10 10:23 [PATCH 0/4] Btrfs: qgroup: part-2: bug fixes for qgroup reservation Dongsheng Yang
@ 2015-02-10 10:23 ` Dongsheng Yang
  2015-02-10 10:23 ` [PATCH 2/4] Btrfs: qgroup: Introduce a may_use to account space_info->bytes_may_use Dongsheng Yang
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 6+ messages in thread
From: Dongsheng Yang @ 2015-02-10 10:23 UTC (permalink / raw)
  To: linux-btrfs; +Cc: Dongsheng Yang

When we exceed quota limit in writing, we will free
some reserved extent when we need to drop but not free
account in qgroup. It means, each time we exceed quota
in writing, there will be some remain space in qg->reserved
we can not use any more. If things go on like this, the
all space will be ate up.

Signed-off-by: Dongsheng Yang <yangds.fnst@cn.fujitsu.com>
Reviewed-by: Josef Bacik <jbacik@fb.com>
---
 fs/btrfs/extent-tree.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
index a80b971..88b4e32 100644
--- a/fs/btrfs/extent-tree.c
+++ b/fs/btrfs/extent-tree.c
@@ -5275,8 +5275,11 @@ out_fail:
 			to_free = 0;
 	}
 	spin_unlock(&BTRFS_I(inode)->lock);
-	if (dropped)
+	if (dropped) {
+		if (root->fs_info->quota_enabled)
+			btrfs_qgroup_free(root, dropped * root->nodesize);
 		to_free += btrfs_calc_trans_metadata_size(root, dropped);
+	}
 
 	if (to_free) {
 		btrfs_block_rsv_release(root, block_rsv, to_free);
-- 
1.8.4.2


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* [PATCH 2/4] Btrfs: qgroup: Introduce a may_use to account space_info->bytes_may_use.
  2015-02-10 10:23 [PATCH 0/4] Btrfs: qgroup: part-2: bug fixes for qgroup reservation Dongsheng Yang
  2015-02-10 10:23 ` [PATCH 1/4] Btrfs: qgroup: free reserved in exceeding quota Dongsheng Yang
@ 2015-02-10 10:23 ` Dongsheng Yang
  2015-02-10 10:23 ` [PATCH 3/4] Btrfs: qgroup, Account data space in more proper timings Dongsheng Yang
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 6+ messages in thread
From: Dongsheng Yang @ 2015-02-10 10:23 UTC (permalink / raw)
  To: linux-btrfs; +Cc: Dongsheng Yang

Currently, for pre_alloc or delay_alloc, the bytes will be accounted
in space_info by the three guys.
space_info->bytes_may_use --- space_info->reserved --- space_info->used.
But on the other hand, in qgroup, there are only two counters to account the
bytes, qgroup->reserved and qgroup->excl. And qg->reserved accounts
bytes in space_info->bytes_may_use and qg->excl accounts bytes in
space_info->used. So the bytes in space_info->reserved is not accounted
in qgroup. If so, there is a window we can exceed the quota limit when
bytes is in space_info->reserved.

Example:
	# btrfs quota enable /mnt
	# btrfs qgroup limit -e 10M /mnt
	# for((i=0;i<20;i++));do fallocate -l 1M /mnt/data$i; done
	# sync
	# btrfs qgroup show -pcre /mnt
qgroupid rfer     excl     max_rfer max_excl parent  child
-------- ----     ----     -------- -------- ------  -----
0/5      20987904 20987904 0        10485760 ---     ---

qg->excl is 20987904 larger than max_excl 10485760.

This patch introduce a new counter named may_use to qgroup, then
there are three counters in qgroup to account bytes in space_info
as below.
space_info->bytes_may_use --- space_info->reserved --- space_info->used.
qgroup->may_use           --- qgroup->reserved     --- qgroup->excl

With this patch applied:
	# btrfs quota enable /mnt
	# btrfs qgroup limit -e 10M /mnt
	# for((i=0;i<20;i++));do fallocate -l 1M /mnt/data$i; done
fallocate: /mnt/data9: fallocate failed: Disk quota exceeded
fallocate: /mnt/data10: fallocate failed: Disk quota exceeded
fallocate: /mnt/data11: fallocate failed: Disk quota exceeded
fallocate: /mnt/data12: fallocate failed: Disk quota exceeded
fallocate: /mnt/data13: fallocate failed: Disk quota exceeded
fallocate: /mnt/data14: fallocate failed: Disk quota exceeded
fallocate: /mnt/data15: fallocate failed: Disk quota exceeded
fallocate: /mnt/data16: fallocate failed: Disk quota exceeded
fallocate: /mnt/data17: fallocate failed: Disk quota exceeded
fallocate: /mnt/data18: fallocate failed: Disk quota exceeded
fallocate: /mnt/data19: fallocate failed: Disk quota exceeded
	# sync
	# btrfs qgroup show -pcre /mnt
qgroupid rfer    excl    max_rfer max_excl parent  child
-------- ----    ----    -------- -------- ------  -----
0/5      9453568 9453568 0        10485760 ---     ---

Reported-by: Cyril SCETBON <cyril.scetbon@free.fr>
Signed-off-by: Dongsheng Yang <yangds.fnst@cn.fujitsu.com>
---
 fs/btrfs/extent-tree.c | 20 ++++++++++++++-
 fs/btrfs/inode.c       | 18 ++++++++++++-
 fs/btrfs/qgroup.c      | 68 +++++++++++++++++++++++++++++++++++++++++++++++---
 fs/btrfs/qgroup.h      |  4 +++
 4 files changed, 104 insertions(+), 6 deletions(-)

diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
index 88b4e32..d1a7ce0 100644
--- a/fs/btrfs/extent-tree.c
+++ b/fs/btrfs/extent-tree.c
@@ -5512,8 +5512,12 @@ static int pin_down_extent(struct btrfs_root *root,
 
 	set_extent_dirty(root->fs_info->pinned_extents, bytenr,
 			 bytenr + num_bytes - 1, GFP_NOFS | __GFP_NOFAIL);
-	if (reserved)
+	if (reserved) {
+		btrfs_qgroup_update_reserved_bytes(root->fs_info,
+						   root->root_key.objectid,
+						   num_bytes, -1);
 		trace_btrfs_reserved_extent_free(root, bytenr, num_bytes);
+	}
 	return 0;
 }
 
@@ -6244,6 +6248,9 @@ void btrfs_free_tree_block(struct btrfs_trans_handle *trans,
 		btrfs_update_reserved_bytes(cache, buf->len, RESERVE_FREE, 0);
 		trace_btrfs_reserved_extent_free(root, buf->start, buf->len);
 		pin = 0;
+		btrfs_qgroup_update_reserved_bytes(root->fs_info,
+						   root->root_key.objectid,
+						   buf->len, -1);
 	}
 out:
 	if (pin)
@@ -6978,7 +6985,11 @@ static int __btrfs_free_reserved_extent(struct btrfs_root *root,
 	else {
 		btrfs_add_free_space(cache, start, len);
 		btrfs_update_reserved_bytes(cache, len, RESERVE_FREE, delalloc);
+		btrfs_qgroup_update_reserved_bytes(root->fs_info,
+						   root->root_key.objectid,
+						   len, -1);
 	}
+
 	btrfs_put_block_group(cache);
 
 	trace_btrfs_reserved_extent_free(root, start, len);
@@ -7214,6 +7225,9 @@ int btrfs_alloc_logged_file_extent(struct btrfs_trans_handle *trans,
 	BUG_ON(ret); /* logic error */
 	ret = alloc_reserved_file_extent(trans, root, 0, root_objectid,
 					 0, owner, offset, ins, 1);
+	btrfs_qgroup_update_reserved_bytes(root->fs_info,
+					   root->root_key.objectid,
+					   ins->offset, 1);
 	btrfs_put_block_group(block_group);
 	return ret;
 }
@@ -7360,6 +7374,10 @@ struct extent_buffer *btrfs_alloc_tree_block(struct btrfs_trans_handle *trans,
 		return ERR_PTR(ret);
 	}
 
+	btrfs_qgroup_update_reserved_bytes(root->fs_info,
+					   root_objectid,
+					   ins.offset, 1);
+
 	buf = btrfs_init_new_buffer(trans, root, ins.objectid,
 				    blocksize, level);
 	BUG_ON(IS_ERR(buf)); /* -ENOMEM */
diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index e687bb0..e350cd6 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -59,6 +59,7 @@
 #include "backref.h"
 #include "hash.h"
 #include "props.h"
+#include "qgroup.h"
 
 struct btrfs_iget_args {
 	struct btrfs_key *location;
@@ -745,7 +746,9 @@ retry:
 			}
 			goto out_free;
 		}
-
+		btrfs_qgroup_update_reserved_bytes(root->fs_info,
+						   root->root_key.objectid,
+						   ins.offset, 1);
 		/*
 		 * here we're doing allocation and writeback of the
 		 * compressed pages
@@ -970,6 +973,10 @@ static noinline int cow_file_range(struct inode *inode,
 		if (ret < 0)
 			goto out_unlock;
 
+		btrfs_qgroup_update_reserved_bytes(root->fs_info,
+						   root->root_key.objectid,
+						   ins.offset, 1);
+
 		em = alloc_extent_map();
 		if (!em) {
 			ret = -ENOMEM;
@@ -6797,6 +6804,10 @@ static struct extent_map *btrfs_new_extent_direct(struct inode *inode,
 		return ERR_PTR(ret);
 	}
 
+	btrfs_qgroup_update_reserved_bytes(root->fs_info,
+					   root->root_key.objectid,
+					   ins.offset, 1);
+
 	return em;
 }
 
@@ -9321,6 +9332,11 @@ static int __btrfs_prealloc_file_range(struct inode *inode, int mode,
 				btrfs_end_transaction(trans, root);
 			break;
 		}
+
+		btrfs_qgroup_update_reserved_bytes(root->fs_info,
+						   root->root_key.objectid,
+						   ins.offset, 1);
+
 		btrfs_drop_extent_cache(inode, cur_offset,
 					cur_offset + ins.offset -1, 0);
 
diff --git a/fs/btrfs/qgroup.c b/fs/btrfs/qgroup.c
index b57fe45..0a86bb1 100644
--- a/fs/btrfs/qgroup.c
+++ b/fs/btrfs/qgroup.c
@@ -72,6 +72,7 @@ struct btrfs_qgroup {
 	/*
 	 * reservation tracking
 	 */
+	u64 may_use;
 	u64 reserved;
 
 	/*
@@ -1431,6 +1432,8 @@ static int qgroup_excl_accounting(struct btrfs_fs_info *fs_info,
 	WARN_ON(sign < 0 && qgroup->excl < oper->num_bytes);
 	qgroup->excl += sign * oper->num_bytes;
 	qgroup->excl_cmpr += sign * oper->num_bytes;
+	if (sign > 0)
+		qgroup->reserved -= oper->num_bytes;
 
 	qgroup_dirty(fs_info, qgroup);
 
@@ -1450,6 +1453,8 @@ static int qgroup_excl_accounting(struct btrfs_fs_info *fs_info,
 		qgroup->rfer_cmpr += sign * oper->num_bytes;
 		WARN_ON(sign < 0 && qgroup->excl < oper->num_bytes);
 		qgroup->excl += sign * oper->num_bytes;
+		if (sign > 0)
+			qgroup->reserved -= oper->num_bytes;
 		qgroup->excl_cmpr += sign * oper->num_bytes;
 		qgroup_dirty(fs_info, qgroup);
 
@@ -2392,6 +2397,61 @@ out:
 	return ret;
 }
 
+int btrfs_qgroup_update_reserved_bytes(struct btrfs_fs_info *fs_info,
+					    u64 ref_root,
+					    u64 num_bytes,
+					    int sign)
+{
+	struct btrfs_root *quota_root;
+	struct btrfs_qgroup *qgroup;
+	int ret = 0;
+	struct ulist_node *unode;
+	struct ulist_iterator uiter;
+
+	if (!is_fstree(ref_root) || !fs_info->quota_enabled)
+		return 0;
+
+	if (num_bytes == 0)
+		return 0;
+
+	spin_lock(&fs_info->qgroup_lock);
+	quota_root = fs_info->quota_root;
+	if (!quota_root)
+		goto out;
+
+	qgroup = find_qgroup_rb(fs_info, ref_root);
+	if (!qgroup)
+		goto out;
+
+	ulist_reinit(fs_info->qgroup_ulist);
+	ret = ulist_add(fs_info->qgroup_ulist, qgroup->qgroupid,
+			(uintptr_t)qgroup, GFP_ATOMIC);
+	if (ret < 0)
+		goto out;
+
+	ULIST_ITER_INIT(&uiter);
+	while ((unode = ulist_next(fs_info->qgroup_ulist, &uiter))) {
+		struct btrfs_qgroup *qg;
+		struct btrfs_qgroup_list *glist;
+
+		qg = u64_to_ptr(unode->aux);
+
+		qg->reserved += sign * num_bytes;
+
+		list_for_each_entry(glist, &qg->groups, next_group) {
+			ret = ulist_add(fs_info->qgroup_ulist,
+					glist->group->qgroupid,
+					(uintptr_t)glist->group, GFP_ATOMIC);
+			if (ret < 0)
+				goto out;
+		}
+	}
+
+out:
+	spin_unlock(&fs_info->qgroup_lock);
+	return ret;
+}
+
 /*
  * reserve some space for a qgroup and all its parents. The reservation takes
  * place with start_transaction or dealloc_reserve, similar to ENOSPC
@@ -2440,14 +2500,14 @@ int btrfs_qgroup_reserve(struct btrfs_root *root, u64 num_bytes)
 		qg = u64_to_ptr(unode->aux);
 
 		if ((qg->lim_flags & BTRFS_QGROUP_LIMIT_MAX_RFER) &&
-		    qg->reserved + (s64)qg->rfer + num_bytes >
+		    qg->reserved + qg->may_use + (s64)qg->rfer + num_bytes >
 		    qg->max_rfer) {
 			ret = -EDQUOT;
 			goto out;
 		}
 
 		if ((qg->lim_flags & BTRFS_QGROUP_LIMIT_MAX_EXCL) &&
-		    qg->reserved + (s64)qg->excl + num_bytes >
+		    qg->reserved + qg->may_use + (s64)qg->excl + num_bytes >
 		    qg->max_excl) {
 			ret = -EDQUOT;
 			goto out;
@@ -2471,7 +2531,7 @@ int btrfs_qgroup_reserve(struct btrfs_root *root, u64 num_bytes)
 
 		qg = u64_to_ptr(unode->aux);
 
-		qg->reserved += num_bytes;
+		qg->may_use += num_bytes;
 	}
 
 out:
@@ -2517,7 +2577,7 @@ void btrfs_qgroup_free(struct btrfs_root *root, u64 num_bytes)
 
 		qg = u64_to_ptr(unode->aux);
 
-		qg->reserved -= num_bytes;
+		qg->may_use -= num_bytes;
 
 		list_for_each_entry(glist, &qg->groups, next_group) {
 			ret = ulist_add(fs_info->qgroup_ulist,
diff --git a/fs/btrfs/qgroup.h b/fs/btrfs/qgroup.h
index 52d8b19..99f0487 100644
--- a/fs/btrfs/qgroup.h
+++ b/fs/btrfs/qgroup.h
@@ -97,6 +97,10 @@ int btrfs_run_qgroups(struct btrfs_trans_handle *trans,
 int btrfs_qgroup_inherit(struct btrfs_trans_handle *trans,
 			 struct btrfs_fs_info *fs_info, u64 srcid, u64 objectid,
 			 struct btrfs_qgroup_inherit *inherit);
+int btrfs_qgroup_update_reserved_bytes(struct btrfs_fs_info *fs_info,
+				       u64 ref_root,
+				       u64 num_bytes,
+				       int sign);
 int btrfs_qgroup_reserve(struct btrfs_root *root, u64 num_bytes);
 void btrfs_qgroup_free(struct btrfs_root *root, u64 num_bytes);
 
-- 
1.8.4.2


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* [PATCH 3/4] Btrfs: qgroup, Account data space in more proper timings.
  2015-02-10 10:23 [PATCH 0/4] Btrfs: qgroup: part-2: bug fixes for qgroup reservation Dongsheng Yang
  2015-02-10 10:23 ` [PATCH 1/4] Btrfs: qgroup: free reserved in exceeding quota Dongsheng Yang
  2015-02-10 10:23 ` [PATCH 2/4] Btrfs: qgroup: Introduce a may_use to account space_info->bytes_may_use Dongsheng Yang
@ 2015-02-10 10:23 ` Dongsheng Yang
  2015-02-10 10:23 ` [PATCH 4/4] btrfs: qgroup: do a reservation in a higher level Dongsheng Yang
  2015-03-04  1:47 ` [PATCH 0/4] Btrfs: qgroup: part-2: bug fixes for qgroup reservation Dongsheng Yang
  4 siblings, 0 replies; 6+ messages in thread
From: Dongsheng Yang @ 2015-02-10 10:23 UTC (permalink / raw)
  To: linux-btrfs; +Cc: Dongsheng Yang

Currenly, in data writing, ->reserved is accounted in
fill_delalloc(), but ->may_use is released in clear_bit_hook()
which is called by btrfs_finish_ordered_io(). That's too late,
that said, between fill_delalloc() and btrfs_finish_ordered_io(),
the data is doublely accounted by qgroup. It will cause some
unexpected -EDQUOT.

Example:
	# btrfs quota enable /root/btrfs-auto-test/
	# btrfs subvolume create /root/btrfs-auto-test//sub
	Create subvolume '/root/btrfs-auto-test/sub'
	# btrfs qgroup limit 1G /root/btrfs-auto-test//sub
	dd if=/dev/zero of=/root/btrfs-auto-test//sub/file bs=1024 count=1500000
	dd: error writing '/root/btrfs-auto-test//sub/file': Disk quota exceeded
	681353+0 records in
	681352+0 records out
	697704448 bytes (698 MB) copied, 8.15563 s, 85.5 MB/s
It's (698 MB) when we got an -EDQUOT, but we limit it by 1G.

This patch move the btrfs_qgroup_reserve/free() for data from
btrfs_delalloc_reserve/release_metadata() to btrfs_check_data_free_space()
and btrfs_free_reserved_data_space(). Then the accounter in qgroup
will be updated at the same time with the accounter in space_info updated.
In this way, the unexpected -EDQUOT will be killed.

Reported-by: Satoru Takeuchi <takeuchi_satoru@jp.fujitsu.com>
Signed-off-by: Dongsheng Yang <yangds.fnst@cn.fujitsu.com>
---
 fs/btrfs/extent-tree.c | 16 +++++++++-------
 fs/btrfs/file.c        |  9 ---------
 2 files changed, 9 insertions(+), 16 deletions(-)

diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
index d1a7ce0..67c2e28 100644
--- a/fs/btrfs/extent-tree.c
+++ b/fs/btrfs/extent-tree.c
@@ -3774,12 +3774,16 @@ commit_trans:
 					      data_sinfo->flags, bytes, 1);
 		return -ENOSPC;
 	}
+	ret = btrfs_qgroup_reserve(root, bytes);
+	if (ret)
+		goto out;
 	data_sinfo->bytes_may_use += bytes;
 	trace_btrfs_space_reservation(root->fs_info, "space_info",
 				      data_sinfo->flags, bytes, 1);
+out:
 	spin_unlock(&data_sinfo->lock);
 
-	return 0;
+	return ret;
 }
 
 /*
@@ -3796,6 +3800,7 @@ void btrfs_free_reserved_data_space(struct inode *inode, u64 bytes)
 	data_sinfo = root->fs_info->data_sinfo;
 	spin_lock(&data_sinfo->lock);
 	WARN_ON(data_sinfo->bytes_may_use < bytes);
+	btrfs_qgroup_free(root, bytes);
 	data_sinfo->bytes_may_use -= bytes;
 	trace_btrfs_space_reservation(root->fs_info, "space_info",
 				      data_sinfo->flags, bytes, 0);
@@ -5191,8 +5196,7 @@ int btrfs_delalloc_reserve_metadata(struct inode *inode, u64 num_bytes)
 	spin_unlock(&BTRFS_I(inode)->lock);
 
 	if (root->fs_info->quota_enabled) {
-		ret = btrfs_qgroup_reserve(root, num_bytes +
-					   nr_extents * root->nodesize);
+		ret = btrfs_qgroup_reserve(root, nr_extents * root->nodesize);
 		if (ret)
 			goto out_fail;
 	}
@@ -5200,8 +5204,7 @@ int btrfs_delalloc_reserve_metadata(struct inode *inode, u64 num_bytes)
 	ret = reserve_metadata_bytes(root, block_rsv, to_reserve, flush);
 	if (unlikely(ret)) {
 		if (root->fs_info->quota_enabled)
-			btrfs_qgroup_free(root, num_bytes +
-						nr_extents * root->nodesize);
+			btrfs_qgroup_free(root, nr_extents * root->nodesize);
 		goto out_fail;
 	}
 
@@ -5319,8 +5322,7 @@ void btrfs_delalloc_release_metadata(struct inode *inode, u64 num_bytes)
 	trace_btrfs_space_reservation(root->fs_info, "delalloc",
 				      btrfs_ino(inode), to_free, 0);
 	if (root->fs_info->quota_enabled) {
-		btrfs_qgroup_free(root, num_bytes +
-					dropped * root->nodesize);
+		btrfs_qgroup_free(root, dropped * root->nodesize);
 	}
 
 	btrfs_block_rsv_release(root, &root->fs_info->delalloc_block_rsv,
diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c
index e409025..0ab1333 100644
--- a/fs/btrfs/file.c
+++ b/fs/btrfs/file.c
@@ -2527,7 +2527,6 @@ static long btrfs_fallocate(struct file *file, int mode,
 {
 	struct inode *inode = file_inode(file);
 	struct extent_state *cached_state = NULL;
-	struct btrfs_root *root = BTRFS_I(inode)->root;
 	u64 cur_offset;
 	u64 last_byte;
 	u64 alloc_start;
@@ -2555,11 +2554,6 @@ static long btrfs_fallocate(struct file *file, int mode,
 	ret = btrfs_check_data_free_space(inode, alloc_end - alloc_start);
 	if (ret)
 		return ret;
-	if (root->fs_info->quota_enabled) {
-		ret = btrfs_qgroup_reserve(root, alloc_end - alloc_start);
-		if (ret)
-			goto out_reserve_fail;
-	}
 
 	mutex_lock(&inode->i_mutex);
 	ret = inode_newsize_ok(inode, alloc_end);
@@ -2677,9 +2671,6 @@ static long btrfs_fallocate(struct file *file, int mode,
 			     &cached_state, GFP_NOFS);
 out:
 	mutex_unlock(&inode->i_mutex);
-	if (root->fs_info->quota_enabled)
-		btrfs_qgroup_free(root, alloc_end - alloc_start);
-out_reserve_fail:
 	/* Let go of our reservation. */
 	btrfs_free_reserved_data_space(inode, alloc_end - alloc_start);
 	return ret;
-- 
1.8.4.2


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* [PATCH 4/4] btrfs: qgroup: do a reservation in a higher level.
  2015-02-10 10:23 [PATCH 0/4] Btrfs: qgroup: part-2: bug fixes for qgroup reservation Dongsheng Yang
                   ` (2 preceding siblings ...)
  2015-02-10 10:23 ` [PATCH 3/4] Btrfs: qgroup, Account data space in more proper timings Dongsheng Yang
@ 2015-02-10 10:23 ` Dongsheng Yang
  2015-03-04  1:47 ` [PATCH 0/4] Btrfs: qgroup: part-2: bug fixes for qgroup reservation Dongsheng Yang
  4 siblings, 0 replies; 6+ messages in thread
From: Dongsheng Yang @ 2015-02-10 10:23 UTC (permalink / raw)
  To: linux-btrfs; +Cc: Dongsheng Yang

There are two problems in qgroup:

a). The PAGE_CACHE is 4K, even when we are writing a data of 1K,
qgroup will reserve a 4K size. It will cause the last 3K in a qgroup
is not available to user.

b). When user is writing a inline data, qgroup will not reserve it,
it means this is a window we can exceed the limit of a qgroup.

The main idea of this patch is reserving the data size of write_bytes
rather than the reserve_bytes. It means qgroup will not care about
the data size btrfs will reserve for user, but only care about the
data size user is going to write. Then reserve it when user want to
write and release it in transaction committed.

In this way, qgroup can be released from the complex procedure in
btrfs and only do the reserve when user want to write and account
when the data is written in commit_transaction().

Signed-off-by: Dongsheng Yang <yangds.fnst@cn.fujitsu.com>
---
 fs/btrfs/ctree.h       |  2 +-
 fs/btrfs/extent-tree.c | 38 +++++----------------------
 fs/btrfs/file.c        |  4 +--
 fs/btrfs/inode.c       | 15 -----------
 fs/btrfs/qgroup.c      | 70 +++-----------------------------------------------
 fs/btrfs/qgroup.h      |  4 ---
 fs/btrfs/relocation.c  |  2 +-
 7 files changed, 14 insertions(+), 121 deletions(-)

diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
index 7e60741..1d17e7a 100644
--- a/fs/btrfs/ctree.h
+++ b/fs/btrfs/ctree.h
@@ -3433,7 +3433,7 @@ enum btrfs_reserve_flush_enum {
 	BTRFS_RESERVE_FLUSH_ALL,
 };
 
-int btrfs_check_data_free_space(struct inode *inode, u64 bytes);
+int btrfs_check_data_free_space(struct inode *inode, u64 bytes, u64 write_bytes);
 void btrfs_free_reserved_data_space(struct inode *inode, u64 bytes);
 void btrfs_trans_release_metadata(struct btrfs_trans_handle *trans,
 				struct btrfs_root *root);
diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
index 67c2e28..652be74 100644
--- a/fs/btrfs/extent-tree.c
+++ b/fs/btrfs/extent-tree.c
@@ -3286,7 +3286,7 @@ again:
 	num_pages *= 16;
 	num_pages *= PAGE_CACHE_SIZE;
 
-	ret = btrfs_check_data_free_space(inode, num_pages);
+	ret = btrfs_check_data_free_space(inode, num_pages, num_pages);
 	if (ret)
 		goto out_put;
 
@@ -3673,7 +3673,7 @@ u64 btrfs_get_alloc_profile(struct btrfs_root *root, int data)
  * This will check the space that the inode allocates from to make sure we have
  * enough space for bytes.
  */
-int btrfs_check_data_free_space(struct inode *inode, u64 bytes)
+int btrfs_check_data_free_space(struct inode *inode, u64 bytes, u64 write_bytes)
 {
 	struct btrfs_space_info *data_sinfo;
 	struct btrfs_root *root = BTRFS_I(inode)->root;
@@ -3774,7 +3774,7 @@ commit_trans:
 					      data_sinfo->flags, bytes, 1);
 		return -ENOSPC;
 	}
-	ret = btrfs_qgroup_reserve(root, bytes);
+	ret = btrfs_qgroup_reserve(root, write_bytes);
 	if (ret)
 		goto out;
 	data_sinfo->bytes_may_use += bytes;
@@ -3800,7 +3800,6 @@ void btrfs_free_reserved_data_space(struct inode *inode, u64 bytes)
 	data_sinfo = root->fs_info->data_sinfo;
 	spin_lock(&data_sinfo->lock);
 	WARN_ON(data_sinfo->bytes_may_use < bytes);
-	btrfs_qgroup_free(root, bytes);
 	data_sinfo->bytes_may_use -= bytes;
 	trace_btrfs_space_reservation(root->fs_info, "space_info",
 				      data_sinfo->flags, bytes, 0);
@@ -5041,8 +5040,6 @@ void btrfs_subvolume_release_metadata(struct btrfs_root *root,
 				      u64 qgroup_reserved)
 {
 	btrfs_block_rsv_release(root, rsv, (u64)-1);
-	if (qgroup_reserved)
-		btrfs_qgroup_free(root, qgroup_reserved);
 }
 
 /**
@@ -5278,11 +5275,8 @@ out_fail:
 			to_free = 0;
 	}
 	spin_unlock(&BTRFS_I(inode)->lock);
-	if (dropped) {
-		if (root->fs_info->quota_enabled)
-			btrfs_qgroup_free(root, dropped * root->nodesize);
+	if (dropped)
 		to_free += btrfs_calc_trans_metadata_size(root, dropped);
-	}
 
 	if (to_free) {
 		btrfs_block_rsv_release(root, block_rsv, to_free);
@@ -5321,9 +5315,6 @@ void btrfs_delalloc_release_metadata(struct inode *inode, u64 num_bytes)
 
 	trace_btrfs_space_reservation(root->fs_info, "delalloc",
 				      btrfs_ino(inode), to_free, 0);
-	if (root->fs_info->quota_enabled) {
-		btrfs_qgroup_free(root, dropped * root->nodesize);
-	}
 
 	btrfs_block_rsv_release(root, &root->fs_info->delalloc_block_rsv,
 				to_free);
@@ -5348,7 +5339,7 @@ int btrfs_delalloc_reserve_space(struct inode *inode, u64 num_bytes)
 {
 	int ret;
 
-	ret = btrfs_check_data_free_space(inode, num_bytes);
+	ret = btrfs_check_data_free_space(inode, num_bytes, num_bytes);
 	if (ret)
 		return ret;
 
@@ -5514,12 +5505,8 @@ static int pin_down_extent(struct btrfs_root *root,
 
 	set_extent_dirty(root->fs_info->pinned_extents, bytenr,
 			 bytenr + num_bytes - 1, GFP_NOFS | __GFP_NOFAIL);
-	if (reserved) {
-		btrfs_qgroup_update_reserved_bytes(root->fs_info,
-						   root->root_key.objectid,
-						   num_bytes, -1);
+	if (reserved)
 		trace_btrfs_reserved_extent_free(root, bytenr, num_bytes);
-	}
 	return 0;
 }
 
@@ -6250,9 +6237,6 @@ void btrfs_free_tree_block(struct btrfs_trans_handle *trans,
 		btrfs_update_reserved_bytes(cache, buf->len, RESERVE_FREE, 0);
 		trace_btrfs_reserved_extent_free(root, buf->start, buf->len);
 		pin = 0;
-		btrfs_qgroup_update_reserved_bytes(root->fs_info,
-						   root->root_key.objectid,
-						   buf->len, -1);
 	}
 out:
 	if (pin)
@@ -6987,9 +6971,6 @@ static int __btrfs_free_reserved_extent(struct btrfs_root *root,
 	else {
 		btrfs_add_free_space(cache, start, len);
 		btrfs_update_reserved_bytes(cache, len, RESERVE_FREE, delalloc);
-		btrfs_qgroup_update_reserved_bytes(root->fs_info,
-						   root->root_key.objectid,
-						   len, -1);
 	}
 
 	btrfs_put_block_group(cache);
@@ -7227,9 +7208,6 @@ int btrfs_alloc_logged_file_extent(struct btrfs_trans_handle *trans,
 	BUG_ON(ret); /* logic error */
 	ret = alloc_reserved_file_extent(trans, root, 0, root_objectid,
 					 0, owner, offset, ins, 1);
-	btrfs_qgroup_update_reserved_bytes(root->fs_info,
-					   root->root_key.objectid,
-					   ins->offset, 1);
 	btrfs_put_block_group(block_group);
 	return ret;
 }
@@ -7376,10 +7354,6 @@ struct extent_buffer *btrfs_alloc_tree_block(struct btrfs_trans_handle *trans,
 		return ERR_PTR(ret);
 	}
 
-	btrfs_qgroup_update_reserved_bytes(root->fs_info,
-					   root_objectid,
-					   ins.offset, 1);
-
 	buf = btrfs_init_new_buffer(trans, root, ins.objectid,
 				    blocksize, level);
 	BUG_ON(IS_ERR(buf)); /* -ENOMEM */
diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c
index 0ab1333..4172144 100644
--- a/fs/btrfs/file.c
+++ b/fs/btrfs/file.c
@@ -1514,7 +1514,7 @@ static noinline ssize_t __btrfs_buffered_write(struct file *file,
 		}
 
 		reserve_bytes = num_pages << PAGE_CACHE_SHIFT;
-		ret = btrfs_check_data_free_space(inode, reserve_bytes);
+		ret = btrfs_check_data_free_space(inode, reserve_bytes, write_bytes);
 		if (ret == -ENOSPC &&
 		    (BTRFS_I(inode)->flags & (BTRFS_INODE_NODATACOW |
 					      BTRFS_INODE_PREALLOC))) {
@@ -2551,7 +2551,7 @@ static long btrfs_fallocate(struct file *file, int mode,
 	 * Make sure we have enough space before we do the
 	 * allocation.
 	 */
-	ret = btrfs_check_data_free_space(inode, alloc_end - alloc_start);
+	ret = btrfs_check_data_free_space(inode, alloc_end - alloc_start, alloc_end - alloc_start);
 	if (ret)
 		return ret;
 
diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index e350cd6..f666191 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -746,9 +746,6 @@ retry:
 			}
 			goto out_free;
 		}
-		btrfs_qgroup_update_reserved_bytes(root->fs_info,
-						   root->root_key.objectid,
-						   ins.offset, 1);
 		/*
 		 * here we're doing allocation and writeback of the
 		 * compressed pages
@@ -973,10 +970,6 @@ static noinline int cow_file_range(struct inode *inode,
 		if (ret < 0)
 			goto out_unlock;
 
-		btrfs_qgroup_update_reserved_bytes(root->fs_info,
-						   root->root_key.objectid,
-						   ins.offset, 1);
-
 		em = alloc_extent_map();
 		if (!em) {
 			ret = -ENOMEM;
@@ -6804,10 +6797,6 @@ static struct extent_map *btrfs_new_extent_direct(struct inode *inode,
 		return ERR_PTR(ret);
 	}
 
-	btrfs_qgroup_update_reserved_bytes(root->fs_info,
-					   root->root_key.objectid,
-					   ins.offset, 1);
-
 	return em;
 }
 
@@ -9333,10 +9322,6 @@ static int __btrfs_prealloc_file_range(struct inode *inode, int mode,
 			break;
 		}
 
-		btrfs_qgroup_update_reserved_bytes(root->fs_info,
-						   root->root_key.objectid,
-						   ins.offset, 1);
-
 		btrfs_drop_extent_cache(inode, cur_offset,
 					cur_offset + ins.offset -1, 0);
 
diff --git a/fs/btrfs/qgroup.c b/fs/btrfs/qgroup.c
index 0a86bb1..5a1463d 100644
--- a/fs/btrfs/qgroup.c
+++ b/fs/btrfs/qgroup.c
@@ -72,7 +72,6 @@ struct btrfs_qgroup {
 	/*
 	 * reservation tracking
 	 */
-	u64 may_use;
 	u64 reserved;
 
 	/*
@@ -2397,67 +2396,6 @@ out:
 	return ret;
 }
 
-int btrfs_qgroup_update_reserved_bytes(struct btrfs_fs_info *fs_info,
-					    u64 ref_root,
-					    u64 num_bytes,
-					    int sign)
-{
-	struct btrfs_root *quota_root;
-	struct btrfs_qgroup *qgroup;
-	int ret = 0;
-	struct ulist_node *unode;
-	struct ulist_iterator uiter;
-
-	if (!is_fstree(ref_root) || !fs_info->quota_enabled)
-		return 0;
-
-	if (num_bytes == 0)
-		return 0;
-
-	spin_lock(&fs_info->qgroup_lock);
-	quota_root = fs_info->quota_root;
-	if (!quota_root)
-		goto out;
-
-	qgroup = find_qgroup_rb(fs_info, ref_root);
-	if (!qgroup)
-		goto out;
-
-	ulist_reinit(fs_info->qgroup_ulist);
-	ret = ulist_add(fs_info->qgroup_ulist, qgroup->qgroupid,
-			(uintptr_t)qgroup, GFP_ATOMIC);
-	if (ret < 0)
-		goto out;
-
-	ULIST_ITER_INIT(&uiter);
-	while ((unode = ulist_next(fs_info->qgroup_ulist, &uiter))) {
-		struct btrfs_qgroup *qg;
-		struct btrfs_qgroup_list *glist;
-
-		qg = u64_to_ptr(unode->aux);
-
-		qg->reserved += sign * num_bytes;
-
-		list_for_each_entry(glist, &qg->groups, next_group) {
-			ret = ulist_add(fs_info->qgroup_ulist,
-					glist->group->qgroupid,
-					(uintptr_t)glist->group, GFP_ATOMIC);
-			if (ret < 0)
-				goto out;
-		}
-	}
-
-out:
-	spin_unlock(&fs_info->qgroup_lock);
-	return ret;
-}
-
-/*
- * reserve some space for a qgroup and all its parents. The reservation takes
- * place with start_transaction or dealloc_reserve, similar to ENOSPC
- * accounting. If not enough space is available, EDQUOT is returned.
- * We assume that the requested space is new for all qgroups.
- */
 int btrfs_qgroup_reserve(struct btrfs_root *root, u64 num_bytes)
 {
 	struct btrfs_root *quota_root;
@@ -2500,14 +2438,14 @@ int btrfs_qgroup_reserve(struct btrfs_root *root, u64 num_bytes)
 		qg = u64_to_ptr(unode->aux);
 
 		if ((qg->lim_flags & BTRFS_QGROUP_LIMIT_MAX_RFER) &&
-		    qg->reserved + qg->may_use + (s64)qg->rfer + num_bytes >
+		    qg->reserved + (s64)qg->rfer + num_bytes >
 		    qg->max_rfer) {
 			ret = -EDQUOT;
 			goto out;
 		}
 
 		if ((qg->lim_flags & BTRFS_QGROUP_LIMIT_MAX_EXCL) &&
-		    qg->reserved + qg->may_use + (s64)qg->excl + num_bytes >
+		    qg->reserved + (s64)qg->excl + num_bytes >
 		    qg->max_excl) {
 			ret = -EDQUOT;
 			goto out;
@@ -2531,7 +2469,7 @@ int btrfs_qgroup_reserve(struct btrfs_root *root, u64 num_bytes)
 
 		qg = u64_to_ptr(unode->aux);
 
-		qg->may_use += num_bytes;
+		qg->reserved += num_bytes;
 	}
 
 out:
@@ -2577,7 +2515,7 @@ void btrfs_qgroup_free(struct btrfs_root *root, u64 num_bytes)
 
 		qg = u64_to_ptr(unode->aux);
 
-		qg->may_use -= num_bytes;
+		qg->reserved -= num_bytes;
 
 		list_for_each_entry(glist, &qg->groups, next_group) {
 			ret = ulist_add(fs_info->qgroup_ulist,
diff --git a/fs/btrfs/qgroup.h b/fs/btrfs/qgroup.h
index 99f0487..52d8b19 100644
--- a/fs/btrfs/qgroup.h
+++ b/fs/btrfs/qgroup.h
@@ -97,10 +97,6 @@ int btrfs_run_qgroups(struct btrfs_trans_handle *trans,
 int btrfs_qgroup_inherit(struct btrfs_trans_handle *trans,
 			 struct btrfs_fs_info *fs_info, u64 srcid, u64 objectid,
 			 struct btrfs_qgroup_inherit *inherit);
-int btrfs_qgroup_update_reserved_bytes(struct btrfs_fs_info *fs_info,
-				       u64 ref_root,
-				       u64 num_bytes,
-				       int sign);
 int btrfs_qgroup_reserve(struct btrfs_root *root, u64 num_bytes);
 void btrfs_qgroup_free(struct btrfs_root *root, u64 num_bytes);
 
diff --git a/fs/btrfs/relocation.c b/fs/btrfs/relocation.c
index 74257d6..3b66cd7 100644
--- a/fs/btrfs/relocation.c
+++ b/fs/btrfs/relocation.c
@@ -3027,7 +3027,7 @@ int prealloc_file_extent_cluster(struct inode *inode,
 	mutex_lock(&inode->i_mutex);
 
 	ret = btrfs_check_data_free_space(inode, cluster->end +
-					  1 - cluster->start);
+					  1 - cluster->start, 0);
 	if (ret)
 		goto out;
 
-- 
1.8.4.2


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH 0/4] Btrfs: qgroup: part-2: bug fixes for qgroup reservation.
  2015-02-10 10:23 [PATCH 0/4] Btrfs: qgroup: part-2: bug fixes for qgroup reservation Dongsheng Yang
                   ` (3 preceding siblings ...)
  2015-02-10 10:23 ` [PATCH 4/4] btrfs: qgroup: do a reservation in a higher level Dongsheng Yang
@ 2015-03-04  1:47 ` Dongsheng Yang
  4 siblings, 0 replies; 6+ messages in thread
From: Dongsheng Yang @ 2015-03-04  1:47 UTC (permalink / raw)
  To: Josef Bacik; +Cc: linux-btrfs

On 02/10/2015 06:23 PM, Dongsheng Yang wrote:
> Hi all,
> 	This is part2 for qgroup. It is based on
> 	[PATCH 0/9] Btrfs: qgroup: part-1: bug fixes.
>
> [1/4] is reviewed-by Josef.
>
> [2/4] - [4/4] are trying to solve a problem in qgroup reservation.
> It would be weired I introduce a may_use in 2/4 and delete it in 4/4.
>
> Yes, I am glad to rebase and squash them, here I am keeping this history
> of the problem for reviewing. I have explained the problem and sent the
> [2/4] out. Josef give me a looks good at that time. But in the later, I
> found another problem and solved it with [4/4]. So they are trying to
> solve different problems at different timing. When the all patches are
> reviewed by you guys, I am okey to squash them if you request.

Hi, any comment on this patchset?

Thanx
>
> Thanx.
>
> Dongsheng Yang (4):
>    Btrfs: qgroup: free reserved in exceeding quota.
>    Btrfs: qgroup: Introduce a may_use to account
>      space_info->bytes_may_use.
>    Btrfs: qgroup, Account data space in more proper timings.
>    btrfs: qgroup: do a reservation in a higher level.
>
>   fs/btrfs/ctree.h       |  2 +-
>   fs/btrfs/extent-tree.c | 25 +++++++++++--------------
>   fs/btrfs/file.c        | 13 ++-----------
>   fs/btrfs/inode.c       |  3 ++-
>   fs/btrfs/qgroup.c      | 10 ++++------
>   fs/btrfs/relocation.c  |  2 +-
>   6 files changed, 21 insertions(+), 34 deletions(-)
>


^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2015-03-04  1:50 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-02-10 10:23 [PATCH 0/4] Btrfs: qgroup: part-2: bug fixes for qgroup reservation Dongsheng Yang
2015-02-10 10:23 ` [PATCH 1/4] Btrfs: qgroup: free reserved in exceeding quota Dongsheng Yang
2015-02-10 10:23 ` [PATCH 2/4] Btrfs: qgroup: Introduce a may_use to account space_info->bytes_may_use Dongsheng Yang
2015-02-10 10:23 ` [PATCH 3/4] Btrfs: qgroup, Account data space in more proper timings Dongsheng Yang
2015-02-10 10:23 ` [PATCH 4/4] btrfs: qgroup: do a reservation in a higher level Dongsheng Yang
2015-03-04  1:47 ` [PATCH 0/4] Btrfs: qgroup: part-2: bug fixes for qgroup reservation Dongsheng Yang

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.