All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/7] Btrfs: improve the latency of the space reservation
@ 2013-11-04 15:13 Miao Xie
  2013-11-04 15:13 ` [PATCH 1/7] Btrfs: remove unnecessary initialization and memory barrior in shrink_delalloc() Miao Xie
                   ` (7 more replies)
  0 siblings, 8 replies; 14+ messages in thread
From: Miao Xie @ 2013-11-04 15:13 UTC (permalink / raw)
  To: linux-btrfs

This patchset improve the latency of the space reservation when doing the
delalloc inode flush. As we know, the current code would wait for
the completion all the ordered extents in the filesystem. It was unnecessary,
we can finish the wait and try to do reservation once again earlier if we
get enough free space. It can reduce the wait time and make the perfermance
up.

We did the buffered write test by the sysbench on my box, the following
is the result.
 Memory:	2GB
 CPU:		2Cores * 1CPU
 Partition:	20GB(SSD)
				w/		w/o
 rndwr-512KB-1Thread-2GB	120.38MB/s	110.08MB/s
 rndwr-512KB-8Threads-2GB	119.43MB/s	110.96MB/s
 seqwr-512KB-1Thread-2GB	99.676MB/s	98.53MB/s
 seqwr-512KB-8Threads-2GB	111.79MB/s	99.176MB/s

 rndwr-128KB-1Thread-2GB	139.23MB/s	95.864MB/s
 rndwr-128KB-8Threads-2GB	126.16MB/s	96.686MB/s
 seqwr-128KB-1Thread-2GB	100.24MB/s	100.95.MB/s
 seqwr-128KB-8Threads-2GB	126.51MB/s	100.26MB/s

rndwr: random write test
seqwr: sequential write test
512KB,128KB: the size of each request
1Thread, 8Threads: the number of the test threads
2GB: The total file size

Miao Xie (7):
  Btrfs: remove unnecessary initialization and memory barrior in shrink_delalloc()
  Btrfs: wait for the ordered extent only when we want
  Btrfs: pick up the code for the item number calculation in flush_space()
  Btrfs: fix the confusion between delalloc bytes and metadata bytes
  Btrfs: don't wait for all the async delalloc when shrinking delalloc
  Btrfs: don't wait for the completion of all the ordered extents
  Btrfs: rename btrfs_start_all_delalloc_inodes

 fs/btrfs/ctree.h        |  3 +--
 fs/btrfs/dev-replace.c  |  6 ++---
 fs/btrfs/extent-tree.c  | 62 ++++++++++++++++++++++++++++++++++---------------
 fs/btrfs/inode.c        |  3 +--
 fs/btrfs/ioctl.c        |  2 +-
 fs/btrfs/ordered-data.c | 22 ++++++++++++++----
 fs/btrfs/ordered-data.h |  4 ++--
 fs/btrfs/relocation.c   |  4 ++--
 fs/btrfs/super.c        |  2 +-
 fs/btrfs/transaction.c  |  4 ++--
 10 files changed, 73 insertions(+), 39 deletions(-)

-- 
1.8.1.4


^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH 1/7] Btrfs: remove unnecessary initialization and memory barrior in shrink_delalloc()
  2013-11-04 15:13 [PATCH 0/7] Btrfs: improve the latency of the space reservation Miao Xie
@ 2013-11-04 15:13 ` Miao Xie
  2013-11-05 11:45   ` David Sterba
  2013-11-04 15:13 ` [PATCH 2/7] Btrfs: wait for the ordered extent only when we want Miao Xie
                   ` (6 subsequent siblings)
  7 siblings, 1 reply; 14+ messages in thread
From: Miao Xie @ 2013-11-04 15:13 UTC (permalink / raw)
  To: linux-btrfs

Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
---
 fs/btrfs/extent-tree.c | 7 +++----
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
index 1d238ed..abe65ed 100644
--- a/fs/btrfs/extent-tree.c
+++ b/fs/btrfs/extent-tree.c
@@ -4016,15 +4016,14 @@ static void shrink_delalloc(struct btrfs_root *root, u64 to_reclaim, u64 orig,
 	u64 delalloc_bytes;
 	u64 max_reclaim;
 	long time_left;
-	unsigned long nr_pages = (2 * 1024 * 1024) >> PAGE_CACHE_SHIFT;
-	int loops = 0;
+	unsigned long nr_pages;
+	int loops;
 	enum btrfs_reserve_flush_enum flush;
 
 	trans = (struct btrfs_trans_handle *)current->journal_info;
 	block_rsv = &root->fs_info->delalloc_block_rsv;
 	space_info = block_rsv->space_info;
 
-	smp_mb();
 	delalloc_bytes = percpu_counter_sum_positive(
 						&root->fs_info->delalloc_bytes);
 	if (delalloc_bytes == 0) {
@@ -4034,6 +4033,7 @@ static void shrink_delalloc(struct btrfs_root *root, u64 to_reclaim, u64 orig,
 		return;
 	}
 
+	loops = 0;
 	while (delalloc_bytes && loops < 3) {
 		max_reclaim = min(delalloc_bytes, to_reclaim);
 		nr_pages = max_reclaim >> PAGE_CACHE_SHIFT;
@@ -4064,7 +4064,6 @@ static void shrink_delalloc(struct btrfs_root *root, u64 to_reclaim, u64 orig,
 			if (time_left)
 				break;
 		}
-		smp_mb();
 		delalloc_bytes = percpu_counter_sum_positive(
 						&root->fs_info->delalloc_bytes);
 	}
-- 
1.8.1.4


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH 2/7] Btrfs: wait for the ordered extent only when we want
  2013-11-04 15:13 [PATCH 0/7] Btrfs: improve the latency of the space reservation Miao Xie
  2013-11-04 15:13 ` [PATCH 1/7] Btrfs: remove unnecessary initialization and memory barrior in shrink_delalloc() Miao Xie
@ 2013-11-04 15:13 ` Miao Xie
  2013-11-04 15:13 ` [PATCH 3/7] Btrfs: pick up the code for the item number calculation in flush_space() Miao Xie
                   ` (5 subsequent siblings)
  7 siblings, 0 replies; 14+ messages in thread
From: Miao Xie @ 2013-11-04 15:13 UTC (permalink / raw)
  To: linux-btrfs

Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
---
 fs/btrfs/extent-tree.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
index abe65ed..ed6eceb 100644
--- a/fs/btrfs/extent-tree.c
+++ b/fs/btrfs/extent-tree.c
@@ -4029,7 +4029,8 @@ static void shrink_delalloc(struct btrfs_root *root, u64 to_reclaim, u64 orig,
 	if (delalloc_bytes == 0) {
 		if (trans)
 			return;
-		btrfs_wait_all_ordered_extents(root->fs_info);
+		if (wait_ordered)
+			btrfs_wait_all_ordered_extents(root->fs_info);
 		return;
 	}
 
-- 
1.8.1.4


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH 3/7] Btrfs: pick up the code for the item number calculation in flush_space()
  2013-11-04 15:13 [PATCH 0/7] Btrfs: improve the latency of the space reservation Miao Xie
  2013-11-04 15:13 ` [PATCH 1/7] Btrfs: remove unnecessary initialization and memory barrior in shrink_delalloc() Miao Xie
  2013-11-04 15:13 ` [PATCH 2/7] Btrfs: wait for the ordered extent only when we want Miao Xie
@ 2013-11-04 15:13 ` Miao Xie
  2013-11-04 15:13 ` [PATCH 4/7] Btrfs: fix the confusion between delalloc bytes and metadata bytes Miao Xie
                   ` (4 subsequent siblings)
  7 siblings, 0 replies; 14+ messages in thread
From: Miao Xie @ 2013-11-04 15:13 UTC (permalink / raw)
  To: linux-btrfs

This patch picked up the code that was used to calculate the number of
the items for which we need reserve space, and we will use it in the next
patch.

Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
---
 fs/btrfs/extent-tree.c | 25 ++++++++++++++++---------
 1 file changed, 16 insertions(+), 9 deletions(-)

diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
index ed6eceb..32dcf80 100644
--- a/fs/btrfs/extent-tree.c
+++ b/fs/btrfs/extent-tree.c
@@ -4004,6 +4004,18 @@ static void btrfs_writeback_inodes_sb_nr(struct btrfs_root *root,
 	}
 }
 
+static inline int calc_reclaim_items_nr(struct btrfs_root *root, u64 to_reclaim)
+{
+	u64 bytes;
+	int nr;
+
+	bytes = btrfs_calc_trans_metadata_size(root, 1);
+	nr = (int)div64_u64(to_reclaim, bytes);
+	if (!nr)
+		nr = 1;
+	return nr;
+}
+
 /*
  * shrink metadata reservation for delalloc
  */
@@ -4149,16 +4161,11 @@ static int flush_space(struct btrfs_root *root,
 	switch (state) {
 	case FLUSH_DELAYED_ITEMS_NR:
 	case FLUSH_DELAYED_ITEMS:
-		if (state == FLUSH_DELAYED_ITEMS_NR) {
-			u64 bytes = btrfs_calc_trans_metadata_size(root, 1);
-
-			nr = (int)div64_u64(num_bytes, bytes);
-			if (!nr)
-				nr = 1;
-			nr *= 2;
-		} else {
+		if (state == FLUSH_DELAYED_ITEMS_NR)
+			nr = calc_reclaim_items_nr(root, num_bytes) * 2;
+		else
 			nr = -1;
-		}
+
 		trans = btrfs_join_transaction(root);
 		if (IS_ERR(trans)) {
 			ret = PTR_ERR(trans);
-- 
1.8.1.4


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH 4/7] Btrfs: fix the confusion between delalloc bytes and metadata bytes
  2013-11-04 15:13 [PATCH 0/7] Btrfs: improve the latency of the space reservation Miao Xie
                   ` (2 preceding siblings ...)
  2013-11-04 15:13 ` [PATCH 3/7] Btrfs: pick up the code for the item number calculation in flush_space() Miao Xie
@ 2013-11-04 15:13 ` Miao Xie
  2013-11-04 15:13 ` [PATCH 5/7] Btrfs: don't wait for all the async delalloc when shrinking delalloc Miao Xie
                   ` (3 subsequent siblings)
  7 siblings, 0 replies; 14+ messages in thread
From: Miao Xie @ 2013-11-04 15:13 UTC (permalink / raw)
  To: linux-btrfs

In shrink_delalloc(), what we need reclaim is the metadata space, so
flushing pages by to_reclaim is not reasonable, it is very likely that
the pages we flush are not enough. And then we had to invoke the flush
function for several times, at the worst, we need call flush_space for
several times. It wasted time.

We improve this problem by converting the metadata space size we need
reserve to the delalloc bytes, By this way, we can flush the pages
by a reasonable number.

(Now we use a fixed number to do conversion, it is not flexible, maybe
 we can find a good way to improve it in the future.)

Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
---
 fs/btrfs/extent-tree.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
index 32dcf80..c48fde1 100644
--- a/fs/btrfs/extent-tree.c
+++ b/fs/btrfs/extent-tree.c
@@ -4016,6 +4016,8 @@ static inline int calc_reclaim_items_nr(struct btrfs_root *root, u64 to_reclaim)
 	return nr;
 }
 
+#define EXTENT_SIZE_PER_ITEM	(256 * 1024)
+
 /*
  * shrink metadata reservation for delalloc
  */
@@ -4032,6 +4034,10 @@ static void shrink_delalloc(struct btrfs_root *root, u64 to_reclaim, u64 orig,
 	int loops;
 	enum btrfs_reserve_flush_enum flush;
 
+	/* Calc the number of the pages we need flush for space reservation */
+	to_reclaim = calc_reclaim_items_nr(root, to_reclaim);
+	to_reclaim *= EXTENT_SIZE_PER_ITEM;
+
 	trans = (struct btrfs_trans_handle *)current->journal_info;
 	block_rsv = &root->fs_info->delalloc_block_rsv;
 	space_info = block_rsv->space_info;
-- 
1.8.1.4


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH 5/7] Btrfs: don't wait for all the async delalloc when shrinking delalloc
  2013-11-04 15:13 [PATCH 0/7] Btrfs: improve the latency of the space reservation Miao Xie
                   ` (3 preceding siblings ...)
  2013-11-04 15:13 ` [PATCH 4/7] Btrfs: fix the confusion between delalloc bytes and metadata bytes Miao Xie
@ 2013-11-04 15:13 ` Miao Xie
  2013-11-04 15:13 ` [PATCH 6/7] Btrfs: don't wait for the completion of all the ordered extents Miao Xie
                   ` (2 subsequent siblings)
  7 siblings, 0 replies; 14+ messages in thread
From: Miao Xie @ 2013-11-04 15:13 UTC (permalink / raw)
  To: linux-btrfs

It was very likely that there were lots of async delalloc pages in the
filesystem, if we waited until all the pages were flushed, we would be
blocked for a long time, and the performance would also drop down.

Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
---
 fs/btrfs/extent-tree.c | 14 ++++++++++++--
 1 file changed, 12 insertions(+), 2 deletions(-)

diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
index c48fde1..d691e15 100644
--- a/fs/btrfs/extent-tree.c
+++ b/fs/btrfs/extent-tree.c
@@ -4061,9 +4061,19 @@ static void shrink_delalloc(struct btrfs_root *root, u64 to_reclaim, u64 orig,
 		 * We need to wait for the async pages to actually start before
 		 * we do anything.
 		 */
-		wait_event(root->fs_info->async_submit_wait,
-			   !atomic_read(&root->fs_info->async_delalloc_pages));
+		max_reclaim = atomic_read(&root->fs_info->async_delalloc_pages);
+		if (!max_reclaim)
+			goto skip_async;
+
+		if (max_reclaim <= nr_pages)
+			max_reclaim = 0;
+		else
+			max_reclaim -= nr_pages;
 
+		wait_event(root->fs_info->async_submit_wait,
+			   atomic_read(&root->fs_info->async_delalloc_pages) <=
+			   (int)max_reclaim);
+skip_async:
 		if (!trans)
 			flush = BTRFS_RESERVE_FLUSH_ALL;
 		else
-- 
1.8.1.4


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH 6/7] Btrfs: don't wait for the completion of all the ordered extents
  2013-11-04 15:13 [PATCH 0/7] Btrfs: improve the latency of the space reservation Miao Xie
                   ` (4 preceding siblings ...)
  2013-11-04 15:13 ` [PATCH 5/7] Btrfs: don't wait for all the async delalloc when shrinking delalloc Miao Xie
@ 2013-11-04 15:13 ` Miao Xie
  2013-11-04 15:13 ` [PATCH 7/7] Btrfs: rename btrfs_start_all_delalloc_inodes Miao Xie
  2013-11-04 17:05 ` [PATCH 0/7] Btrfs: improve the latency of the space reservation Josef Bacik
  7 siblings, 0 replies; 14+ messages in thread
From: Miao Xie @ 2013-11-04 15:13 UTC (permalink / raw)
  To: linux-btrfs

It is very likely that there are lots of ordered extents in the filesytem,
if we wait for the completion of all of them when we want to reclaim some
space for the metadata space reservation, we would be blocked for a long
time. The performance would drop down suddenly for a long time.

Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
---
 fs/btrfs/dev-replace.c  |  4 ++--
 fs/btrfs/extent-tree.c  | 11 ++++++-----
 fs/btrfs/ioctl.c        |  2 +-
 fs/btrfs/ordered-data.c | 22 +++++++++++++++++-----
 fs/btrfs/ordered-data.h |  4 ++--
 fs/btrfs/relocation.c   |  2 +-
 fs/btrfs/super.c        |  2 +-
 fs/btrfs/transaction.c  |  2 +-
 8 files changed, 31 insertions(+), 18 deletions(-)

diff --git a/fs/btrfs/dev-replace.c b/fs/btrfs/dev-replace.c
index 2a9bd5b..bc55b36 100644
--- a/fs/btrfs/dev-replace.c
+++ b/fs/btrfs/dev-replace.c
@@ -400,7 +400,7 @@ int btrfs_dev_replace_start(struct btrfs_root *root,
 	args->result = BTRFS_IOCTL_DEV_REPLACE_RESULT_NO_ERROR;
 	btrfs_dev_replace_unlock(dev_replace);
 
-	btrfs_wait_all_ordered_extents(root->fs_info);
+	btrfs_wait_ordered_roots(root->fs_info, -1);
 
 	/* force writing the updated state information to disk */
 	trans = btrfs_start_transaction(root, 0);
@@ -475,7 +475,7 @@ static int btrfs_dev_replace_finishing(struct btrfs_fs_info *fs_info,
 		mutex_unlock(&dev_replace->lock_finishing_cancel_unmount);
 		return ret;
 	}
-	btrfs_wait_all_ordered_extents(root->fs_info);
+	btrfs_wait_ordered_roots(root->fs_info, -1);
 
 	trans = btrfs_start_transaction(root, 0);
 	if (IS_ERR(trans)) {
diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
index d691e15..410d972 100644
--- a/fs/btrfs/extent-tree.c
+++ b/fs/btrfs/extent-tree.c
@@ -4000,7 +4000,7 @@ static void btrfs_writeback_inodes_sb_nr(struct btrfs_root *root,
 		 */
 		btrfs_start_all_delalloc_inodes(root->fs_info, 0);
 		if (!current->journal_info)
-			btrfs_wait_all_ordered_extents(root->fs_info);
+			btrfs_wait_ordered_roots(root->fs_info, -1);
 	}
 }
 
@@ -4032,11 +4032,12 @@ static void shrink_delalloc(struct btrfs_root *root, u64 to_reclaim, u64 orig,
 	long time_left;
 	unsigned long nr_pages;
 	int loops;
+	int items;
 	enum btrfs_reserve_flush_enum flush;
 
 	/* Calc the number of the pages we need flush for space reservation */
-	to_reclaim = calc_reclaim_items_nr(root, to_reclaim);
-	to_reclaim *= EXTENT_SIZE_PER_ITEM;
+	items = calc_reclaim_items_nr(root, to_reclaim);
+	to_reclaim = items * EXTENT_SIZE_PER_ITEM;
 
 	trans = (struct btrfs_trans_handle *)current->journal_info;
 	block_rsv = &root->fs_info->delalloc_block_rsv;
@@ -4048,7 +4049,7 @@ static void shrink_delalloc(struct btrfs_root *root, u64 to_reclaim, u64 orig,
 		if (trans)
 			return;
 		if (wait_ordered)
-			btrfs_wait_all_ordered_extents(root->fs_info);
+			btrfs_wait_ordered_roots(root->fs_info, items);
 		return;
 	}
 
@@ -4087,7 +4088,7 @@ skip_async:
 
 		loops++;
 		if (wait_ordered && !trans) {
-			btrfs_wait_all_ordered_extents(root->fs_info);
+			btrfs_wait_ordered_roots(root->fs_info, items);
 		} else {
 			time_left = schedule_timeout_killable(1);
 			if (time_left)
diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
index 9d46f60..524692d 100644
--- a/fs/btrfs/ioctl.c
+++ b/fs/btrfs/ioctl.c
@@ -574,7 +574,7 @@ static int create_snapshot(struct btrfs_root *root, struct inode *dir,
 	if (ret)
 		return ret;
 
-	btrfs_wait_ordered_extents(root);
+	btrfs_wait_ordered_extents(root, -1);
 
 	pending_snapshot = kzalloc(sizeof(*pending_snapshot), GFP_NOFS);
 	if (!pending_snapshot)
diff --git a/fs/btrfs/ordered-data.c b/fs/btrfs/ordered-data.c
index c702cb6..97efc12 100644
--- a/fs/btrfs/ordered-data.c
+++ b/fs/btrfs/ordered-data.c
@@ -563,10 +563,11 @@ static void btrfs_run_ordered_extent_work(struct btrfs_work *work)
  * wait for all the ordered extents in a root.  This is done when balancing
  * space between drives.
  */
-void btrfs_wait_ordered_extents(struct btrfs_root *root)
+int btrfs_wait_ordered_extents(struct btrfs_root *root, int nr)
 {
 	struct list_head splice, works;
 	struct btrfs_ordered_extent *ordered, *next;
+	int count = 0;
 
 	INIT_LIST_HEAD(&splice);
 	INIT_LIST_HEAD(&works);
@@ -574,7 +575,7 @@ void btrfs_wait_ordered_extents(struct btrfs_root *root)
 	mutex_lock(&root->fs_info->ordered_operations_mutex);
 	spin_lock(&root->ordered_extent_lock);
 	list_splice_init(&root->ordered_extents, &splice);
-	while (!list_empty(&splice)) {
+	while (!list_empty(&splice) && nr) {
 		ordered = list_first_entry(&splice, struct btrfs_ordered_extent,
 					   root_extent_list);
 		list_move_tail(&ordered->root_extent_list,
@@ -589,7 +590,11 @@ void btrfs_wait_ordered_extents(struct btrfs_root *root)
 
 		cond_resched();
 		spin_lock(&root->ordered_extent_lock);
+		if (nr != -1)
+			nr--;
+		count++;
 	}
+	list_splice_tail(&splice, &root->ordered_extents);
 	spin_unlock(&root->ordered_extent_lock);
 
 	list_for_each_entry_safe(ordered, next, &works, work_list) {
@@ -599,18 +604,21 @@ void btrfs_wait_ordered_extents(struct btrfs_root *root)
 		cond_resched();
 	}
 	mutex_unlock(&root->fs_info->ordered_operations_mutex);
+
+	return count;
 }
 
-void btrfs_wait_all_ordered_extents(struct btrfs_fs_info *fs_info)
+void btrfs_wait_ordered_roots(struct btrfs_fs_info *fs_info, int nr)
 {
 	struct btrfs_root *root;
 	struct list_head splice;
+	int done;
 
 	INIT_LIST_HEAD(&splice);
 
 	spin_lock(&fs_info->ordered_root_lock);
 	list_splice_init(&fs_info->ordered_roots, &splice);
-	while (!list_empty(&splice)) {
+	while (!list_empty(&splice) && nr) {
 		root = list_first_entry(&splice, struct btrfs_root,
 					ordered_root);
 		root = btrfs_grab_fs_root(root);
@@ -619,10 +627,14 @@ void btrfs_wait_all_ordered_extents(struct btrfs_fs_info *fs_info)
 			       &fs_info->ordered_roots);
 		spin_unlock(&fs_info->ordered_root_lock);
 
-		btrfs_wait_ordered_extents(root);
+		done = btrfs_wait_ordered_extents(root, nr);
 		btrfs_put_fs_root(root);
 
 		spin_lock(&fs_info->ordered_root_lock);
+		if (nr != -1) {
+			nr -= done;
+			WARN_ON(nr < 0);
+		}
 	}
 	spin_unlock(&fs_info->ordered_root_lock);
 }
diff --git a/fs/btrfs/ordered-data.h b/fs/btrfs/ordered-data.h
index 0c0b356..296f15d 100644
--- a/fs/btrfs/ordered-data.h
+++ b/fs/btrfs/ordered-data.h
@@ -195,8 +195,8 @@ int btrfs_run_ordered_operations(struct btrfs_trans_handle *trans,
 void btrfs_add_ordered_operation(struct btrfs_trans_handle *trans,
 				 struct btrfs_root *root,
 				 struct inode *inode);
-void btrfs_wait_ordered_extents(struct btrfs_root *root);
-void btrfs_wait_all_ordered_extents(struct btrfs_fs_info *fs_info);
+int btrfs_wait_ordered_extents(struct btrfs_root *root, int nr);
+void btrfs_wait_ordered_roots(struct btrfs_fs_info *fs_info, int nr);
 void btrfs_get_logged_extents(struct btrfs_root *log, struct inode *inode);
 void btrfs_wait_logged_extents(struct btrfs_root *log, u64 transid);
 void btrfs_free_logged_extents(struct btrfs_root *log, u64 transid);
diff --git a/fs/btrfs/relocation.c b/fs/btrfs/relocation.c
index 4a35572..0cec204 100644
--- a/fs/btrfs/relocation.c
+++ b/fs/btrfs/relocation.c
@@ -4246,7 +4246,7 @@ int btrfs_relocate_block_group(struct btrfs_root *extent_root, u64 group_start)
 		err = ret;
 		goto out;
 	}
-	btrfs_wait_all_ordered_extents(fs_info);
+	btrfs_wait_ordered_roots(fs_info, -1);
 
 	while (1) {
 		mutex_lock(&fs_info->cleaner_mutex);
diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c
index e913328..217ccb9 100644
--- a/fs/btrfs/super.c
+++ b/fs/btrfs/super.c
@@ -921,7 +921,7 @@ int btrfs_sync_fs(struct super_block *sb, int wait)
 		return 0;
 	}
 
-	btrfs_wait_all_ordered_extents(fs_info);
+	btrfs_wait_ordered_roots(fs_info, -1);
 
 	trans = btrfs_attach_transaction_barrier(root);
 	if (IS_ERR(trans)) {
diff --git a/fs/btrfs/transaction.c b/fs/btrfs/transaction.c
index 8c81bdc..0fada5b 100644
--- a/fs/btrfs/transaction.c
+++ b/fs/btrfs/transaction.c
@@ -1603,7 +1603,7 @@ static inline int btrfs_start_delalloc_flush(struct btrfs_fs_info *fs_info)
 static inline void btrfs_wait_delalloc_flush(struct btrfs_fs_info *fs_info)
 {
 	if (btrfs_test_opt(fs_info->tree_root, FLUSHONCOMMIT))
-		btrfs_wait_all_ordered_extents(fs_info);
+		btrfs_wait_ordered_roots(fs_info, -1);
 }
 
 int btrfs_commit_transaction(struct btrfs_trans_handle *trans,
-- 
1.8.1.4


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH 7/7] Btrfs: rename btrfs_start_all_delalloc_inodes
  2013-11-04 15:13 [PATCH 0/7] Btrfs: improve the latency of the space reservation Miao Xie
                   ` (5 preceding siblings ...)
  2013-11-04 15:13 ` [PATCH 6/7] Btrfs: don't wait for the completion of all the ordered extents Miao Xie
@ 2013-11-04 15:13 ` Miao Xie
  2013-11-04 17:09   ` Josef Bacik
  2013-11-04 17:05 ` [PATCH 0/7] Btrfs: improve the latency of the space reservation Josef Bacik
  7 siblings, 1 reply; 14+ messages in thread
From: Miao Xie @ 2013-11-04 15:13 UTC (permalink / raw)
  To: linux-btrfs

rename the function -- btrfs_start_all_delalloc_inodes(), and make its
name be compatible to btrfs_wait_ordered_roots(), since they are always
used at the same place.

Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
---
 fs/btrfs/ctree.h       | 3 +--
 fs/btrfs/dev-replace.c | 2 +-
 fs/btrfs/extent-tree.c | 2 +-
 fs/btrfs/inode.c       | 3 +--
 fs/btrfs/relocation.c  | 2 +-
 fs/btrfs/transaction.c | 2 +-
 6 files changed, 6 insertions(+), 8 deletions(-)

diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
index 0506f40..1fbfc21 100644
--- a/fs/btrfs/ctree.h
+++ b/fs/btrfs/ctree.h
@@ -3675,8 +3675,7 @@ int btrfs_truncate_inode_items(struct btrfs_trans_handle *trans,
 			       u32 min_type);
 
 int btrfs_start_delalloc_inodes(struct btrfs_root *root, int delay_iput);
-int btrfs_start_all_delalloc_inodes(struct btrfs_fs_info *fs_info,
-				    int delay_iput);
+int btrfs_start_delalloc_roots(struct btrfs_fs_info *fs_info, int delay_iput);
 int btrfs_set_extent_delalloc(struct inode *inode, u64 start, u64 end,
 			      struct extent_state **cached_state);
 int btrfs_create_subvol_root(struct btrfs_trans_handle *trans,
diff --git a/fs/btrfs/dev-replace.c b/fs/btrfs/dev-replace.c
index bc55b36..7436e08 100644
--- a/fs/btrfs/dev-replace.c
+++ b/fs/btrfs/dev-replace.c
@@ -470,7 +470,7 @@ static int btrfs_dev_replace_finishing(struct btrfs_fs_info *fs_info,
 	 * flush all outstanding I/O and inode extent mappings before the
 	 * copy operation is declared as being finished
 	 */
-	ret = btrfs_start_all_delalloc_inodes(root->fs_info, 0);
+	ret = btrfs_start_delalloc_roots(root->fs_info, 0);
 	if (ret) {
 		mutex_unlock(&dev_replace->lock_finishing_cancel_unmount);
 		return ret;
diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
index 410d972..0aad9ed 100644
--- a/fs/btrfs/extent-tree.c
+++ b/fs/btrfs/extent-tree.c
@@ -3998,7 +3998,7 @@ static void btrfs_writeback_inodes_sb_nr(struct btrfs_root *root,
 		 * the filesystem is readonly(all dirty pages are written to
 		 * the disk).
 		 */
-		btrfs_start_all_delalloc_inodes(root->fs_info, 0);
+		btrfs_start_delalloc_roots(root->fs_info, 0);
 		if (!current->journal_info)
 			btrfs_wait_ordered_roots(root->fs_info, -1);
 	}
diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index f4a6851..da618c4 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -8276,8 +8276,7 @@ int btrfs_start_delalloc_inodes(struct btrfs_root *root, int delay_iput)
 	return ret;
 }
 
-int btrfs_start_all_delalloc_inodes(struct btrfs_fs_info *fs_info,
-				    int delay_iput)
+int btrfs_start_delalloc_roots(struct btrfs_fs_info *fs_info, int delay_iput)
 {
 	struct btrfs_root *root;
 	struct list_head splice;
diff --git a/fs/btrfs/relocation.c b/fs/btrfs/relocation.c
index 0cec204..af41487 100644
--- a/fs/btrfs/relocation.c
+++ b/fs/btrfs/relocation.c
@@ -4241,7 +4241,7 @@ int btrfs_relocate_block_group(struct btrfs_root *extent_root, u64 group_start)
 	printk(KERN_INFO "btrfs: relocating block group %llu flags %llu\n",
 	       rc->block_group->key.objectid, rc->block_group->flags);
 
-	ret = btrfs_start_all_delalloc_inodes(fs_info, 0);
+	ret = btrfs_start_delalloc_roots(fs_info, 0);
 	if (ret < 0) {
 		err = ret;
 		goto out;
diff --git a/fs/btrfs/transaction.c b/fs/btrfs/transaction.c
index 0fada5b..840e672 100644
--- a/fs/btrfs/transaction.c
+++ b/fs/btrfs/transaction.c
@@ -1596,7 +1596,7 @@ static int btrfs_flush_all_pending_stuffs(struct btrfs_trans_handle *trans,
 static inline int btrfs_start_delalloc_flush(struct btrfs_fs_info *fs_info)
 {
 	if (btrfs_test_opt(fs_info->tree_root, FLUSHONCOMMIT))
-		return btrfs_start_all_delalloc_inodes(fs_info, 1);
+		return btrfs_start_delalloc_roots(fs_info, 1);
 	return 0;
 }
 
-- 
1.8.1.4


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [PATCH 0/7] Btrfs: improve the latency of the space reservation
  2013-11-04 15:13 [PATCH 0/7] Btrfs: improve the latency of the space reservation Miao Xie
                   ` (6 preceding siblings ...)
  2013-11-04 15:13 ` [PATCH 7/7] Btrfs: rename btrfs_start_all_delalloc_inodes Miao Xie
@ 2013-11-04 17:05 ` Josef Bacik
  7 siblings, 0 replies; 14+ messages in thread
From: Josef Bacik @ 2013-11-04 17:05 UTC (permalink / raw)
  To: Miao Xie; +Cc: linux-btrfs

On Mon, Nov 04, 2013 at 11:13:19PM +0800, Miao Xie wrote:
> This patchset improve the latency of the space reservation when doing the
> delalloc inode flush. As we know, the current code would wait for
> the completion all the ordered extents in the filesystem. It was unnecessary,
> we can finish the wait and try to do reservation once again earlier if we
> get enough free space. It can reduce the wait time and make the perfermance
> up.
> 
> We did the buffered write test by the sysbench on my box, the following
> is the result.
>  Memory:	2GB
>  CPU:		2Cores * 1CPU
>  Partition:	20GB(SSD)
> 				w/		w/o
>  rndwr-512KB-1Thread-2GB	120.38MB/s	110.08MB/s
>  rndwr-512KB-8Threads-2GB	119.43MB/s	110.96MB/s
>  seqwr-512KB-1Thread-2GB	99.676MB/s	98.53MB/s
>  seqwr-512KB-8Threads-2GB	111.79MB/s	99.176MB/s
> 
>  rndwr-128KB-1Thread-2GB	139.23MB/s	95.864MB/s
>  rndwr-128KB-8Threads-2GB	126.16MB/s	96.686MB/s
>  seqwr-128KB-1Thread-2GB	100.24MB/s	100.95.MB/s
>  seqwr-128KB-8Threads-2GB	126.51MB/s	100.26MB/s
> 
> rndwr: random write test
> seqwr: sequential write test
> 512KB,128KB: the size of each request
> 1Thread, 8Threads: the number of the test threads
> 2GB: The total file size
> 
> Miao Xie (7):
>   Btrfs: remove unnecessary initialization and memory barrior in shrink_delalloc()
>   Btrfs: wait for the ordered extent only when we want
>   Btrfs: pick up the code for the item number calculation in flush_space()
>   Btrfs: fix the confusion between delalloc bytes and metadata bytes
>   Btrfs: don't wait for all the async delalloc when shrinking delalloc
>   Btrfs: don't wait for the completion of all the ordered extents
>   Btrfs: rename btrfs_start_all_delalloc_inodes
> 

Thank you for this, I've been wanting to do this for months but I keep getting
sucked into bug hell, remind me to kiss you at LSF.

Josef

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 7/7] Btrfs: rename btrfs_start_all_delalloc_inodes
  2013-11-04 15:13 ` [PATCH 7/7] Btrfs: rename btrfs_start_all_delalloc_inodes Miao Xie
@ 2013-11-04 17:09   ` Josef Bacik
  2013-11-05  3:07     ` Miao Xie
  0 siblings, 1 reply; 14+ messages in thread
From: Josef Bacik @ 2013-11-04 17:09 UTC (permalink / raw)
  To: Miao Xie; +Cc: linux-btrfs

On Mon, Nov 04, 2013 at 11:13:26PM +0800, Miao Xie wrote:
> rename the function -- btrfs_start_all_delalloc_inodes(), and make its
> name be compatible to btrfs_wait_ordered_roots(), since they are always
> used at the same place.
> 
> Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
> ---
>  fs/btrfs/ctree.h       | 3 +--
>  fs/btrfs/dev-replace.c | 2 +-
>  fs/btrfs/extent-tree.c | 2 +-
>  fs/btrfs/inode.c       | 3 +--
>  fs/btrfs/relocation.c  | 2 +-
>  fs/btrfs/transaction.c | 2 +-
>  6 files changed, 6 insertions(+), 8 deletions(-)
> 

Theres another use in ioctl.c in my tree, I've just rebased that change in,
please check my tree when I push it to make sure I did it right.  Thanks,

Josef

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 7/7] Btrfs: rename btrfs_start_all_delalloc_inodes
  2013-11-04 17:09   ` Josef Bacik
@ 2013-11-05  3:07     ` Miao Xie
  0 siblings, 0 replies; 14+ messages in thread
From: Miao Xie @ 2013-11-05  3:07 UTC (permalink / raw)
  To: Josef Bacik; +Cc: linux-btrfs

On 	mon, 4 Nov 2013 12:09:18 -0500, Josef Bacik wrote:
> On Mon, Nov 04, 2013 at 11:13:26PM +0800, Miao Xie wrote:
>> rename the function -- btrfs_start_all_delalloc_inodes(), and make its
>> name be compatible to btrfs_wait_ordered_roots(), since they are always
>> used at the same place.
>>
>> Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
>> ---
>>  fs/btrfs/ctree.h       | 3 +--
>>  fs/btrfs/dev-replace.c | 2 +-
>>  fs/btrfs/extent-tree.c | 2 +-
>>  fs/btrfs/inode.c       | 3 +--
>>  fs/btrfs/relocation.c  | 2 +-
>>  fs/btrfs/transaction.c | 2 +-
>>  6 files changed, 6 insertions(+), 8 deletions(-)
>>
> 
> Theres another use in ioctl.c in my tree, I've just rebased that change in,
> please check my tree when I push it to make sure I did it right.  Thanks,

I have checked it, everything is OK.

Thanks
Miao

> 
> Josef
> --
> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 1/7] Btrfs: remove unnecessary initialization and memory barrior in shrink_delalloc()
  2013-11-04 15:13 ` [PATCH 1/7] Btrfs: remove unnecessary initialization and memory barrior in shrink_delalloc() Miao Xie
@ 2013-11-05 11:45   ` David Sterba
  2013-11-06  1:39     ` Miao Xie
  2013-11-06  5:29     ` [PATCH V2 " Miao Xie
  0 siblings, 2 replies; 14+ messages in thread
From: David Sterba @ 2013-11-05 11:45 UTC (permalink / raw)
  To: Miao Xie; +Cc: linux-btrfs

On Mon, Nov 04, 2013 at 11:13:20PM +0800, Miao Xie wrote:
> Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>

Reviewed-by: David Sterba <dsterba@suse.cz>

Would be nice to say why it's safe to remove the barrier. The percpu
counters do not need it, they use a spinlock. The barrier was there due
to access to fs_info->delalloc_bytes that was removed in commit
963d678b0f764930 that introduced the percpu counters.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 1/7] Btrfs: remove unnecessary initialization and memory barrior in shrink_delalloc()
  2013-11-05 11:45   ` David Sterba
@ 2013-11-06  1:39     ` Miao Xie
  2013-11-06  5:29     ` [PATCH V2 " Miao Xie
  1 sibling, 0 replies; 14+ messages in thread
From: Miao Xie @ 2013-11-06  1:39 UTC (permalink / raw)
  To: dsterba, linux-btrfs

On Tue, 5 Nov 2013 12:45:16 +0100, David Sterba wrote:
> On Mon, Nov 04, 2013 at 11:13:20PM +0800, Miao Xie wrote:
>> Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
> 
> Reviewed-by: David Sterba <dsterba@suse.cz>
> 
> Would be nice to say why it's safe to remove the barrier. The percpu
> counters do not need it, they use a spinlock. The barrier was there due
> to access to fs_info->delalloc_bytes that was removed in commit
> 963d678b0f764930 that introduced the percpu counters.
> 

Right, I'll send a new one.

Thanks for you review.
Miao

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH V2 1/7] Btrfs: remove unnecessary initialization and memory barrior in shrink_delalloc()
  2013-11-05 11:45   ` David Sterba
  2013-11-06  1:39     ` Miao Xie
@ 2013-11-06  5:29     ` Miao Xie
  1 sibling, 0 replies; 14+ messages in thread
From: Miao Xie @ 2013-11-06  5:29 UTC (permalink / raw)
  To: linux-btrfs

- The variant nr_pages is assigned before we use it, so the initialization
  at the beginning is unnecessary.
- If we enter the branch of the first if statement, the initialization of
  the variant loops is also unnecessary.
- The memory barriers here are misused, the barrier is used to prevent the
  problems that we may get unexpected value that is caused by the random
  order of the independent memory operations. But here, though the memory
  operations before and after the barrier are independent, we are sure that
  all the variants we access before the memory barrier can not be changed,
  so even though the operations are reordered, it is safe and the memory
  barriers are unnecessary.

So we remove those statements.

Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
---
Changelog v1 -> v2:
- Add the explanation why we remove those statements in the changelog.
---
 fs/btrfs/extent-tree.c | 7 +++----
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
index 1d238ed..abe65ed 100644
--- a/fs/btrfs/extent-tree.c
+++ b/fs/btrfs/extent-tree.c
@@ -4016,15 +4016,14 @@ static void shrink_delalloc(struct btrfs_root *root, u64 to_reclaim, u64 orig,
 	u64 delalloc_bytes;
 	u64 max_reclaim;
 	long time_left;
-	unsigned long nr_pages = (2 * 1024 * 1024) >> PAGE_CACHE_SHIFT;
-	int loops = 0;
+	unsigned long nr_pages;
+	int loops;
 	enum btrfs_reserve_flush_enum flush;
 
 	trans = (struct btrfs_trans_handle *)current->journal_info;
 	block_rsv = &root->fs_info->delalloc_block_rsv;
 	space_info = block_rsv->space_info;
 
-	smp_mb();
 	delalloc_bytes = percpu_counter_sum_positive(
 						&root->fs_info->delalloc_bytes);
 	if (delalloc_bytes == 0) {
@@ -4034,6 +4033,7 @@ static void shrink_delalloc(struct btrfs_root *root, u64 to_reclaim, u64 orig,
 		return;
 	}
 
+	loops = 0;
 	while (delalloc_bytes && loops < 3) {
 		max_reclaim = min(delalloc_bytes, to_reclaim);
 		nr_pages = max_reclaim >> PAGE_CACHE_SHIFT;
@@ -4064,7 +4064,6 @@ static void shrink_delalloc(struct btrfs_root *root, u64 to_reclaim, u64 orig,
 			if (time_left)
 				break;
 		}
-		smp_mb();
 		delalloc_bytes = percpu_counter_sum_positive(
 						&root->fs_info->delalloc_bytes);
 	}
-- 
1.8.1.4


^ permalink raw reply related	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2013-11-06  5:28 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2013-11-04 15:13 [PATCH 0/7] Btrfs: improve the latency of the space reservation Miao Xie
2013-11-04 15:13 ` [PATCH 1/7] Btrfs: remove unnecessary initialization and memory barrior in shrink_delalloc() Miao Xie
2013-11-05 11:45   ` David Sterba
2013-11-06  1:39     ` Miao Xie
2013-11-06  5:29     ` [PATCH V2 " Miao Xie
2013-11-04 15:13 ` [PATCH 2/7] Btrfs: wait for the ordered extent only when we want Miao Xie
2013-11-04 15:13 ` [PATCH 3/7] Btrfs: pick up the code for the item number calculation in flush_space() Miao Xie
2013-11-04 15:13 ` [PATCH 4/7] Btrfs: fix the confusion between delalloc bytes and metadata bytes Miao Xie
2013-11-04 15:13 ` [PATCH 5/7] Btrfs: don't wait for all the async delalloc when shrinking delalloc Miao Xie
2013-11-04 15:13 ` [PATCH 6/7] Btrfs: don't wait for the completion of all the ordered extents Miao Xie
2013-11-04 15:13 ` [PATCH 7/7] Btrfs: rename btrfs_start_all_delalloc_inodes Miao Xie
2013-11-04 17:09   ` Josef Bacik
2013-11-05  3:07     ` Miao Xie
2013-11-04 17:05 ` [PATCH 0/7] Btrfs: improve the latency of the space reservation Josef Bacik

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.