* [Ocfs2-devel]   [PATCH 0/14] Ocfs2: Online defragmentation V3.
@ 2011-01-21 10:20 Tristan Ye
  2011-01-21 10:20 ` [Ocfs2-devel] [PATCH 01/14] Ocfs2/refcounttree: Fix a bug for refcounttree to writeback clusters in a right number Tristan Ye
                   ` (13 more replies)
  0 siblings, 14 replies; 22+ messages in thread
From: Tristan Ye @ 2011-01-21 10:20 UTC (permalink / raw)
  To: ocfs2-devel

Changes since v2:

 *. Add refcount support.

 *. Share copy-on-write code with refcounttree.c

 *. Re-organize the ordering of patches.

 *. Fix several trivial bugs.

-------------------------------------------------------------------------------
Changes since v1:

 *. Implement strategy #2 below (simple extent moving).

	This is a rather rough v2 patch series for online defrag/extent moving on
OCFS2; it works, though it may look ugly ;) The essence of online file
defragmentation is moving extents, as btrfs and ext4 do. Adding an
'OCFS2_IOC_MOVE_EXT' ioctl to ocfs2 allows two defragmentation strategies:

1. Simple defragmentation in the kernel: the kernel is responsible for
   claiming new clusters and packing the defragmented extents according to a
   user-specified threshold.

2. Simple extent moving: here userspace plays a much more important role.
   It must specify the new physical blk_offset that the extents will be
   moved to; the kernel does nothing more than move the extents as requested,
   though it may also need to probe/validate new_blkoffset to guarantee
   enough free space around it.

Both operations use the same OCFS2_IOC_MOVE_EXT ioctl:
-------------------------------------------------------------------------------
#define OCFS2_MOVE_EXT_FL_AUTO_DEFRAG   (0x00000001)    /* Kernel manages to
                                                           claim new clusters
                                                           as the goal place
                                                           for extents moving */
#define OCFS2_MOVE_EXT_FL_COMPLETE      (0x00000002)    /* Move or defragmentation
                                                           completely gets done.
                                                         */
struct ocfs2_move_extents {
/* All values are in bytes */
        /* in */
        __u64 me_start;         /* Virtual start in the file to move */
        __u64 me_len;           /* Length of the extents to be moved */
        __u64 me_goal;          /* Physical offset of the goal */
        __u64 me_thresh;        /* Maximum distance from goal or threshold
                                   for auto defragmentation */
        __u64 me_flags;         /* flags for the operation:
                                 * - auto defragmentation.
                                 * - refcount,xattr cases.
                                 */

        /* out */
        __u64 me_moved_len;     /* moved length, are we completely done? */
        __u64 me_new_offset;    /* Resulting physical location */
        __u32 me_reserved[2];   /* reserved for future use */
};
-------------------------------------------------------------------------------
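As a rough illustration of how userspace might drive this interface, here is a
minimal sketch that fills in the structure for the auto-defrag case and issues
the ioctl. The structure and flags are copied from the definitions above, and
the ioctl number is the one patch 03 of this series defines. The helper names
fill_auto_defrag() and defrag_range() are hypothetical, and treating
OCFS2_MOVE_EXT_FL_COMPLETE as something the kernel sets in me_flags on return
is an assumption based on the flag's comment, not something this posting shows.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>

/* Copied from this series; a real tool would include ocfs2_ioctl.h instead. */
#define OCFS2_MOVE_EXT_FL_AUTO_DEFRAG	0x00000001
#define OCFS2_MOVE_EXT_FL_COMPLETE	0x00000002

struct ocfs2_move_extents {
	/* in */
	uint64_t me_start;	/* virtual start in the file to move */
	uint64_t me_len;	/* length of the extents to be moved */
	uint64_t me_goal;	/* physical offset of the goal */
	uint64_t me_thresh;	/* threshold for auto defragmentation */
	uint64_t me_flags;	/* flags for the operation */
	/* out */
	uint64_t me_moved_len;	/* moved length */
	uint64_t me_new_offset;	/* resulting physical location */
	uint32_t me_reserved[2];
};

#define OCFS2_IOC_MOVE_EXT	_IOW('o', 6, struct ocfs2_move_extents)

/* Hypothetical helper: prepare a request that lets the kernel pick the goal. */
void fill_auto_defrag(struct ocfs2_move_extents *me,
		      uint64_t start, uint64_t len, uint64_t thresh)
{
	assert(me != NULL);
	memset(me, 0, sizeof(*me));
	me->me_start = start;
	me->me_len = len;
	me->me_thresh = thresh;
	me->me_flags = OCFS2_MOVE_EXT_FL_AUTO_DEFRAG;
}

/*
 * Hypothetical helper: returns -1 if the ioctl fails, 0 if the kernel
 * reports the range as completely done (assumed to be signalled through
 * OCFS2_MOVE_EXT_FL_COMPLETE), and 1 if only part of it was processed.
 */
int defrag_range(int fd, uint64_t start, uint64_t len, uint64_t thresh,
		 struct ocfs2_move_extents *me)
{
	fill_auto_defrag(me, start, len, thresh);
	if (ioctl(fd, OCFS2_IOC_MOVE_EXT, me) < 0)
		return -1;
	return (me->me_flags & OCFS2_MOVE_EXT_FL_COMPLETE) ? 0 : 1;
}
```

With the values from the test run below, a caller would do something like
defrag_range(fd, 0, 293601280, 3145728, &me) on an open file descriptor; the
mapping of the defrag tool's -s/-l/-t options onto me_start/me_len/me_thresh
is a guess.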

	Here is some interesting data gathered from simple tests:

1. Performance improvement gained on I/O reads:
-------------------------------------------------------------------------------
* Before defragmentation *

[root@ocfs2-box4 ~]# sync
[root@ocfs2-box4 ~]# echo 3>/proc/sys/vm/drop_caches
[root@ocfs2-box4 ~]# time dd if=/storage/testfile-1 of=/dev/null
640000+0 records in
640000+0 records out
327680000 bytes (328 MB) copied, 19.9351 s, 16.4 MB/s

real	0m19.954s
user	0m0.246s
sys	0m1.111s

* Do defragmentation *

[root@ocfs2-box4 defrag]# ./defrag -s 0 -l 293601280 -t 3145728 /storage/testfile-1

* After defragmentation *

[root@ocfs2-box4 ~]# sync
[root@ocfs2-box4 ~]# echo 3>/proc/sys/vm/drop_caches
[root@ocfs2-box4 ~]# time dd if=/storage/testfile-1 of=/dev/null
640000+0 records in
640000+0 records out
327680000 bytes (328 MB) copied, 6.79885 s, 48.2 MB/s

real	0m6.969s
user	0m0.209s
sys	0m1.063s
-------------------------------------------------------------------------------


2. Extent tree layout via debugfs.ocfs2:
-------------------------------------------------------------------------------
* Before defragmentation *

        Tree Depth: 1   Count: 243   Next Free Rec: 8
        ## Offset        Clusters       Block#
        0  0             1173           86561
        1  1173          1173           84527
        2  2346          1151           81468
        3  3497          1173           76362
        4  4670          1173           74328
        5  5843          1172           66150
        6  7015          1460           70260
        7  8475          662            87680
        SubAlloc Bit: 1   SubAlloc Slot: 0
        Blknum: 86561   Next Leaf: 84527
        CRC32: abf06a6b   ECC: 44bc
        Tree Depth: 0   Count: 252   Next Free Rec: 252
        ## Offset        Clusters       Block#          Flags
        0  1             16             516104          0x0
        1  17            1              554632          0x0
        2  18            7              560144          0x0
        3  25            1              565960          0x0
        4  26            1              572632          0x
	...
	/* around 1700 extent records were hidden there */
	...
	138 9131          1              258968          0x0
        139 9132          1              259568          0x0
        140 9133          1              260168          0x0
        141 9134          1              260768          0x0
        142 9135          1              261368          0x0
        143 9136          1              261968          0x0

* After defragmentation *

      Tree Depth: 1   Count: 243   Next Free Rec: 1
	## Offset        Clusters       Block#
	0  0             9137           66081
	SubAlloc Bit: 1   SubAlloc Slot: 0
	Blknum: 66081   Next Leaf: 0
	CRC32: 22897d34   ECC: 0619
	Tree Depth: 0   Count: 252   Next Free Rec: 6
	## Offset        Clusters       Block#          Flags
	0  1             1600           4412936         0x0 
	1  1601          1595           20669448        0x0 
	2  3196          1600           9358856         0x0 
	3  4796          1404           14516232        0x0 
	4  6200          1600           21627400        0x0 
	5  7800          1337           7483400         0x0 
-------------------------------------------------------------------------------


TO-DO:

1. Adding refcount/xattr support.
2. Free space defragmentation.


Go to http://oss.oracle.com/osswiki/OCFS2/DesignDocs/OnlineDefrag for more details.


Tristan.


* [Ocfs2-devel] [PATCH 01/14] Ocfs2/refcounttree: Fix a bug for refcounttree to writeback clusters in a right number.
  2011-01-21 10:20 [Ocfs2-devel] [PATCH 0/14] Ocfs2: Online defragmentation V3 Tristan Ye
@ 2011-01-21 10:20 ` Tristan Ye
  2011-01-27 23:29   ` Mark Fasheh
  2011-02-20 10:40   ` Joel Becker
  2011-01-21 10:20 ` [Ocfs2-devel] [PATCH 02/14] Ocfs2/refcounttree: Publicate couple of funcs from refcounttree.c Tristan Ye
                   ` (12 subsequent siblings)
  13 siblings, 2 replies; 22+ messages in thread
From: Tristan Ye @ 2011-01-21 10:20 UTC (permalink / raw)
  To: ocfs2-devel

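The current refcounttree code does not actually write the new pages out in
write-back mode. In ocfs2_make_clusters_writable(), the 'while (num_clusters)'
loop has already run num_clusters down to zero by the time it is passed to
'ocfs2_cow_sync_writeback', so the call always receives a cluster count of
ZERO. This patch saves the original count and passes that in instead.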
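To see why a zero count matters, the following sketch models the two pieces
involved. The byte-range arithmetic follows ocfs2_cow_sync_writeback() as
quoted in patch 02 of this series; the loop body is not part of this diff's
context, so the sketch assumes it consumes num_clusters chunk by chunk until
nothing is left, as the 'while (num_clusters)' condition implies. The names
writeback_range() and consume_clusters() are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

struct byte_range {
	uint64_t start;	/* first byte to write back */
	uint64_t end;	/* one past the last byte */
};

/* Same arithmetic as the offset/end computation in the writeback helper. */
struct byte_range writeback_range(uint32_t cpos, uint32_t num_clusters,
				  unsigned int clustersize_bits)
{
	struct byte_range r;

	assert(clustersize_bits < 32);
	r.start = (uint64_t)cpos << clustersize_bits;
	r.end = r.start + ((uint64_t)num_clusters << clustersize_bits);
	return r;
}

/*
 * Assumed shape of the CoW loop: handle up to 'chunk' clusters per pass
 * and return whatever is left in the counter afterwards.
 */
uint32_t consume_clusters(uint32_t num_clusters, uint32_t chunk)
{
	assert(chunk > 0);
	while (num_clusters) {
		uint32_t set_len = num_clusters < chunk ? num_clusters : chunk;

		num_clusters -= set_len;
	}
	return num_clusters;	/* always 0 once the loop has finished */
}
```

With the counter already at zero the computed range is empty, so there is
nothing for filemap_fdatawrite_range() to flush; passing the saved count gives
the full range that was just copied.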

Signed-off-by: Tristan Ye <tristan.ye@oracle.com>
---
 fs/ocfs2/refcounttree.c |    7 +++++--
 1 files changed, 5 insertions(+), 2 deletions(-)

diff --git a/fs/ocfs2/refcounttree.c b/fs/ocfs2/refcounttree.c
index b5f9160..19ebc5a 100644
--- a/fs/ocfs2/refcounttree.c
+++ b/fs/ocfs2/refcounttree.c
@@ -3228,7 +3228,7 @@ static int ocfs2_make_clusters_writable(struct super_block *sb,
 					u32 num_clusters, unsigned int e_flags)
 {
 	int ret, delete, index, credits =  0;
-	u32 new_bit, new_len;
+	u32 new_bit, new_len, orig_num_clusters;
 	unsigned int set_len;
 	struct ocfs2_super *osb = OCFS2_SB(sb);
 	handle_t *handle;
@@ -3261,6 +3261,8 @@ static int ocfs2_make_clusters_writable(struct super_block *sb,
 		goto out;
 	}
 
+	orig_num_clusters = num_clusters;
+
 	while (num_clusters) {
 		ret = ocfs2_get_refcount_rec(ref_ci, context->ref_root_bh,
 					     p_cluster, num_clusters,
@@ -3348,7 +3350,8 @@ static int ocfs2_make_clusters_writable(struct super_block *sb,
 	 * in write-back mode.
 	 */
 	if (context->get_clusters == ocfs2_di_get_clusters) {
-		ret = ocfs2_cow_sync_writeback(sb, context, cpos, num_clusters);
+		ret = ocfs2_cow_sync_writeback(sb, context, cpos,
+					       orig_num_clusters);
 		if (ret)
 			mlog_errno(ret);
 	}
-- 
1.5.5


* [Ocfs2-devel] [PATCH 02/14] Ocfs2/refcounttree: Publicate couple of funcs from refcounttree.c
  2011-01-21 10:20 [Ocfs2-devel] [PATCH 0/14] Ocfs2: Online defragmentation V3 Tristan Ye
  2011-01-21 10:20 ` [Ocfs2-devel] [PATCH 01/14] Ocfs2/refcounttree: Fix a bug for refcounttree to writeback clusters in a right number Tristan Ye
@ 2011-01-21 10:20 ` Tristan Ye
  2011-01-28  0:05   ` Mark Fasheh
  2011-01-21 10:20 ` [Ocfs2-devel] [PATCH 03/14] Ocfs2/move_extents: Adding new ioctl code 'OCFS2_IOC_MOVE_EXT' to ocfs2 Tristan Ye
                   ` (11 subsequent siblings)
  13 siblings, 1 reply; 22+ messages in thread
From: Tristan Ye @ 2011-01-21 10:20 UTC (permalink / raw)
  To: ocfs2-devel

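These functions are made non-static so that the upcoming defragmentation/
extent-moving code can reuse them, since reflink and defragmentation share the
same copy-on-write mechanism. To decouple them from struct ocfs2_cow_context,
they now take a struct file or struct inode instead of the context.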

Signed-off-by: Tristan Ye <tristan.ye@oracle.com>
---
 fs/ocfs2/refcounttree.c |   58 +++++++++++++++++++++++-----------------------
 fs/ocfs2/refcounttree.h |   11 +++++++++
 2 files changed, 40 insertions(+), 29 deletions(-)

diff --git a/fs/ocfs2/refcounttree.c b/fs/ocfs2/refcounttree.c
index 19ebc5a..bd4ca78 100644
--- a/fs/ocfs2/refcounttree.c
+++ b/fs/ocfs2/refcounttree.c
@@ -66,7 +66,7 @@ struct ocfs2_cow_context {
 			    u32 *num_clusters,
 			    unsigned int *extent_flags);
 	int (*cow_duplicate_clusters)(handle_t *handle,
-				      struct ocfs2_cow_context *context,
+				      struct file *file,
 				      u32 cpos, u32 old_cluster,
 				      u32 new_cluster, u32 new_len);
 };
@@ -2922,20 +2922,21 @@ static int ocfs2_clear_cow_buffer(handle_t *handle, struct buffer_head *bh)
 	return 0;
 }
 
-static int ocfs2_duplicate_clusters_by_page(handle_t *handle,
-					    struct ocfs2_cow_context *context,
-					    u32 cpos, u32 old_cluster,
-					    u32 new_cluster, u32 new_len)
+int ocfs2_duplicate_clusters_by_page(handle_t *handle,
+				     struct file *file,
+				     u32 cpos, u32 old_cluster,
+				     u32 new_cluster, u32 new_len)
 {
 	int ret = 0, partial;
-	struct ocfs2_caching_info *ci = context->data_et.et_ci;
+	struct inode *inode = file->f_path.dentry->d_inode;
+	struct ocfs2_caching_info *ci = INODE_CACHE(inode);
 	struct super_block *sb = ocfs2_metadata_cache_get_super(ci);
 	u64 new_block = ocfs2_clusters_to_blocks(sb, new_cluster);
 	struct page *page;
 	pgoff_t page_index;
 	unsigned int from, to, readahead_pages;
 	loff_t offset, end, map_end;
-	struct address_space *mapping = context->inode->i_mapping;
+	struct address_space *mapping = inode->i_mapping;
 
 	mlog(0, "old_cluster %u, new %u, len %u at offset %u\n", old_cluster,
 	     new_cluster, new_len, cpos);
@@ -2949,8 +2950,8 @@ static int ocfs2_duplicate_clusters_by_page(handle_t *handle,
 	 * We only duplicate pages until we reach the page contains i_size - 1.
 	 * So trim 'end' to i_size.
 	 */
-	if (end > i_size_read(context->inode))
-		end = i_size_read(context->inode);
+	if (end > i_size_read(inode))
+		end = i_size_read(inode);
 
 	while (offset < end) {
 		page_index = offset >> PAGE_CACHE_SHIFT;
@@ -2973,10 +2974,9 @@ static int ocfs2_duplicate_clusters_by_page(handle_t *handle,
 		if (PAGE_CACHE_SIZE <= OCFS2_SB(sb)->s_clustersize)
 			BUG_ON(PageDirty(page));
 
-		if (PageReadahead(page) && context->file) {
+		if (PageReadahead(page) && file) {
 			page_cache_async_readahead(mapping,
-						   &context->file->f_ra,
-						   context->file,
+						   &file->f_ra, file,
 						   page, page_index,
 						   readahead_pages);
 		}
@@ -3000,8 +3000,7 @@ static int ocfs2_duplicate_clusters_by_page(handle_t *handle,
 			}
 		}
 
-		ocfs2_map_and_dirty_page(context->inode,
-					 handle, from, to,
+		ocfs2_map_and_dirty_page(inode, handle, from, to,
 					 page, 0, &new_block);
 		mark_page_accessed(page);
 unlock:
@@ -3016,14 +3015,15 @@ unlock:
 	return ret;
 }
 
-static int ocfs2_duplicate_clusters_by_jbd(handle_t *handle,
-					   struct ocfs2_cow_context *context,
-					   u32 cpos, u32 old_cluster,
-					   u32 new_cluster, u32 new_len)
+int ocfs2_duplicate_clusters_by_jbd(handle_t *handle,
+				    struct file *file,
+				    u32 cpos, u32 old_cluster,
+				    u32 new_cluster, u32 new_len)
 {
 	int ret = 0;
-	struct super_block *sb = context->inode->i_sb;
-	struct ocfs2_caching_info *ci = context->data_et.et_ci;
+	struct inode *inode = file->f_path.dentry->d_inode;
+	struct super_block *sb = inode->i_sb;
+	struct ocfs2_caching_info *ci = INODE_CACHE(inode);
 	int i, blocks = ocfs2_clusters_to_blocks(sb, new_len);
 	u64 old_block = ocfs2_clusters_to_blocks(sb, old_cluster);
 	u64 new_block = ocfs2_clusters_to_blocks(sb, new_cluster);
@@ -3146,8 +3146,8 @@ static int ocfs2_replace_clusters(handle_t *handle,
 
 	/*If the old clusters is unwritten, no need to duplicate. */
 	if (!(ext_flags & OCFS2_EXT_UNWRITTEN)) {
-		ret = context->cow_duplicate_clusters(handle, context, cpos,
-						      old, new, len);
+		ret = context->cow_duplicate_clusters(handle, context->file,
+						      cpos, old, new, len);
 		if (ret) {
 			mlog_errno(ret);
 			goto out;
@@ -3163,22 +3163,22 @@ out:
 	return ret;
 }
 
-static int ocfs2_cow_sync_writeback(struct super_block *sb,
-				    struct ocfs2_cow_context *context,
-				    u32 cpos, u32 num_clusters)
+int ocfs2_cow_sync_writeback(struct super_block *sb,
+			     struct inode *inode,
+			     u32 cpos, u32 num_clusters)
 {
 	int ret = 0;
 	loff_t offset, end, map_end;
 	pgoff_t page_index;
 	struct page *page;
 
-	if (ocfs2_should_order_data(context->inode))
+	if (ocfs2_should_order_data(inode))
 		return 0;
 
 	offset = ((loff_t)cpos) << OCFS2_SB(sb)->s_clustersize_bits;
 	end = offset + (num_clusters << OCFS2_SB(sb)->s_clustersize_bits);
 
-	ret = filemap_fdatawrite_range(context->inode->i_mapping,
+	ret = filemap_fdatawrite_range(inode->i_mapping,
 				       offset, end - 1);
 	if (ret < 0) {
 		mlog_errno(ret);
@@ -3191,7 +3191,7 @@ static int ocfs2_cow_sync_writeback(struct super_block *sb,
 		if (map_end > end)
 			map_end = end;
 
-		page = find_or_create_page(context->inode->i_mapping,
+		page = find_or_create_page(inode->i_mapping,
 					   page_index, GFP_NOFS);
 		BUG_ON(!page);
 
@@ -3350,7 +3350,7 @@ static int ocfs2_make_clusters_writable(struct super_block *sb,
 	 * in write-back mode.
 	 */
 	if (context->get_clusters == ocfs2_di_get_clusters) {
-		ret = ocfs2_cow_sync_writeback(sb, context, cpos,
+		ret = ocfs2_cow_sync_writeback(sb, context->inode, cpos,
 					       orig_num_clusters);
 		if (ret)
 			mlog_errno(ret);
diff --git a/fs/ocfs2/refcounttree.h b/fs/ocfs2/refcounttree.h
index c8ce46f..7754608 100644
--- a/fs/ocfs2/refcounttree.h
+++ b/fs/ocfs2/refcounttree.h
@@ -84,6 +84,17 @@ int ocfs2_refcount_cow_xattr(struct inode *inode,
 			     struct buffer_head *ref_root_bh,
 			     u32 cpos, u32 write_len,
 			     struct ocfs2_post_refcount *post);
+int ocfs2_duplicate_clusters_by_page(handle_t *handle,
+				     struct file *file,
+				     u32 cpos, u32 old_cluster,
+				     u32 new_cluster, u32 new_len);
+int ocfs2_duplicate_clusters_by_jbd(handle_t *handle,
+				    struct file *file,
+				    u32 cpos, u32 old_cluster,
+				    u32 new_cluster, u32 new_len);
+int ocfs2_cow_sync_writeback(struct super_block *sb,
+			     struct inode *inode,
+			     u32 cpos, u32 num_clusters);
 int ocfs2_add_refcount_flag(struct inode *inode,
 			    struct ocfs2_extent_tree *data_et,
 			    struct ocfs2_caching_info *ref_ci,
-- 
1.5.5


* [Ocfs2-devel] [PATCH 03/14] Ocfs2/move_extents: Adding new ioctl code 'OCFS2_IOC_MOVE_EXT' to ocfs2.
  2011-01-21 10:20 [Ocfs2-devel] [PATCH 0/14] Ocfs2: Online defragmentation V3 Tristan Ye
  2011-01-21 10:20 ` [Ocfs2-devel] [PATCH 01/14] Ocfs2/refcounttree: Fix a bug for refcounttree to writeback clusters in a right number Tristan Ye
  2011-01-21 10:20 ` [Ocfs2-devel] [PATCH 02/14] Ocfs2/refcounttree: Publicate couple of funcs from refcounttree.c Tristan Ye
@ 2011-01-21 10:20 ` Tristan Ye
  2011-01-21 10:20 ` [Ocfs2-devel] [PATCH 04/14] Ocfs2/move_extents: Add basic framework and source files for extent moving Tristan Ye
                   ` (10 subsequent siblings)
  13 siblings, 0 replies; 22+ messages in thread
From: Tristan Ye @ 2011-01-21 10:20 UTC (permalink / raw)
  To: ocfs2-devel

This patch also adds the structure used to pass arguments to and from this ioctl.

Signed-off-by: Tristan Ye <tristan.ye@oracle.com>
---
 fs/ocfs2/ocfs2_ioctl.h |   29 +++++++++++++++++++++++++++++
 1 files changed, 29 insertions(+), 0 deletions(-)

diff --git a/fs/ocfs2/ocfs2_ioctl.h b/fs/ocfs2/ocfs2_ioctl.h
index b46f39b..f2be820 100644
--- a/fs/ocfs2/ocfs2_ioctl.h
+++ b/fs/ocfs2/ocfs2_ioctl.h
@@ -171,4 +171,33 @@ enum ocfs2_info_type {
 
 #define OCFS2_IOC_INFO		_IOR('o', 5, struct ocfs2_info)
 
+struct ocfs2_move_extents {
+/* All values are in bytes */
+	/* in */
+	__u64 me_start;		/* Virtual start in the file to move */
+	__u64 me_len;		/* Length of the extents to be moved */
+	__u64 me_goal;		/* Physical offset of the goal,
+				   it's in block unit */
+	__u64 me_thresh;	/* Maximum distance from goal or threshold
+				   for auto defragmentation */
+	__u64 me_flags;		/* Flags for the operation:
+				 * - auto defragmentation.
+				 * - refcount,xattr cases.
+				 */
+	/* out */
+	__u64 me_moved_len;	/* Moved/defraged length */
+	__u64 me_new_offset;	/* Resulting physical location */
+	__u32 me_reserved[2];	/* Reserved for future use */
+};
+
+#define OCFS2_MOVE_EXT_FL_AUTO_DEFRAG	(0x00000001)	/* Kernel manages to
+							   claim new clusters
+							   as the goal place
+							   for extents moving */
+#define OCFS2_MOVE_EXT_FL_COMPLETE	(0x00000002)	/* Move or defragmentation
+							   completely gets done.
+							 */
+
+#define OCFS2_IOC_MOVE_EXT	_IOW('o', 6, struct ocfs2_move_extents)
+
 #endif /* OCFS2_IOCTL_H */
-- 
1.5.5


* [Ocfs2-devel] [PATCH 04/14] Ocfs2/move_extents: Add basic framework and source files for extent moving.
  2011-01-21 10:20 [Ocfs2-devel] [PATCH 0/14] Ocfs2: Online defragmentation V3 Tristan Ye
                   ` (2 preceding siblings ...)
  2011-01-21 10:20 ` [Ocfs2-devel] [PATCH 03/14] Ocfs2/move_extents: Adding new ioctl code 'OCFS2_IOC_MOVE_EXT' to ocfs2 Tristan Ye
@ 2011-01-21 10:20 ` Tristan Ye
  2011-01-21 10:20 ` [Ocfs2-devel] [PATCH 05/14] Ocfs2/move_extents: lock allocators and reserve metadata blocks and data clusters for extents moving Tristan Ye
                   ` (9 subsequent siblings)
  13 siblings, 0 replies; 22+ messages in thread
From: Tristan Ye @ 2011-01-21 10:20 UTC (permalink / raw)
  To: ocfs2-devel

Add new files move_extents.[c|h], containing nothing but a context
structure for now.

Signed-off-by: Tristan Ye <tristan.ye@oracle.com>
---
 fs/ocfs2/Makefile          |    1 +
 fs/ocfs2/cluster/masklog.h |    1 +
 fs/ocfs2/move_extents.c    |   57 ++++++++++++++++++++++++++++++++++++++++++++
 fs/ocfs2/move_extents.h    |   20 +++++++++++++++
 4 files changed, 79 insertions(+), 0 deletions(-)
 create mode 100644 fs/ocfs2/move_extents.c
 create mode 100644 fs/ocfs2/move_extents.h

diff --git a/fs/ocfs2/Makefile b/fs/ocfs2/Makefile
index 07d9fd8..5166f2d 100644
--- a/fs/ocfs2/Makefile
+++ b/fs/ocfs2/Makefile
@@ -30,6 +30,7 @@ ocfs2-objs := \
 	namei.o 		\
 	refcounttree.o		\
 	reservations.o		\
+	move_extents.o		\
 	resize.o		\
 	slot_map.o 		\
 	suballoc.o 		\
diff --git a/fs/ocfs2/cluster/masklog.h b/fs/ocfs2/cluster/masklog.h
index ea2ed9f..36f1cb2 100644
--- a/fs/ocfs2/cluster/masklog.h
+++ b/fs/ocfs2/cluster/masklog.h
@@ -121,6 +121,7 @@
 #define ML_KTHREAD	0x0000000400000000ULL /* kernel thread activity */
 #define ML_RESERVATIONS	0x0000000800000000ULL /* ocfs2 alloc reservations */
 #define ML_CLUSTER	0x0000001000000000ULL /* cluster stack */
+#define ML_MOVE_EXT	0x0000002000000000ULL /* move extents */
 
 #define MLOG_INITIAL_AND_MASK (ML_ERROR|ML_NOTICE)
 #define MLOG_INITIAL_NOT_MASK (ML_ENTRY|ML_EXIT)
diff --git a/fs/ocfs2/move_extents.c b/fs/ocfs2/move_extents.c
new file mode 100644
index 0000000..b79c2d3
--- /dev/null
+++ b/fs/ocfs2/move_extents.c
@@ -0,0 +1,57 @@
+/* -*- mode: c; c-basic-offset: 8; -*-
+ * vim: noexpandtab sw=8 ts=8 sts=0:
+ *
+ * move_extents.c
+ *
+ * Copyright (C) 2010 Oracle.  All rights reserved.
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public
+ * License version 2 as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * General Public License for more details.
+ */
+#include <linux/fs.h>
+#include <linux/types.h>
+#include <linux/mount.h>
+#include <linux/swap.h>
+
+#define MLOG_MASK_PREFIX ML_MOVE_EXT
+#include <cluster/masklog.h>
+
+#include "ocfs2.h"
+#include "ocfs2_ioctl.h"
+
+#include "alloc.h"
+#include "aops.h"
+#include "dlmglue.h"
+#include "extent_map.h"
+#include "inode.h"
+#include "journal.h"
+#include "suballoc.h"
+#include "uptodate.h"
+#include "super.h"
+#include "dir.h"
+#include "buffer_head_io.h"
+#include "sysfile.h"
+#include "suballoc.h"
+#include "refcounttree.h"
+#include "move_extents.h"
+
+struct ocfs2_move_extents_context {
+	struct inode *inode;
+	struct file *file;
+	int auto_defrag;
+	int credits;
+	u32 new_phys_cpos;
+	u32 clusters_moved;
+	u64 refcount_loc;
+	struct ocfs2_move_extents *range;
+	struct ocfs2_extent_tree et;
+	struct ocfs2_alloc_context *meta_ac;
+	struct ocfs2_alloc_context *data_ac;
+	struct ocfs2_cached_dealloc_ctxt dealloc;
+};
diff --git a/fs/ocfs2/move_extents.h b/fs/ocfs2/move_extents.h
new file mode 100644
index 0000000..eba4044
--- /dev/null
+++ b/fs/ocfs2/move_extents.h
@@ -0,0 +1,20 @@
+/* -*- mode: c; c-basic-offset: 8; -*-
+ * vim: noexpandtab sw=8 ts=8 sts=0:
+ *
+ * move_extents.h
+ *
+ * Copyright (C) 2010 Oracle.  All rights reserved.
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public
+ * License version 2 as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * General Public License for more details.
+ */
+#ifndef OCFS2_MOVE_EXTENTS_H
+#define OCFS2_MOVE_EXTENTS_H
+
+#endif /* OCFS2_MOVE_EXTENTS_H */
-- 
1.5.5


* [Ocfs2-devel] [PATCH 05/14] Ocfs2/move_extents: lock allocators and reserve metadata blocks and data clusters for extents moving.
  2011-01-21 10:20 [Ocfs2-devel] [PATCH 0/14] Ocfs2: Online defragmentation V3 Tristan Ye
                   ` (3 preceding siblings ...)
  2011-01-21 10:20 ` [Ocfs2-devel] [PATCH 04/14] Ocfs2/move_extents: Add basic framework and source files for extent moving Tristan Ye
@ 2011-01-21 10:20 ` Tristan Ye
  2011-01-21 10:20 ` [Ocfs2-devel] [PATCH 06/14] Ocfs2/move_extents: move a range of extent Tristan Ye
                   ` (8 subsequent siblings)
  13 siblings, 0 replies; 22+ messages in thread
From: Tristan Ye @ 2011-01-21 10:20 UTC (permalink / raw)
  To: ocfs2-devel

ocfs2_lock_allocators_move_extents() is similar to the common
ocfs2_lock_allocators(): it locks the metadata and data allocators for an
extent move and reserves the appropriate number of metadata blocks and data
clusters. It also makes a best-effort estimate of the journal credits needed
for one run of the move.
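The decision about whether extra metadata blocks are needed can be sketched in
isolation. This is a userspace model of the condition in the function below,
with the filesystem state passed in as plain arguments; needs_extra_meta() is
a hypothetical name, and the real code adds ocfs2_extend_meta_needed() blocks
rather than returning a boolean.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Returns true when the extent tree may have to grow to absorb the move.
 * num_free_extents stands in for the result of ocfs2_num_free_extents(),
 * sparse_alloc for ocfs2_sparse_alloc(); both are inputs here only so the
 * condition can be exercised outside the kernel.
 */
bool needs_extra_meta(int num_free_extents, bool sparse_alloc,
		      unsigned int extents_to_split,
		      unsigned int clusters_to_move)
{
	/* Worst case: each split adds two records, each cluster one more. */
	unsigned int max_recs_needed = 2 * extents_to_split + clusters_to_move;

	assert(num_free_extents >= 0);	/* negative values are error codes */

	return num_free_extents == 0 ||
	       (sparse_alloc &&
		(unsigned int)num_free_extents < max_recs_needed);
}
```

For one split and four clusters the worst case is six records, so a tree with
five free records on a sparse-alloc volume would trigger the extra
reservation while one with ten would not.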

Signed-off-by: Tristan Ye <tristan.ye@oracle.com>
---
 fs/ocfs2/move_extents.c |   61 +++++++++++++++++++++++++++++++++++++++++++++++
 1 files changed, 61 insertions(+), 0 deletions(-)

diff --git a/fs/ocfs2/move_extents.c b/fs/ocfs2/move_extents.c
index b79c2d3..9b30636 100644
--- a/fs/ocfs2/move_extents.c
+++ b/fs/ocfs2/move_extents.c
@@ -55,3 +55,64 @@ struct ocfs2_move_extents_context {
 	struct ocfs2_alloc_context *data_ac;
 	struct ocfs2_cached_dealloc_ctxt dealloc;
 };
+
+/*
+ * lock allocators, and reserving appropriate number of bits for
+ * meta blocks and data clusters.
+ *
+ * in some cases, we don't need to reserve clusters, just let data_ac
+ * be NULL.
+ */
+static int ocfs2_lock_allocators_move_extents(struct inode *inode,
+					struct ocfs2_extent_tree *et,
+					u32 clusters_to_move,
+					u32 extents_to_split,
+					struct ocfs2_alloc_context **meta_ac,
+					struct ocfs2_alloc_context **data_ac,
+					int extra_blocks,
+					int *credits)
+{
+	int ret, num_free_extents;
+	unsigned int max_recs_needed = 2 * extents_to_split + clusters_to_move;
+	struct ocfs2_super *osb = OCFS2_SB(inode->i_sb);
+
+	num_free_extents = ocfs2_num_free_extents(osb, et);
+	if (num_free_extents < 0) {
+		ret = num_free_extents;
+		mlog_errno(ret);
+		goto out;
+	}
+
+	if (!num_free_extents ||
+	    (ocfs2_sparse_alloc(osb) && num_free_extents < max_recs_needed))
+		extra_blocks += ocfs2_extend_meta_needed(et->et_root_el);
+
+	ret = ocfs2_reserve_new_metadata_blocks(osb, extra_blocks, meta_ac);
+	if (ret) {
+		mlog_errno(ret);
+		goto out;
+	}
+
+	if (data_ac) {
+		ret = ocfs2_reserve_clusters(osb, clusters_to_move, data_ac);
+		if (ret) {
+			mlog_errno(ret);
+			goto out;
+		}
+	}
+
+	*credits += ocfs2_calc_extend_credits(osb->sb, et->et_root_el,
+					      clusters_to_move + 2);
+
+	mlog(0, "reserve metadata_blocks: %d, data_clusters: %u, credits: %d\n",
+	     extra_blocks, clusters_to_move, *credits);
+out:
+	if (ret) {
+		if (*meta_ac) {
+			ocfs2_free_alloc_context(*meta_ac);
+			*meta_ac = NULL;
+		}
+	}
+
+	return ret;
+}
-- 
1.5.5


* [Ocfs2-devel] [PATCH 06/14] Ocfs2/move_extents: move a range of extent.
  2011-01-21 10:20 [Ocfs2-devel] [PATCH 0/14] Ocfs2: Online defragmentation V3 Tristan Ye
                   ` (4 preceding siblings ...)
  2011-01-21 10:20 ` [Ocfs2-devel] [PATCH 05/14] Ocfs2/move_extents: lock allocators and reserve metadata blocks and data clusters for extents moving Tristan Ye
@ 2011-01-21 10:20 ` Tristan Ye
  2011-01-28  1:10   ` Mark Fasheh
  2011-01-21 10:20 ` [Ocfs2-devel] [PATCH 07/14] Ocfs2/move_extents: defrag " Tristan Ye
                   ` (7 subsequent siblings)
  13 siblings, 1 reply; 22+ messages in thread
From: Tristan Ye @ 2011-01-21 10:20 UTC (permalink / raw)
  To: ocfs2-devel

__ocfs2_move_extent() always moves a range that lies within a single extent.
It consists of the following steps:

1. Duplicate the clusters, page by page, to new_blkoffset, the location the extent is being moved to.

2. Split the original extent around the new extent record, coalescing nearby extents where possible.

3. Append the old clusters to the truncate log, or decrease their refcount if the extent was refcounted.
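Steps 2 and 3 carry two small decisions that are easy to model on their own:
which flags the replacement record gets, and what happens to the old clusters.
A minimal sketch, assuming OCFS2_EXT_REFCOUNTED has the value 0x02 as in
ocfs2_fs.h; the enum and function names are hypothetical and only mirror the
branches in the patch below.

```c
#include <assert.h>

#define OCFS2_EXT_REFCOUNTED	0x02	/* assumed value, see ocfs2_fs.h */

enum old_cluster_action {
	OLD_NONE,		/* nothing to release */
	OLD_DECREASE_REFCOUNT,	/* other files may still share the clusters */
	OLD_TRUNCATE_LOG,	/* clusters can be queued for freeing */
};

/* The moved extent is a private copy, so it is no longer refcounted. */
unsigned int moved_extent_flags(unsigned int ext_flags)
{
	return ext_flags & ~OCFS2_EXT_REFCOUNTED;
}

/* Decide how the clusters at the old location are given up. */
enum old_cluster_action dispose_old_clusters(unsigned long long old_blkno,
					     unsigned int ext_flags)
{
	if (!old_blkno)
		return OLD_NONE;
	if (ext_flags & OCFS2_EXT_REFCOUNTED)
		return OLD_DECREASE_REFCOUNT;
	return OLD_TRUNCATE_LOG;
}
```

Any other flag bits on the extent are carried over unchanged; only the
refcounted bit is cleared.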

Signed-off-by: Tristan Ye <tristan.ye@oracle.com>
---
 fs/ocfs2/move_extents.c |  104 +++++++++++++++++++++++++++++++++++++++++++++++
 1 files changed, 104 insertions(+), 0 deletions(-)

diff --git a/fs/ocfs2/move_extents.c b/fs/ocfs2/move_extents.c
index 9b30636..e28bd7d 100644
--- a/fs/ocfs2/move_extents.c
+++ b/fs/ocfs2/move_extents.c
@@ -56,6 +56,110 @@ struct ocfs2_move_extents_context {
 	struct ocfs2_cached_dealloc_ctxt dealloc;
 };
 
+static int __ocfs2_move_extent(handle_t *handle,
+			       struct ocfs2_move_extents_context *context,
+			       u32 cpos, u32 len, u32 p_cpos, u32 new_p_cpos,
+			       int ext_flags)
+{
+	int ret = 0, index;
+	struct inode *inode = context->inode;
+	struct ocfs2_super *osb = OCFS2_SB(inode->i_sb);
+	struct ocfs2_extent_rec *rec, replace_rec;
+	struct ocfs2_path *path = NULL;
+	struct ocfs2_extent_list *el;
+	u64 ino = ocfs2_metadata_cache_owner(context->et.et_ci);
+	u64 old_blkno = ocfs2_clusters_to_blocks(inode->i_sb, p_cpos);
+
+	ret = ocfs2_duplicate_clusters_by_page(handle, context->file, cpos,
+					       p_cpos, new_p_cpos, len);
+	if (ret) {
+		mlog_errno(ret);
+		goto out;
+	}
+
+	memset(&replace_rec, 0, sizeof(replace_rec));
+	replace_rec.e_cpos = cpu_to_le32(cpos);
+	replace_rec.e_leaf_clusters = cpu_to_le16(len);
+	replace_rec.e_blkno = cpu_to_le64(ocfs2_clusters_to_blocks(inode->i_sb,
+								   new_p_cpos));
+
+	path = ocfs2_new_path_from_et(&context->et);
+	if (!path) {
+		ret = -ENOMEM;
+		mlog_errno(ret);
+		goto out;
+	}
+
+	ret = ocfs2_find_path(INODE_CACHE(inode), path, cpos);
+	if (ret) {
+		mlog_errno(ret);
+		goto out;
+	}
+
+	el = path_leaf_el(path);
+
+	index = ocfs2_search_extent_list(el, cpos);
+	if (index == -1 || index >= le16_to_cpu(el->l_next_free_rec)) {
+		ocfs2_error(inode->i_sb,
+			    "Inode %llu has an extent at cpos %u which can no "
+			    "longer be found.\n",
+			    (unsigned long long)ino, cpos);
+		ret = -EROFS;
+		goto out;
+	}
+
+	rec = &el->l_recs[index];
+
+	BUG_ON(ext_flags != rec->e_flags);
+	/*
+	 * after moving/defraging to new location, the extent is not going
+	 * to be refcounted anymore.
+	 */
+	if (ext_flags & OCFS2_EXT_REFCOUNTED)
+		replace_rec.e_flags = ext_flags & ~OCFS2_EXT_REFCOUNTED;
+	else
+		replace_rec.e_flags = ext_flags;
+
+	ret = ocfs2_journal_access_di(handle, INODE_CACHE(inode),
+				      context->et.et_root_bh,
+				      OCFS2_JOURNAL_ACCESS_WRITE);
+	if (ret) {
+		mlog_errno(ret);
+		goto out;
+	}
+
+	ret = ocfs2_split_extent(handle, &context->et, path, index,
+				 &replace_rec, context->meta_ac,
+				 &context->dealloc);
+	if (ret) {
+		mlog_errno(ret);
+		goto out;
+	}
+
+	ocfs2_journal_dirty(handle, context->et.et_root_bh);
+
+	context->new_phys_cpos = new_p_cpos;
+
+	/*
+	 * need I to append truncate log for old clusters?
+	 */
+	if (old_blkno) {
+		if (ext_flags & OCFS2_EXT_REFCOUNTED)
+			ret = ocfs2_decrease_refcount(inode, handle,
+					ocfs2_blocks_to_clusters(osb->sb,
+								 old_blkno),
+					len, context->meta_ac,
+					&context->dealloc, 1);
+			
+		else
+			ret = ocfs2_truncate_log_append(osb, handle,
+							old_blkno, len);
+	}
+
+out:
+	return ret;
+}
+
 /*
  * lock allocators, and reserving appropriate number of bits for
  * meta blocks and data clusters.
-- 
1.5.5


* [Ocfs2-devel] [PATCH 07/14] Ocfs2/move_extents: defrag a range of extent.
  2011-01-21 10:20 [Ocfs2-devel] [PATCH 0/14] Ocfs2: Online defragmentation V3 Tristan Ye
                   ` (5 preceding siblings ...)
  2011-01-21 10:20 ` [Ocfs2-devel] [PATCH 06/14] Ocfs2/move_extents: move a range of extent Tristan Ye
@ 2011-01-21 10:20 ` Tristan Ye
  2011-01-21 10:20 ` [Ocfs2-devel] [PATCH 08/14] Ocfs2/move_extents: find the victim alloc group, where the given #blk fits Tristan Ye
                   ` (6 subsequent siblings)
  13 siblings, 0 replies; 22+ messages in thread
From: Tristan Ye @ 2011-01-21 10:20 UTC (permalink / raw)
  To: ocfs2-devel

This is a relatively complete function that defragments all or part of an
extent. A single journal handle is held for the whole operation. Logically it
does just one thing more than ocfs2_move_extent(): it claims the new clusters
itself ;-)

Signed-off-by: Tristan Ye <tristan.ye@oracle.com>
---
 fs/ocfs2/move_extents.c |  136 +++++++++++++++++++++++++++++++++++++++++++++++
 1 files changed, 136 insertions(+), 0 deletions(-)

diff --git a/fs/ocfs2/move_extents.c b/fs/ocfs2/move_extents.c
index e28bd7d..071bb12 100644
--- a/fs/ocfs2/move_extents.c
+++ b/fs/ocfs2/move_extents.c
@@ -220,3 +220,139 @@ out:
 
 	return ret;
 }
+
+/*
+ * Using one journal handle to guarantee the data consistency in case
+ * crash happens anywhere.
+ */
+static int ocfs2_defrag_extent(struct ocfs2_move_extents_context *context,
+			       u32 cpos, u32 phys_cpos, u32 len, int ext_flags)
+{
+	int ret, credits = 0, extra_blocks = 0;
+	handle_t *handle;
+	struct inode *inode = context->inode;
+	struct ocfs2_super *osb = OCFS2_SB(inode->i_sb);
+	struct inode *tl_inode = osb->osb_tl_inode;
+	struct ocfs2_refcount_tree *ref_tree = NULL;
+	u32 new_phys_cpos, new_len;
+	u64 phys_blkno = ocfs2_clusters_to_blocks(inode->i_sb, phys_cpos);
+
+	if ((ext_flags & OCFS2_EXT_REFCOUNTED) && len) {
+
+		BUG_ON(!(OCFS2_I(inode)->ip_dyn_features &
+			 OCFS2_HAS_REFCOUNT_FL));
+
+		BUG_ON(!context->refcount_loc);
+
+		ret = ocfs2_lock_refcount_tree(osb, context->refcount_loc, 1,
+					       &ref_tree, NULL);
+		if (ret) {
+			mlog_errno(ret);
+			return ret;
+		}
+		
+		ret = ocfs2_prepare_refcount_change_for_del(inode,
+							context->refcount_loc,
+							phys_blkno,
+							len,
+							&credits,
+							&extra_blocks);
+		if (ret) {
+			mlog_errno(ret);
+			goto out;
+		}
+	}
+
+	ret = ocfs2_lock_allocators_move_extents(inode, &context->et, len, 1,
+						 &context->meta_ac,
+						 &context->data_ac,
+						 extra_blocks, &credits);
+	if (ret) {
+		mlog_errno(ret);
+		goto out;
+	}
+
+	/*
+	 * should be using allocation reservation strategy there?
+	 *
+	 * if (context->data_ac)
+	 *	context->data_ac->ac_resv = &OCFS2_I(inode)->ip_la_data_resv;
+	 */
+
+	mutex_lock(&tl_inode->i_mutex);
+
+	if (ocfs2_truncate_log_needs_flush(osb)) {
+		ret = __ocfs2_flush_truncate_log(osb);
+		if (ret < 0) {
+			mlog_errno(ret);
+			goto out_unlock_mutex;
+		}
+	}
+
+	handle = ocfs2_start_trans(osb, credits);
+	if (IS_ERR(handle)) {
+		ret = PTR_ERR(handle);
+		mlog_errno(ret);
+		goto out_unlock_mutex;
+	}
+
+	ret = __ocfs2_claim_clusters(handle, context->data_ac, 1, len,
+				     &new_phys_cpos, &new_len);
+	if (ret) {
+		mlog_errno(ret);
+		goto out_commit;
+	}
+
+	/*
+	 * we don't make multiple attempts to claim enough clusters here.
+	 * Failing to claim as many clusters as requested is not a disaster:
+	 * it only means that part of the defragmentation or extent
+	 * movement is left undone. Users can retry whenever they wish,
+	 * since they get back '-ENOSPC' along with the completed length
+	 * of this movement.
+	 */
+	if (new_len != len) {
+		mlog(0, "len_claimed: %u, len: %u\n", new_len, len);
+		context->range->me_flags &= ~OCFS2_MOVE_EXT_FL_COMPLETE;
+		ret = -ENOSPC;
+		goto out_commit;
+	}
+
+	mlog(0, "cpos: %u, phys_cpos: %u, new_phys_cpos: %u\n", cpos,
+	     phys_cpos, new_phys_cpos);
+
+	ret = __ocfs2_move_extent(handle, context, cpos, len, phys_cpos,
+				  new_phys_cpos, ext_flags);
+	if (ret)
+		mlog_errno(ret);
+
+	/*
+	 * Here we should write the new page out first if we are
+	 * in write-back mode.
+	 */
+	ret = ocfs2_cow_sync_writeback(inode->i_sb, context->inode, cpos, len);
+	if (ret)
+		mlog_errno(ret);
+
+out_commit:
+	ocfs2_commit_trans(osb, handle);
+
+out_unlock_mutex:
+	mutex_unlock(&tl_inode->i_mutex);
+
+	if (context->data_ac) {
+		ocfs2_free_alloc_context(context->data_ac);
+		context->data_ac = NULL;
+	}
+
+	if (context->meta_ac) {
+		ocfs2_free_alloc_context(context->meta_ac);
+		context->meta_ac = NULL;
+	}
+
+out:
+	if (ref_tree)
+		ocfs2_unlock_refcount_tree(osb, ref_tree, 1);
+
+	return ret;
+}
-- 
1.5.5

* [Ocfs2-devel] [PATCH 08/14] Ocfs2/move_extents: find the victim alloc group, where the given #blk fits.
  2011-01-21 10:20 [Ocfs2-devel] [PATCH 0/14] Ocfs2: Online defragmentaion V3 Tristan Ye
                   ` (6 preceding siblings ...)
  2011-01-21 10:20 ` [Ocfs2-devel] [PATCH 07/14] Ocfs2/move_extents: defrag " Tristan Ye
@ 2011-01-21 10:20 ` Tristan Ye
  2011-01-21 10:20 ` [Ocfs2-devel] [PATCH 09/14] Ocfs2/move_extents: helper to validate and adjust moving goal Tristan Ye
                   ` (5 subsequent siblings)
  13 siblings, 0 replies; 22+ messages in thread
From: Tristan Ye @ 2011-01-21 10:20 UTC (permalink / raw)
  To: ocfs2-devel

This function tries to locate the alloc group in which a given physical block
resides. Given the block number, it returns to the caller the buffer_head of
the victim group descriptor and the offset of the block within that group.

Signed-off-by: Tristan Ye <tristan.ye@oracle.com>
---
 fs/ocfs2/move_extents.c |  104 +++++++++++++++++++++++++++++++++++++++++++++++
 1 files changed, 104 insertions(+), 0 deletions(-)

diff --git a/fs/ocfs2/move_extents.c b/fs/ocfs2/move_extents.c
index 071bb12..2678605 100644
--- a/fs/ocfs2/move_extents.c
+++ b/fs/ocfs2/move_extents.c
@@ -356,3 +356,107 @@ out:
 
 	return ret;
 }
+
+/*
+ * find the victim alloc group, where #blkno fits.
+ */
+static int ocfs2_find_victim_alloc_group(struct inode *inode,
+					 u64 vict_blkno,
+					 int type, int slot,
+					 int *vict_bit,
+					 struct buffer_head **ret_bh)
+{
+	int ret, i, blocks_per_unit = 1;
+	u64 blkno;
+	char namebuf[40];
+
+	struct ocfs2_super *osb = OCFS2_SB(inode->i_sb);
+	struct buffer_head *ac_bh = NULL, *gd_bh = NULL;
+	struct ocfs2_chain_list *cl;
+	struct ocfs2_chain_rec *rec;
+	struct ocfs2_dinode *ac_dinode;
+	struct ocfs2_group_desc *bg;
+
+	ocfs2_sprintf_system_inode_name(namebuf, sizeof(namebuf), type, slot);
+	ret = ocfs2_lookup_ino_from_name(osb->sys_root_inode, namebuf,
+					 strlen(namebuf), &blkno);
+	if (ret) {
+		ret = -ENOENT;
+		goto out;
+	}
+
+	ret = ocfs2_read_blocks_sync(osb, blkno, 1, &ac_bh);
+	if (ret) {
+		mlog_errno(ret);
+		goto out;
+	}
+
+	ac_dinode = (struct ocfs2_dinode *)ac_bh->b_data;
+	cl = &(ac_dinode->id2.i_chain);
+	rec = &(cl->cl_recs[0]);
+
+	if (type == GLOBAL_BITMAP_SYSTEM_INODE)
+		blocks_per_unit <<= (osb->s_clustersize_bits -
+						inode->i_sb->s_blocksize_bits);
+	/*
+	 * 'vict_blkno' was out of the valid range.
+	 */
+	if ((vict_blkno < le64_to_cpu(rec->c_blkno)) ||
+	    (vict_blkno >= (le32_to_cpu(ac_dinode->id1.bitmap1.i_total) *
+				blocks_per_unit))) {
+		ret = -EINVAL;
+		goto out;
+	}
+
+	for (i = 0; i < le16_to_cpu(cl->cl_next_free_rec); i++) {
+
+		rec = &(cl->cl_recs[i]);
+		if (!rec)
+			continue;
+
+		bg = NULL;
+
+		do {
+			if (!bg)
+				blkno = le64_to_cpu(rec->c_blkno);
+			else
+				blkno = le64_to_cpu(bg->bg_next_group);
+
+			if (gd_bh) {
+				brelse(gd_bh);
+				gd_bh = NULL;
+			}
+
+			ret = ocfs2_read_blocks_sync(osb, blkno, 1, &gd_bh);
+			if (ret) {
+				mlog_errno(ret);
+				goto out;
+			}
+
+			bg = (struct ocfs2_group_desc *)gd_bh->b_data;
+
+			if (vict_blkno < (le64_to_cpu(bg->bg_blkno) +
+						le16_to_cpu(bg->bg_bits))) {
+
+				*ret_bh = gd_bh;
+				*vict_bit = (vict_blkno - blkno) /
+							blocks_per_unit;
+				mlog(0, "find the victim group: #%llu, "
+				     "total_bits: %u, vict_bit: %u\n",
+				     blkno, le16_to_cpu(bg->bg_bits),
+				     *vict_bit);
+				goto out;
+			}
+
+		} while (le64_to_cpu(bg->bg_next_group));
+	}
+
+	ret = -EINVAL;
+out:
+	brelse(ac_bh);
+
+	/*
+	 * caller has to release the gd_bh properly.
+	 */
+	return ret;
+}
-- 
1.5.5
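For readers following along, the chain walk in ocfs2_find_victim_alloc_group()
can be modelled in userspace roughly as below. This is only a sketch under
stated assumptions: `struct group` and find_victim_group() are made-up names,
groups are linked by plain pointers instead of on-disk block numbers, and the
sketch checks both ends of each group's range in blocks, which is a choice made
here and is stricter than the comparison in the posted code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for an on-disk group descriptor. */
struct group {
	uint64_t blkno;           /* first block covered by this group */
	uint16_t bits;            /* number of bitmap bits in the group */
	const struct group *next; /* next group in the same chain, or NULL */
};

/*
 * Walk every chain until a group whose block range contains vict_blkno
 * is found. Returns that group and stores the bit offset in *vict_bit,
 * or returns NULL when no group covers the block.
 */
static const struct group *find_victim_group(const struct group *const *chains,
					      size_t nr_chains,
					      uint64_t vict_blkno,
					      unsigned int blocks_per_unit,
					      unsigned int *vict_bit)
{
	assert(blocks_per_unit > 0);

	for (size_t i = 0; i < nr_chains; i++) {
		for (const struct group *g = chains[i]; g; g = g->next) {
			/* One bitmap bit stands for blocks_per_unit blocks. */
			uint64_t end = g->blkno +
				       (uint64_t)g->bits * blocks_per_unit;

			if (vict_blkno >= g->blkno && vict_blkno < end) {
				*vict_bit = (unsigned int)
					((vict_blkno - g->blkno) /
					 blocks_per_unit);
				return g;
			}
		}
	}

	return NULL;
}
```

With 4 blocks per bit, a block 10 blocks into a group lands on bit 2 of that
group; a block that falls between two groups yields NULL, which corresponds to
the -EINVAL path in the patch.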

* [Ocfs2-devel] [PATCH 09/14] Ocfs2/move_extents: helper to validate and adjust moving goal.
  2011-01-21 10:20 [Ocfs2-devel] [PATCH 0/14] Ocfs2: Online defragmentaion V3 Tristan Ye
                   ` (7 preceding siblings ...)
  2011-01-21 10:20 ` [Ocfs2-devel] [PATCH 08/14] Ocfs2/move_extents: find the victim alloc group, where the given #blk fits Tristan Ye
@ 2011-01-21 10:20 ` Tristan Ye
  2011-01-21 10:20 ` [Ocfs2-devel] [PATCH 10/14] Ocfs2/move_extents: helper to probe a proper region to move in an alloc group Tristan Ye
                   ` (4 subsequent siblings)
  13 siblings, 0 replies; 22+ messages in thread
From: Tristan Ye @ 2011-01-21 10:20 UTC (permalink / raw)
  To: ocfs2-devel

This is a first, best-effort attempt to validate and adjust the goal (a
physical address in blocks). It cannot guarantee that the later operation will
always succeed, since the global_bitmap may change in the meantime.

Signed-off-by: Tristan Ye <tristan.ye@oracle.com>
---
 fs/ocfs2/move_extents.c |   61 +++++++++++++++++++++++++++++++++++++++++++++++
 1 files changed, 61 insertions(+), 0 deletions(-)

diff --git a/fs/ocfs2/move_extents.c b/fs/ocfs2/move_extents.c
index 2678605..f4410e3 100644
--- a/fs/ocfs2/move_extents.c
+++ b/fs/ocfs2/move_extents.c
@@ -460,3 +460,64 @@ out:
 	 */
 	return ret;
 }
+
+/*
+ * XXX: helper to validate and adjust moving goal.
+ */
+static int ocfs2_validate_and_adjust_move_goal(struct inode *inode,
+					       struct ocfs2_move_extents *range)
+{
+	int ret, goal_bit = 0;
+
+	struct buffer_head *gd_bh = NULL;
+	struct ocfs2_group_desc *bg;
+	struct ocfs2_super *osb = OCFS2_SB(inode->i_sb);
+	int c_to_b = 1 << (osb->s_clustersize_bits -
+					inode->i_sb->s_blocksize_bits);
+
+	/*
+	 * validate goal sits within global_bitmap, and return the victim
+	 * group desc
+	 */
+	ret = ocfs2_find_victim_alloc_group(inode, range->me_goal,
+					    GLOBAL_BITMAP_SYSTEM_INODE,
+					    OCFS2_INVALID_SLOT,
+					    &goal_bit, &gd_bh);
+	if (ret)
+		goto out;
+
+	bg = (struct ocfs2_group_desc *)gd_bh->b_data;
+
+	/*
+	 * make goal become cluster aligned.
+	 */
+	if (range->me_goal % c_to_b)
+		range->me_goal = range->me_goal / c_to_b * c_to_b;
+
+	/*
+	 * moving goal is not allowed to start with a group desc block
+	 * (#0 blk), let's compromise to the next cluster.
+	 */
+	if (range->me_goal == le64_to_cpu(bg->bg_blkno))
+		range->me_goal += c_to_b;
+
+	/*
+	 * movement must not cross two groups.
+	 */
+	if ((le16_to_cpu(bg->bg_bits) - goal_bit) * osb->s_clustersize <
+								range->me_len) {
+		ret = -EINVAL;
+		goto out;
+	}
+	/*
+	 * more exact validations/adjustments will be performed later during
+	 * moving operation for each extent range.
+	 */
+	mlog(0, "extents get ready to be moved to #%llu block\n",
+	     range->me_goal);
+
+out:
+	brelse(gd_bh);
+
+	return ret;
+}
-- 
1.5.5
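The three adjustments made by ocfs2_validate_and_adjust_move_goal() (align to
a cluster, step past the group descriptor, refuse to cross the group) can be
tried out in userspace with something like the following. It is a sketch, not
the patch's code: adjust_move_goal() and its parameters are invented for
illustration, the length is taken in clusters rather than bytes, the group is
assumed to start on a cluster boundary, and the goal bit is recomputed after
the adjustment, which is an assumption of this sketch.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/*
 * Hypothetical model: the group starts at block group_blkno and holds
 * group_bits clusters, each c_to_b blocks long. *goal is a block number
 * already known to lie inside the group.
 */
static int adjust_move_goal(uint64_t *goal, uint64_t group_blkno,
			    unsigned int group_bits, unsigned int c_to_b,
			    uint64_t len_clusters)
{
	uint64_t goal_bit;

	assert(c_to_b > 0);

	/* Round the goal down to a cluster boundary. */
	*goal -= *goal % c_to_b;

	/* The first cluster holds the group descriptor; step past it. */
	if (*goal == group_blkno)
		*goal += c_to_b;

	/* Recompute the bit after adjusting, then refuse to cross the group. */
	goal_bit = (*goal - group_blkno) / c_to_b;
	if (goal_bit >= group_bits || group_bits - goal_bit < len_clusters)
		return -EINVAL;

	return 0;
}
```

For a group at block 1024 with 16 clusters of 8 blocks each, a goal of 1030 is
first rounded down to 1024 and then bumped to 1032, since block 1024 is the
descriptor itself.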

* [Ocfs2-devel] [PATCH 10/14] Ocfs2/move_extents: helper to probe a proper region to move in an alloc group.
  2011-01-21 10:20 [Ocfs2-devel] [PATCH 0/14] Ocfs2: Online defragmentaion V3 Tristan Ye
                   ` (8 preceding siblings ...)
  2011-01-21 10:20 ` [Ocfs2-devel] [PATCH 09/14] Ocfs2/move_extents: helper to validate and adjust moving goal Tristan Ye
@ 2011-01-21 10:20 ` Tristan Ye
  2011-01-21 10:20 ` [Ocfs2-devel] [PATCH 11/14] Ocfs2/move_extents: helpers to update the group descriptor and global bitmap inode Tristan Ye
                   ` (3 subsequent siblings)
  13 siblings, 0 replies; 22+ messages in thread
From: Tristan Ye @ 2011-01-21 10:20 UTC (permalink / raw)
  To: ocfs2-devel

Before moving extents, we had better probe the alloc group, starting from
'goal_blk', for a contiguous region that fits the wanted movement. We even
make a best-effort try by compromising to a threshold around the given goal.

Signed-off-by: Tristan Ye <tristan.ye@oracle.com>
---
 fs/ocfs2/move_extents.c |   39 +++++++++++++++++++++++++++++++++++++++
 1 files changed, 39 insertions(+), 0 deletions(-)

diff --git a/fs/ocfs2/move_extents.c b/fs/ocfs2/move_extents.c
index f4410e3..61b2871 100644
--- a/fs/ocfs2/move_extents.c
+++ b/fs/ocfs2/move_extents.c
@@ -521,3 +521,42 @@ out:
 
 	return ret;
 }
+
+static void ocfs2_probe_alloc_group(struct inode *inode, struct buffer_head *bh,
+				    int *goal_bit, u32 move_len, u32 max_hop,
+				    u32 *phys_cpos)
+{
+	int i, used, last_free_bits = 0, base_bit = *goal_bit;
+	struct ocfs2_group_desc *gd = (struct ocfs2_group_desc *)bh->b_data;
+	u32 base_cpos = ocfs2_blocks_to_clusters(inode->i_sb,
+						 le64_to_cpu(gd->bg_blkno));
+
+	for (i = base_bit; i < le16_to_cpu(gd->bg_bits); i++) {
+
+		used = ocfs2_test_bit(i, (unsigned long *)gd->bg_bitmap);
+		if (used) {
+			/*
+			 * we even tried searching the free chunk by jumping
+			 * a 'max_hop' distance, but still failed.
+			 */
+			if ((i - base_bit) > max_hop) {
+				*phys_cpos = 0;
+				break;
+			}
+
+			if (last_free_bits)
+				last_free_bits = 0;
+
+			continue;
+		} else
+			last_free_bits++;
+
+		if (last_free_bits == move_len) {
+			*goal_bit = i;
+			*phys_cpos = base_cpos + i;
+			break;
+		}
+	}
+
+	mlog(0, "found phys_cpos: %u to fit the wanted moving.\n", *phys_cpos);
+}
-- 
1.5.5
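The search done by ocfs2_probe_alloc_group() is a scan for a run of free bits
that gives up once it has hopped too far from the goal. A userspace sketch of
that idea follows. probe_free_run() and the bool-array bitmap are stand-ins
invented here; note also that this sketch reports the first bit of the free
run, which is a choice made for the example, whereas the posted helper stores
the loop index at which the run reaches move_len.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Scan used[] (true = allocated) from goal_bit for move_len contiguous
 * free bits. Gives up once an allocated bit is found more than max_hop
 * bits past the goal. Returns the first bit of the free run, or -1.
 */
static long probe_free_run(const bool *used, size_t nbits, size_t goal_bit,
			   size_t move_len, size_t max_hop)
{
	size_t run = 0;

	assert(move_len > 0);

	for (size_t i = goal_bit; i < nbits; i++) {
		if (used[i]) {
			/* Too far from the goal already: stop searching. */
			if (i - goal_bit > max_hop)
				return -1;
			run = 0;
			continue;
		}
		if (++run == move_len)
			return (long)(i - move_len + 1);
	}

	return -1;
}
```

A small max_hop keeps the moved extent close to the requested goal at the cost
of failing more often; a large one trades locality for a better chance of
finding room.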

* [Ocfs2-devel] [PATCH 11/14] Ocfs2/move_extents: helpers to update the group descriptor and global bitmap inode.
  2011-01-21 10:20 [Ocfs2-devel] [PATCH 0/14] Ocfs2: Online defragmentaion V3 Tristan Ye
                   ` (9 preceding siblings ...)
  2011-01-21 10:20 ` [Ocfs2-devel] [PATCH 10/14] Ocfs2/move_extents: helper to probe a proper region to move in an alloc group Tristan Ye
@ 2011-01-21 10:20 ` Tristan Ye
  2011-01-21 10:20 ` [Ocfs2-devel] [PATCH 12/14] Ocfs2/move_extents: move entire/partial extent Tristan Ye
                   ` (2 subsequent siblings)
  13 siblings, 0 replies; 22+ messages in thread
From: Tristan Ye @ 2011-01-21 10:20 UTC (permalink / raw)
  To: ocfs2-devel

These helpers are borrowed from alloc.c; they may be made public there
later.

Signed-off-by: Tristan Ye <tristan.ye@oracle.com>
---
 fs/ocfs2/move_extents.c |   80 +++++++++++++++++++++++++++++++++++++++++++++++
 1 files changed, 80 insertions(+), 0 deletions(-)

diff --git a/fs/ocfs2/move_extents.c b/fs/ocfs2/move_extents.c
index 61b2871..2ee74cd 100644
--- a/fs/ocfs2/move_extents.c
+++ b/fs/ocfs2/move_extents.c
@@ -560,3 +560,83 @@ static void ocfs2_probe_alloc_group(struct inode *inode, struct buffer_head *bh,
 
 	mlog(0, "found phys_cpos: %u to fit the wanted moving.\n", *phys_cpos);
 }
+
+static int ocfs2_alloc_dinode_update_counts(struct inode *inode,
+				       handle_t *handle,
+				       struct buffer_head *di_bh,
+				       u32 num_bits,
+				       u16 chain)
+{
+	int ret;
+	u32 tmp_used;
+	struct ocfs2_dinode *di = (struct ocfs2_dinode *) di_bh->b_data;
+	struct ocfs2_chain_list *cl =
+				(struct ocfs2_chain_list *) &di->id2.i_chain;
+
+	ret = ocfs2_journal_access_di(handle, INODE_CACHE(inode), di_bh,
+				      OCFS2_JOURNAL_ACCESS_WRITE);
+	if (ret < 0) {
+		mlog_errno(ret);
+		goto out;
+	}
+
+	tmp_used = le32_to_cpu(di->id1.bitmap1.i_used);
+	di->id1.bitmap1.i_used = cpu_to_le32(num_bits + tmp_used);
+	le32_add_cpu(&cl->cl_recs[chain].c_free, -num_bits);
+	ocfs2_journal_dirty(handle, di_bh);
+
+out:
+	return ret;
+}
+
+static inline int ocfs2_block_group_set_bits(handle_t *handle,
+					     struct inode *alloc_inode,
+					     struct ocfs2_group_desc *bg,
+					     struct buffer_head *group_bh,
+					     unsigned int bit_off,
+					     unsigned int num_bits)
+{
+	int status;
+	void *bitmap = bg->bg_bitmap;
+	int journal_type = OCFS2_JOURNAL_ACCESS_WRITE;
+
+	mlog_entry_void();
+
+	/* All callers get the descriptor via
+	 * ocfs2_read_group_descriptor().  Any corruption is a code bug. */
+	BUG_ON(!OCFS2_IS_VALID_GROUP_DESC(bg));
+	BUG_ON(le16_to_cpu(bg->bg_free_bits_count) < num_bits);
+
+	mlog(0, "block_group_set_bits: off = %u, num = %u\n", bit_off,
+	     num_bits);
+
+	if (ocfs2_is_cluster_bitmap(alloc_inode))
+		journal_type = OCFS2_JOURNAL_ACCESS_UNDO;
+
+	status = ocfs2_journal_access_gd(handle,
+					 INODE_CACHE(alloc_inode),
+					 group_bh,
+					 journal_type);
+	if (status < 0) {
+		mlog_errno(status);
+		goto bail;
+	}
+
+	le16_add_cpu(&bg->bg_free_bits_count, -num_bits);
+	if (le16_to_cpu(bg->bg_free_bits_count) > le16_to_cpu(bg->bg_bits)) {
+		ocfs2_error(alloc_inode->i_sb, "Group descriptor # %llu has bit"
+			    " count %u but claims %u are freed. num_bits %d",
+			    (unsigned long long)le64_to_cpu(bg->bg_blkno),
+			    le16_to_cpu(bg->bg_bits),
+			    le16_to_cpu(bg->bg_free_bits_count), num_bits);
+		return -EROFS;
+	}
+	while (num_bits--)
+		ocfs2_set_bit(bit_off++, bitmap);
+
+	ocfs2_journal_dirty(handle, group_bh);
+
+bail:
+	mlog_exit(status);
+	return status;
+}
-- 
1.5.5
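What the two helpers above have in common is that they flip bitmap bits and
keep a free counter in step with them. A much reduced userspace model is
sketched below. `struct group_model` and group_set_bits() are hypothetical
names, there is no journaling or endian conversion, and an oversized request
is rejected with -EINVAL where the kernel helper uses BUG_ON().

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Hypothetical in-memory model of one allocation group. */
struct group_model {
	uint8_t bitmap[8];  /* 64 bits, one per cluster */
	uint16_t bits;      /* total bits in the group */
	uint16_t free_bits; /* bits still unallocated */
};

/*
 * Mark num_bits bits as allocated starting at bit_off and keep the free
 * counter in step. Returns -EINVAL if the request does not fit.
 */
static int group_set_bits(struct group_model *g, unsigned int bit_off,
			  unsigned int num_bits)
{
	unsigned int total = g->bits;

	assert(total <= 64);

	if (num_bits > g->free_bits || bit_off > total ||
	    num_bits > total - bit_off)
		return -EINVAL;

	g->free_bits = (uint16_t)(g->free_bits - num_bits);
	while (num_bits--) {
		g->bitmap[bit_off / 8] |= (uint8_t)(1u << (bit_off % 8));
		bit_off++;
	}

	return 0;
}
```

Checking the request before touching the counter means a rejected call leaves
the model unchanged.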

* [Ocfs2-devel] [PATCH 12/14] Ocfs2/move_extents: move entire/partial extent.
  2011-01-21 10:20 [Ocfs2-devel] [PATCH 0/14] Ocfs2: Online defragmentaion V3 Tristan Ye
                   ` (10 preceding siblings ...)
  2011-01-21 10:20 ` [Ocfs2-devel] [PATCH 11/14] Ocfs2/move_extents: helpers to update the group descriptor and global bitmap inode Tristan Ye
@ 2011-01-21 10:20 ` Tristan Ye
  2011-01-21 10:20 ` [Ocfs2-devel] [PATCH 13/14] Ocfs2/move_extents: helper to calculate the defraging length in one run Tristan Ye
  2011-01-21 10:20 ` [Ocfs2-devel] [PATCH 14/14] Ocfs2/move_extents: move/defrag extents within a certain range Tristan Ye
  13 siblings, 0 replies; 22+ messages in thread
From: Tristan Ye @ 2011-01-21 10:20 UTC (permalink / raw)
  To: ocfs2-devel

ocfs2_move_extent() validates the goal offset (in blocks) to which the
extents are to be moved. What's more, when the original goal cannot fit
the movement, it compromises a bit and probes for an appropriate region
around the given goal offset.

Signed-off-by: Tristan Ye <tristan.ye@oracle.com>
---
 fs/ocfs2/move_extents.c |  165 +++++++++++++++++++++++++++++++++++++++++++++++
 1 files changed, 165 insertions(+), 0 deletions(-)

diff --git a/fs/ocfs2/move_extents.c b/fs/ocfs2/move_extents.c
index 2ee74cd..c2b7638 100644
--- a/fs/ocfs2/move_extents.c
+++ b/fs/ocfs2/move_extents.c
@@ -640,3 +640,168 @@ bail:
 	mlog_exit(status);
 	return status;
 }
+
+static int ocfs2_move_extent(struct ocfs2_move_extents_context *context,
+			     u32 cpos, u32 phys_cpos, u32 *new_phys_cpos,
+			     u32 len, int ext_flags)
+{
+	int ret, credits = 0, extra_blocks = 0, goal_bit = 0;
+	handle_t *handle;
+	struct inode *inode = context->inode;
+	struct ocfs2_super *osb = OCFS2_SB(inode->i_sb);
+	struct inode *tl_inode = osb->osb_tl_inode;
+	struct inode *gb_inode = NULL;
+	struct buffer_head *gb_bh = NULL;
+	struct buffer_head *gd_bh = NULL;
+	struct ocfs2_group_desc *gd;
+	struct ocfs2_refcount_tree *ref_tree = NULL;
+	u32 move_max_hop = ocfs2_blocks_to_clusters(inode->i_sb,
+						    context->range->me_thresh);
+	u64 phys_blkno, new_phys_blkno;
+
+	phys_blkno = ocfs2_clusters_to_blocks(inode->i_sb, phys_cpos);
+
+	if ((ext_flags & OCFS2_EXT_REFCOUNTED) && len) {
+
+		BUG_ON(!(OCFS2_I(inode)->ip_dyn_features &
+			 OCFS2_HAS_REFCOUNT_FL));
+
+		BUG_ON(!context->refcount_loc);
+
+		ret = ocfs2_lock_refcount_tree(osb, context->refcount_loc, 1,
+					       &ref_tree, NULL);
+		if (ret) {
+			mlog_errno(ret);
+			return ret;
+		}
+		
+		ret = ocfs2_prepare_refcount_change_for_del(inode,
+							context->refcount_loc,
+							phys_blkno,
+							len,
+							&credits,
+							&extra_blocks);
+		if (ret) {
+			mlog_errno(ret);
+			goto out;
+		}
+	}
+
+	ret = ocfs2_lock_allocators_move_extents(inode, &context->et, len, 1,
+						 &context->meta_ac,
+						 NULL, extra_blocks, &credits);
+	if (ret) {
+		mlog_errno(ret);
+		goto out;
+	}
+
+	/*
+	 * need to count 2 extra credits for global_bitmap inode and
+	 * group descriptor.
+	 */
+	credits += OCFS2_INODE_UPDATE_CREDITS + 1;
+
+	/*
+	 * ocfs2_move_extent() didn't reserve any clusters in lock_allocators()
+	 * logic, while we still need to lock the global_bitmap.
+	 */
+	gb_inode = ocfs2_get_system_file_inode(osb, GLOBAL_BITMAP_SYSTEM_INODE,
+					       OCFS2_INVALID_SLOT);
+	if (!gb_inode) {
+		mlog(ML_ERROR, "unable to get global_bitmap inode\n");
+		ret = -EIO;
+		goto out;
+	}
+
+	mutex_lock(&gb_inode->i_mutex);
+
+	ret = ocfs2_inode_lock(gb_inode, &gb_bh, 1);
+	if (ret) {
+		mlog_errno(ret);
+		goto out_unlock_gb_mutex;
+	}
+
+	mutex_lock(&tl_inode->i_mutex);
+
+	handle = ocfs2_start_trans(osb, credits);
+	if (IS_ERR(handle)) {
+		ret = PTR_ERR(handle);
+		mlog_errno(ret);
+		goto out_unlock_tl_inode;
+	}
+
+	new_phys_blkno = ocfs2_clusters_to_blocks(inode->i_sb, *new_phys_cpos);
+	ret = ocfs2_find_victim_alloc_group(inode, new_phys_blkno,
+					    GLOBAL_BITMAP_SYSTEM_INODE,
+					    OCFS2_INVALID_SLOT,
+					    &goal_bit, &gd_bh);
+	if (ret) {
+		mlog_errno(ret);
+		goto out_commit;
+	}
+
+	/*
+	 * probe the victim cluster group to find a proper
+	 * region to fit the wanted movement; it will even perform
+	 * a best-effort attempt by compromising to a threshold
+	 * around the goal.
+	 */
+	ocfs2_probe_alloc_group(inode, gd_bh, &goal_bit, len, move_max_hop,
+				new_phys_cpos);
+	if (!*new_phys_cpos) {
+		ret = -ENOSPC;
+		goto out_commit;
+	}
+
+	ret = __ocfs2_move_extent(handle, context, cpos, len, phys_cpos,
+				  *new_phys_cpos, ext_flags);
+	if (ret) {
+		mlog_errno(ret);
+		goto out_commit;
+	}
+
+	gd = (struct ocfs2_group_desc *)gd_bh->b_data;
+	ret = ocfs2_alloc_dinode_update_counts(gb_inode, handle, gb_bh, len,
+					       le16_to_cpu(gd->bg_chain));
+	if (ret) {
+		mlog_errno(ret);
+		goto out_commit;
+	}
+
+	ret = ocfs2_block_group_set_bits(handle, gb_inode, gd, gd_bh,
+					 goal_bit, len);
+	if (ret)
+		mlog_errno(ret);
+
+	/*
+	 * Here we should write the new page out first if we are
+	 * in write-back mode.
+	 */
+	ret = ocfs2_cow_sync_writeback(inode->i_sb, context->inode, cpos, len);
+	if (ret)
+		mlog_errno(ret);
+
+out_commit:
+	ocfs2_commit_trans(osb, handle);
+	brelse(gd_bh);
+
+out_unlock_tl_inode:
+	mutex_unlock(&tl_inode->i_mutex);
+
+	ocfs2_inode_unlock(gb_inode, 1);
+out_unlock_gb_mutex:
+	mutex_unlock(&gb_inode->i_mutex);
+	brelse(gb_bh);
+	iput(gb_inode);
+
+out:
+	if (context->meta_ac) {
+		ocfs2_free_alloc_context(context->meta_ac);
+		context->meta_ac = NULL;
+	}
+
+	if (ref_tree)
+		ocfs2_unlock_refcount_tree(osb, ref_tree, 1);
+
+	return ret;
+}
-- 
1.5.5

* [Ocfs2-devel] [PATCH 13/14] Ocfs2/move_extents: helper to calculate the defraging length in one run.
  2011-01-21 10:20 [Ocfs2-devel] [PATCH 0/14] Ocfs2: Online defragmentaion V3 Tristan Ye
                   ` (11 preceding siblings ...)
  2011-01-21 10:20 ` [Ocfs2-devel] [PATCH 12/14] Ocfs2/move_extents: move entire/partial extent Tristan Ye
@ 2011-01-21 10:20 ` Tristan Ye
  2011-01-21 10:20 ` [Ocfs2-devel] [PATCH 14/14] Ocfs2/move_extents: move/defrag extents within a certain range Tristan Ye
  13 siblings, 0 replies; 22+ messages in thread
From: Tristan Ye @ 2011-01-21 10:20 UTC (permalink / raw)
  To: ocfs2-devel

This helper calculates the defrag length for one run according to a threshold:
it keeps accumulating extents until the threshold is met, and skips a LARGE
extent if it encounters one.

Signed-off-by: Tristan Ye <tristan.ye@oracle.com>
---
 fs/ocfs2/move_extents.c |   30 ++++++++++++++++++++++++++++++
 1 files changed, 30 insertions(+), 0 deletions(-)

diff --git a/fs/ocfs2/move_extents.c b/fs/ocfs2/move_extents.c
index c2b7638..bad4f15 100644
--- a/fs/ocfs2/move_extents.c
+++ b/fs/ocfs2/move_extents.c
@@ -805,3 +805,33 @@ out:
 
 	return ret;
 }
+
+/*
+ * Helper to calculate the defraging length in one run according to threshold.
+ */
+static void ocfs2_calc_extent_defrag_len(u32 *alloc_size, u32 *len_defraged,
+					 u32 threshold, int *skip)
+{
+	if ((*alloc_size + *len_defraged) < threshold) {
+		/*
+		 * proceed defragmentation until we meet the thresh
+		 */
+		*len_defraged += *alloc_size;
+	} else if (*len_defraged == 0) {
+		/*
+		 * XXX: skip a large extent.
+		 */
+		*skip = 1;
+	} else {
+		/*
+		 * split this extent to coalesce with former pieces as
+		 * to reach the threshold.
+		 *
+		 * we're done here with one cycle of defragmentation
+		 * in a size of 'thresh', resetting 'len_defraged'
+		 * forces a new defragmentation.
+		 */
+		*alloc_size = threshold - *len_defraged;
+		*len_defraged = 0;
+	}
+}
-- 
1.5.5
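To see the three branches of ocfs2_calc_extent_defrag_len() in action, the
helper can be copied into a userspace file and fed a few extent sizes by hand.
The copy below is only for experimentation: calc_defrag_len() is a renamed
duplicate using <stdint.h> types, and the assert on the threshold is an
addition of this sketch, not something the patch does.

```c
#include <assert.h>
#include <stdint.h>

/* Userspace copy of the helper's three-way decision, for experimentation. */
static void calc_defrag_len(uint32_t *alloc_size, uint32_t *len_defraged,
			    uint32_t threshold, int *skip)
{
	assert(threshold > 0);

	if (*alloc_size + *len_defraged < threshold) {
		/* Still below the threshold: take the whole extent. */
		*len_defraged += *alloc_size;
	} else if (*len_defraged == 0) {
		/* Nothing accumulated and the extent is already large. */
		*skip = 1;
	} else {
		/* Take just enough to reach the threshold, then restart. */
		*alloc_size = threshold - *len_defraged;
		*len_defraged = 0;
	}
}
```

With a threshold of 8 clusters, extents of 3 and 4 clusters are taken whole
(7 accumulated), a following 5-cluster extent is cut down to 1 cluster so the
run totals exactly 8, and a 10-cluster extent seen with nothing accumulated is
skipped.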

* [Ocfs2-devel] [PATCH 14/14] Ocfs2/move_extents: move/defrag extents within a certain range.
  2011-01-21 10:20 [Ocfs2-devel] [PATCH 0/14] Ocfs2: Online defragmentaion V3 Tristan Ye
                   ` (12 preceding siblings ...)
  2011-01-21 10:20 ` [Ocfs2-devel] [PATCH 13/14] Ocfs2/move_extents: helper to calculate the defraging length in one run Tristan Ye
@ 2011-01-21 10:20 ` Tristan Ye
  13 siblings, 0 replies; 22+ messages in thread
From: Tristan Ye @ 2011-01-21 10:20 UTC (permalink / raw)
  To: ocfs2-devel

The basic logic of moving extents for a file is much like the hole-punching
sequence: walk the extents within the user-specified range, calculate an
appropriate length to defrag/move, then let ocfs2_defrag_extent() or
ocfs2_move_extent() do the actual moving.

This function ends up reporting 'OCFS2_MOVE_EXT_FL_COMPLETE' to userspace if
the operation completes successfully.

Signed-off-by: Tristan Ye <tristan.ye@oracle.com>
---
 fs/ocfs2/ioctl.c        |    5 +
 fs/ocfs2/move_extents.c |  312 +++++++++++++++++++++++++++++++++++++++++++++++
 fs/ocfs2/move_extents.h |    2 +
 3 files changed, 319 insertions(+), 0 deletions(-)

diff --git a/fs/ocfs2/ioctl.c b/fs/ocfs2/ioctl.c
index 7a48681..b8fe7a7 100644
--- a/fs/ocfs2/ioctl.c
+++ b/fs/ocfs2/ioctl.c
@@ -23,6 +23,7 @@
 #include "ioctl.h"
 #include "resize.h"
 #include "refcounttree.h"
+#include "move_extents.h"
 
 #include <linux/ext2_fs.h>
 
@@ -523,6 +524,8 @@ long ocfs2_ioctl(struct file *filp, unsigned int cmd, unsigned long arg)
 			return -EFAULT;
 
 		return ocfs2_info_handle(inode, &info, 0);
+	case OCFS2_IOC_MOVE_EXT:
+		return ocfs2_ioctl_move_extents(filp, (void __user *)arg);
 	default:
 		return -ENOTTY;
 	}
@@ -565,6 +568,8 @@ long ocfs2_compat_ioctl(struct file *file, unsigned cmd, unsigned long arg)
 			return -EFAULT;
 
 		return ocfs2_info_handle(inode, &info, 1);
+	case OCFS2_IOC_MOVE_EXT:
+		break;
 	default:
 		return -ENOIOCTLCMD;
 	}
diff --git a/fs/ocfs2/move_extents.c b/fs/ocfs2/move_extents.c
index bad4f15..aec9a92 100644
--- a/fs/ocfs2/move_extents.c
+++ b/fs/ocfs2/move_extents.c
@@ -835,3 +835,315 @@ static void ocfs2_calc_extent_defrag_len(u32 *alloc_size, u32 *len_defraged,
 		*len_defraged = 0;
 	}
 }
+
+static int __ocfs2_move_extents_range(struct buffer_head *di_bh,
+				struct ocfs2_move_extents_context *context)
+{
+	int ret = 0, flags, do_defrag, skip = 0;
+	u32 cpos, phys_cpos, move_start, len_to_move, alloc_size;
+	u32 len_defraged = 0, defrag_thresh = 0, new_phys_cpos = 0;
+
+	struct inode *inode = context->inode;
+	struct ocfs2_dinode *di = (struct ocfs2_dinode *)di_bh->b_data;
+	struct ocfs2_move_extents *range = context->range;
+	struct ocfs2_super *osb = OCFS2_SB(inode->i_sb);
+
+	if ((inode->i_size == 0) || (range->me_len == 0))
+		return 0;
+
+	if (OCFS2_I(inode)->ip_dyn_features & OCFS2_INLINE_DATA_FL)
+		return 0;
+
+	context->refcount_loc = le64_to_cpu(di->i_refcount_loc);
+
+	ocfs2_init_dinode_extent_tree(&context->et, INODE_CACHE(inode), di_bh);
+	ocfs2_init_dealloc_ctxt(&context->dealloc);
+
+	/*
+	 * TO-DO XXX:
+	 *
+	 * - xattr extents.
+	 */
+
+	do_defrag = context->auto_defrag;
+
+	/*
+	 * extents are moved in units of clusters; for the sake of
+	 * simplicity, we may ignore the two partial clusters containing
+	 * 'byte_start' and 'byte_start + len'.
+	 */
+	move_start = ocfs2_clusters_for_bytes(osb->sb, range->me_start);
+	len_to_move = (range->me_start + range->me_len) >>
+						osb->s_clustersize_bits;
+	if (len_to_move >= move_start)
+		len_to_move -= move_start;
+	else
+		len_to_move = 0;
+
+	if (do_defrag)
+		defrag_thresh = range->me_thresh >> osb->s_clustersize_bits;
+	else
+		new_phys_cpos = ocfs2_blocks_to_clusters(inode->i_sb,
+							 range->me_goal);
+
+	mlog(0, "Inode: %llu, start: %llu, len: %llu, cstart: %u, clen: %u, "
+	     "thresh: %u\n",
+	     (unsigned long long)OCFS2_I(inode)->ip_blkno,
+	     (unsigned long long)range->me_start,
+	     (unsigned long long)range->me_len,
+	     move_start, len_to_move, defrag_thresh);
+
+	cpos = move_start;
+	while (len_to_move) {
+		ret = ocfs2_get_clusters(inode, cpos, &phys_cpos, &alloc_size,
+					 &flags);
+		if (ret) {
+			mlog_errno(ret);
+			goto out;
+		}
+
+		if (alloc_size > len_to_move)
+			alloc_size = len_to_move;
+
+		/*
+		 * XXX: how to deal with a hole:
+		 *
+		 * - skip the hole of course
+		 * - force a new defragmentation
+		 */
+		if (!phys_cpos) {
+			if (do_defrag)
+				len_defraged = 0;
+
+			goto next;
+		}
+
+		if (do_defrag) {
+			ocfs2_calc_extent_defrag_len(&alloc_size, &len_defraged,
+						     defrag_thresh, &skip);
+			/*
+			 * skip large extents
+			 */
+			if (skip) {
+				skip = 0;
+				goto next;
+			}
+
+			mlog(0, "#Defrag: cpos: %u, phys_cpos: %u, "
+			     "alloc_size: %u, len_defraged: %u\n",
+			     cpos, phys_cpos, alloc_size, len_defraged);
+
+			ret = ocfs2_defrag_extent(context, cpos, phys_cpos,
+						  alloc_size, flags);
+		} else {
+			ret = ocfs2_move_extent(context, cpos, phys_cpos,
+						&new_phys_cpos, alloc_size,
+						flags);
+
+			new_phys_cpos += alloc_size;
+		}
+
+		if (ret < 0) {
+			mlog_errno(ret);
+			goto out;
+		}
+
+		context->clusters_moved += alloc_size;
+next:
+		cpos += alloc_size;
+		len_to_move -= alloc_size;
+	}
+
+	range->me_flags |= OCFS2_MOVE_EXT_FL_COMPLETE;
+
+out:
+	range->me_moved_len = ocfs2_clusters_to_bytes(osb->sb,
+						      context->clusters_moved);
+	range->me_new_offset = ocfs2_clusters_to_bytes(osb->sb,
+						       context->new_phys_cpos);
+
+	ocfs2_schedule_truncate_log_flush(osb, 1);
+	ocfs2_run_deallocs(osb, &context->dealloc);
+
+	return ret;
+}
+
+static int ocfs2_move_extents(struct ocfs2_move_extents_context *context)
+{
+	int status;
+	handle_t *handle;
+	struct inode *inode = context->inode;
+	struct ocfs2_dinode *di;
+	struct buffer_head *di_bh = NULL;
+	struct ocfs2_super *osb = OCFS2_SB(inode->i_sb);
+
+	if (!inode)
+		return -ENOENT;
+
+	if (ocfs2_is_hard_readonly(osb) || ocfs2_is_soft_readonly(osb))
+		return -EROFS;
+
+	mutex_lock(&inode->i_mutex);
+
+	/*
+	 * This prevents concurrent writes from other nodes
+	 */
+	status = ocfs2_rw_lock(inode, 1);
+	if (status) {
+		mlog_errno(status);
+		goto out;
+	}
+
+	status = ocfs2_inode_lock(inode, &di_bh, 1);
+	if (status) {
+		mlog_errno(status);
+		goto out_rw_unlock;
+	}
+
+	/*
+	 * remember ip_xattr_sem also needs to be held if necessary
+	 */
+	down_write(&OCFS2_I(inode)->ip_alloc_sem);
+
+	status = __ocfs2_move_extents_range(di_bh, context);
+
+	up_write(&OCFS2_I(inode)->ip_alloc_sem);
+	if (status) {
+		mlog_errno(status);
+		goto out_inode_unlock;
+	}
+
+	/*
+	 * We update ctime for these changes
+	 */
+	handle = ocfs2_start_trans(osb, OCFS2_INODE_UPDATE_CREDITS);
+	if (IS_ERR(handle)) {
+		status = PTR_ERR(handle);
+		mlog_errno(status);
+		goto out_inode_unlock;
+	}
+
+	status = ocfs2_journal_access_di(handle, INODE_CACHE(inode), di_bh,
+					 OCFS2_JOURNAL_ACCESS_WRITE);
+	if (status) {
+		mlog_errno(status);
+		goto out_commit;
+	}
+
+	di = (struct ocfs2_dinode *)di_bh->b_data;
+	inode->i_ctime = CURRENT_TIME;
+	di->i_ctime = cpu_to_le64(inode->i_ctime.tv_sec);
+	di->i_ctime_nsec = cpu_to_le32(inode->i_ctime.tv_nsec);
+
+	ocfs2_journal_dirty(handle, di_bh);
+
+out_commit:
+	ocfs2_commit_trans(osb, handle);
+
+out_inode_unlock:
+	brelse(di_bh);
+	ocfs2_inode_unlock(inode, 1);
+out_rw_unlock:
+	ocfs2_rw_unlock(inode, 1);
+out:
+	mutex_unlock(&inode->i_mutex);
+
+	return status;
+}
+
+int ocfs2_ioctl_move_extents(struct file *filp, void __user *argp)
+{
+	int status;
+
+	struct inode *inode = filp->f_path.dentry->d_inode;
+	struct ocfs2_move_extents range;
+	struct ocfs2_move_extents_context *context = NULL;
+
+	status = mnt_want_write(filp->f_path.mnt);
+	if (status)
+		return status;
+
+	status = -EINVAL;
+
+	if ((!S_ISREG(inode->i_mode)) || !(filp->f_mode & FMODE_WRITE))
+		goto out;
+
+	if (inode->i_flags & (S_IMMUTABLE|S_APPEND)) {
+		status = -EPERM;
+		goto out;
+	}
+
+	context = kzalloc(sizeof(struct ocfs2_move_extents_context), GFP_NOFS);
+	if (!context) {
+		status = -ENOMEM;
+		mlog_errno(status);
+		goto out;
+	}
+
+	context->inode = inode;
+	context->file = filp;
+
+	if (!argp) {
+		memset(&range, 0, sizeof(range));
+		range.me_len = (u64)-1;
+		range.me_flags |= OCFS2_MOVE_EXT_FL_AUTO_DEFRAG;
+		context->auto_defrag = 1;
+	} else {
+		if (copy_from_user(&range, (struct ocfs2_move_extents *)argp,
+				   sizeof(range))) {
+			status = -EFAULT;
+			goto out;
+		}
+	}
+
+	if (range.me_start > i_size_read(inode))
+		goto out;
+
+	if (range.me_start + range.me_len > i_size_read(inode))
+		range.me_len = i_size_read(inode) - range.me_start;
+
+	context->range = &range;
+
+	if (range.me_flags & OCFS2_MOVE_EXT_FL_AUTO_DEFRAG) {
+		context->auto_defrag = 1;
+		if (!range.me_thresh)
+			/*
+			 * ok, the default threshold for the defragmentation
+			 * is 1M, since our maximum clustersize is 1M also.
+			 * any thoughts?
+			 */
+			range.me_thresh = 1024 * 1024;
+	} else {
+		/*
+		 * first best-effort attempt to validate and adjust the goal
+		 * (physical address in block), while it can't guarantee later
+		 * operation can succeed all the time since global_bitmap may
+		 * change a bit over time.
+		 */
+
+		status = ocfs2_validate_and_adjust_move_goal(inode, &range);
+		if (status)
+			goto out;
+	}
+
+	status = ocfs2_move_extents(context);
+	if (status)
+		mlog_errno(status);
+out:
+	/*
+	 * movement/defragmentation may end up being partially completed,
+	 * that's the reason why we need to return userspace the finished
+	 * length and new_offset even if failure happens somewhere.
+	 */
+	if (argp) {
+		if (copy_to_user((struct ocfs2_move_extents *)argp, &range,
+				sizeof(range)))
+			status = -EFAULT;
+	}
+
+	kfree(context);
+
+	mnt_drop_write(filp->f_path.mnt);
+
+	return status;
+}
diff --git a/fs/ocfs2/move_extents.h b/fs/ocfs2/move_extents.h
index eba4044..d9f0f29 100644
--- a/fs/ocfs2/move_extents.h
+++ b/fs/ocfs2/move_extents.h
@@ -17,4 +17,6 @@
 #ifndef OCFS2_MOVE_EXTENTS_H
 #define OCFS2_MOVE_EXTENTS_H
 
+int ocfs2_ioctl_move_extents(struct file *filp, void __user *argp);
+
 #endif /* OCFS2_MOVE_EXTENTS_H */
-- 
1.5.5
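
The range clamp near the end of ocfs2_ioctl_move_extents() computes
range.me_start + range.me_len. If both fields are unsigned 64-bit values (as
the (u64)-1 assignment suggests), that sum can wrap around for a nonzero
me_start and a very large me_len. Below is a minimal userspace sketch of an
overflow-safe variant; clamp_range() is a hypothetical helper for
illustration, not part of the patch.

```c
#include <stdint.h>

/*
 * Hypothetical userspace model of the range check in
 * ocfs2_ioctl_move_extents(); clamp_range() is not a kernel function.
 * Returns -1 if start lies beyond i_size, otherwise 0 with *len clamped
 * so that the range ends no later than i_size.
 */
static int clamp_range(uint64_t start, uint64_t *len, uint64_t i_size)
{
	if (start > i_size)
		return -1;

	/*
	 * Compare against the remaining bytes instead of computing
	 * start + *len, which can wrap around for a huge *len.
	 */
	if (*len > i_size - start)
		*len = i_size - start;

	return 0;
}
```

With start = 4096, *len = UINT64_MAX and i_size = 1 MiB, the naive sum wraps
to 4095 and the length would be left unclamped; the subtraction form clamps
it to the bytes remaining after start.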


* [Ocfs2-devel] [PATCH 01/14] Ocfs2/refcounttree: Fix a bug for refcounttree to writeback clusters in a right number.
  2011-01-21 10:20 ` [Ocfs2-devel] [PATCH 01/14] Ocfs2/refcounttree: Fix a bug for refcounttree to writeback clusters in a right number Tristan Ye
@ 2011-01-27 23:29   ` Mark Fasheh
  2011-01-28  1:43     ` Tristan Ye
  2011-02-20 10:40   ` Joel Becker
  1 sibling, 1 reply; 22+ messages in thread
From: Mark Fasheh @ 2011-01-27 23:29 UTC (permalink / raw)
  To: ocfs2-devel

On Fri, Jan 21, 2011 at 06:20:18PM +0800, Tristan Ye wrote:
> Current refcounttree codes actually didn't writeback the new pages out in
> write-back mode, due to a bug of always passing a ZERO number of clusters
> to 'ocfs2_cow_sync_writeback', the patch tries to pass a proper one in.
> 
> Signed-off-by: Tristan Ye <tristan.ye@oracle.com>

Seems to look reasonable to me, though Joel or Tao might want to give their
own quick SOB too.

Joel: I think we want this particular patch to go upstream sooner than
later, yes?

Signed-off-by: Mark Fasheh <mfasheh@suse.com>
	--Mark

--
Mark Fasheh


* [Ocfs2-devel] [PATCH 02/14] Ocfs2/refcounttree: Publicate couple of funcs from refcounttree.c
  2011-01-21 10:20 ` [Ocfs2-devel] [PATCH 02/14] Ocfs2/refcounttree: Publicate couple of funcs from refcounttree.c Tristan Ye
@ 2011-01-28  0:05   ` Mark Fasheh
  2011-01-28  1:46     ` Tristan Ye
  0 siblings, 1 reply; 22+ messages in thread
From: Mark Fasheh @ 2011-01-28  0:05 UTC (permalink / raw)
  To: ocfs2-devel

On Fri, Jan 21, 2011 at 06:20:19PM +0800, Tristan Ye wrote:
> The original goal of commonizing these funcs is to benefit defragging/extent_moving
> code in the future, based on the fact that reflink and defragmentation share
> the same Copy-On-Write mechanism.
> 

This looks reasonable. You might want to note the change from struct
"ocfs2_cow_context *" in some parameters to struct "file *" though.
	--Mark

--
Mark Fasheh


* [Ocfs2-devel] [PATCH 06/14] Ocfs2/move_extents: move a range of extent.
  2011-01-21 10:20 ` [Ocfs2-devel] [PATCH 06/14] Ocfs2/move_extents: move a range of extent Tristan Ye
@ 2011-01-28  1:10   ` Mark Fasheh
  2011-01-28  1:51     ` Tristan Ye
  0 siblings, 1 reply; 22+ messages in thread
From: Mark Fasheh @ 2011-01-28  1:10 UTC (permalink / raw)
  To: ocfs2-devel

On Fri, Jan 21, 2011 at 06:20:23PM +0800, Tristan Ye wrote:
> The moving range of __ocfs2_move_extent() is always within one extent; it
> consists of the following parts:
> 
> 1. Duplicate the clusters in pages to new_blkoffset, where the extent is to be moved.
> 
> 2. Split the original extent with the new extent, coalescing nearby extents if possible.
> 
> 3. Append the old clusters to the truncate log, or decrease the refcount if the extent was refcounted.
> 
> Signed-off-by: Tristan Ye <tristan.ye@oracle.com>
> ---
>  fs/ocfs2/move_extents.c |  104 +++++++++++++++++++++++++++++++++++++++++++++++
>  1 files changed, 104 insertions(+), 0 deletions(-)
> 
> diff --git a/fs/ocfs2/move_extents.c b/fs/ocfs2/move_extents.c
> index 9b30636..e28bd7d 100644
> --- a/fs/ocfs2/move_extents.c
> +++ b/fs/ocfs2/move_extents.c
> @@ -56,6 +56,110 @@ struct ocfs2_move_extents_context {
>  	struct ocfs2_cached_dealloc_ctxt dealloc;
>  };
>  
> +static int __ocfs2_move_extent(handle_t *handle,
> +			       struct ocfs2_move_extents_context *context,
> +			       u32 cpos, u32 len, u32 p_cpos, u32 new_p_cpos,
> +			       int ext_flags)
> +{
> +	int ret = 0, index;
> +	struct inode *inode = context->inode;
> +	struct ocfs2_super *osb = OCFS2_SB(inode->i_sb);
> +	struct ocfs2_extent_rec *rec, replace_rec;
> +	struct ocfs2_path *path = NULL;
> +	struct ocfs2_extent_list *el;
> +	u64 ino = ocfs2_metadata_cache_owner(context->et.et_ci);
> +	u64 old_blkno = ocfs2_clusters_to_blocks(inode->i_sb, p_cpos);
> +
> +	ret = ocfs2_duplicate_clusters_by_page(handle, context->file, cpos,
> +					       p_cpos, new_p_cpos, len);
> +	if (ret) {
> +		mlog_errno(ret);
> +		goto out;
> +	}
> +
> +	memset(&replace_rec, 0, sizeof(replace_rec));
> +	replace_rec.e_cpos = cpu_to_le32(cpos);
> +	replace_rec.e_leaf_clusters = cpu_to_le16(len);
> +	replace_rec.e_blkno = cpu_to_le64(ocfs2_clusters_to_blocks(inode->i_sb,
> +								   new_p_cpos));
> +
> +	path = ocfs2_new_path_from_et(&context->et);
> +	if (!path) {
> +		ret = -ENOMEM;
> +		mlog_errno(ret);
> +		goto out;
> +	}
> +
> +	ret = ocfs2_find_path(INODE_CACHE(inode), path, cpos);
> +	if (ret) {
> +		mlog_errno(ret);
> +		goto out;
> +	}
> +
> +	el = path_leaf_el(path);
> +
> +	index = ocfs2_search_extent_list(el, cpos);
> +	if (index == -1 || index >= le16_to_cpu(el->l_next_free_rec)) {
> +		ocfs2_error(inode->i_sb,
> +			    "Inode %llu has an extent at cpos %u which can no "
> +			    "longer be found.\n",
> +			    (unsigned long long)ino, cpos);
> +		ret = -EROFS;
> +		goto out;
> +	}
> +
> +	rec = &el->l_recs[index];
> +
> +	BUG_ON(ext_flags != rec->e_flags);
> +	/*
> +	 * after moving/defraging to new location, the extent is not going
> +	 * to be refcounted anymore.
> +	 */
> +	if (ext_flags & OCFS2_EXT_REFCOUNTED)
> +		replace_rec.e_flags = ext_flags & ~OCFS2_EXT_REFCOUNTED;
> +	else
> +		replace_rec.e_flags = ext_flags;

You can remove the if statement here and just leave it as:

	replace_rec.e_flags = ext_flags & ~OCFS2_EXT_REFCOUNTED;

which will do the same thing.
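
To see why the two forms are equivalent, here is a small standalone sketch.
The flag value and the helper names are assumptions made for illustration
only; the real OCFS2_EXT_REFCOUNTED definition lives in ocfs2_fs.h.

```c
#include <stdint.h>

/* Assumed value for illustration; see ocfs2_fs.h for the real definition. */
#define EXT_REFCOUNTED 0x02

/* Form used in the patch: clear the bit only when it is set. */
static uint8_t clear_refcounted_branch(uint8_t ext_flags)
{
	if (ext_flags & EXT_REFCOUNTED)
		return (uint8_t)(ext_flags & ~EXT_REFCOUNTED);
	return ext_flags;
}

/* Suggested form: clear the bit unconditionally. */
static uint8_t clear_refcounted_mask(uint8_t ext_flags)
{
	return (uint8_t)(ext_flags & ~EXT_REFCOUNTED);
}
```

Masking out a bit that is already clear leaves the value unchanged, so the
branch adds nothing: both helpers return the same result for every input.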
	--Mark

--
Mark Fasheh


* [Ocfs2-devel] [PATCH 01/14] Ocfs2/refcounttree: Fix a bug for refcounttree to writeback clusters in a right number.
  2011-01-27 23:29   ` Mark Fasheh
@ 2011-01-28  1:43     ` Tristan Ye
  0 siblings, 0 replies; 22+ messages in thread
From: Tristan Ye @ 2011-01-28  1:43 UTC (permalink / raw)
  To: ocfs2-devel

Mark Fasheh wrote:
> On Fri, Jan 21, 2011 at 06:20:18PM +0800, Tristan Ye wrote:
>> Current refcounttree codes actually didn't writeback the new pages out in
>> write-back mode, due to a bug of always passing a ZERO number of clusters
>> to 'ocfs2_cow_sync_writeback', the patch tries to pass a proper one in.
>>
>> Signed-off-by: Tristan Ye <tristan.ye@oracle.com>
>
> Seems to look reasonable to me, though Joel or Tao might want to give their
> own quick SOB too.

Mark,

    Thanks for your quick review;-)

>
> Joel: I think we want this particular patch to go upstream sooner than
> later, yes?

Yep, it's more of a stand-alone fix for reflink than a part of the
defragmentation patches.

So it would be better to have this in the next 'fixes' commit.


Tristan.

>
> Signed-off-by: Mark Fasheh <mfasheh@suse.com>
> 	--Mark
>
> --
> Mark Fasheh


* [Ocfs2-devel] [PATCH 02/14] Ocfs2/refcounttree: Publicate couple of funcs from refcounttree.c
  2011-01-28  0:05   ` Mark Fasheh
@ 2011-01-28  1:46     ` Tristan Ye
  0 siblings, 0 replies; 22+ messages in thread
From: Tristan Ye @ 2011-01-28  1:46 UTC (permalink / raw)
  To: ocfs2-devel

Mark Fasheh wrote:
> On Fri, Jan 21, 2011 at 06:20:19PM +0800, Tristan Ye wrote:
>   
>> The original goal of commonizing these funcs is to benefit defragging/extent_moving
>> code in the future, based on the fact that reflink and defragmentation share
>> the same Copy-On-Write mechanism.
>>
>>     
>
> This looks reasonable. You might want to note the change from struct
> "ocfs2_cow_context *" in some parameters to struct "file *" though.
> 	--Mark
>
> --
> Mark Fasheh
>   

Yeah, it would be better for the commit message to be that descriptive.


* [Ocfs2-devel] [PATCH 06/14] Ocfs2/move_extents: move a range of extent.
  2011-01-28  1:10   ` Mark Fasheh
@ 2011-01-28  1:51     ` Tristan Ye
  0 siblings, 0 replies; 22+ messages in thread
From: Tristan Ye @ 2011-01-28  1:51 UTC (permalink / raw)
  To: ocfs2-devel

Mark Fasheh wrote:
> On Fri, Jan 21, 2011 at 06:20:23PM +0800, Tristan Ye wrote:
>> The moving range of __ocfs2_move_extent() is always within one extent; it
>> consists of the following parts:
>>
>> 1. Duplicate the clusters in pages to new_blkoffset, where the extent is to be moved.
>>
>> 2. Split the original extent with the new extent, coalescing nearby extents if possible.
>>
>> 3. Append the old clusters to the truncate log, or decrease the refcount if the extent was refcounted.
>>
>> Signed-off-by: Tristan Ye <tristan.ye@oracle.com>
>> ---
>>  fs/ocfs2/move_extents.c |  104 +++++++++++++++++++++++++++++++++++++++++++++++
>>  1 files changed, 104 insertions(+), 0 deletions(-)
>>
>> diff --git a/fs/ocfs2/move_extents.c b/fs/ocfs2/move_extents.c
>> index 9b30636..e28bd7d 100644
>> --- a/fs/ocfs2/move_extents.c
>> +++ b/fs/ocfs2/move_extents.c
>> @@ -56,6 +56,110 @@ struct ocfs2_move_extents_context {
>>  	struct ocfs2_cached_dealloc_ctxt dealloc;
>>  };
>>  
>> +static int __ocfs2_move_extent(handle_t *handle,
>> +			       struct ocfs2_move_extents_context *context,
>> +			       u32 cpos, u32 len, u32 p_cpos, u32 new_p_cpos,
>> +			       int ext_flags)
>> +{
>> +	int ret = 0, index;
>> +	struct inode *inode = context->inode;
>> +	struct ocfs2_super *osb = OCFS2_SB(inode->i_sb);
>> +	struct ocfs2_extent_rec *rec, replace_rec;
>> +	struct ocfs2_path *path = NULL;
>> +	struct ocfs2_extent_list *el;
>> +	u64 ino = ocfs2_metadata_cache_owner(context->et.et_ci);
>> +	u64 old_blkno = ocfs2_clusters_to_blocks(inode->i_sb, p_cpos);
>> +
>> +	ret = ocfs2_duplicate_clusters_by_page(handle, context->file, cpos,
>> +					       p_cpos, new_p_cpos, len);
>> +	if (ret) {
>> +		mlog_errno(ret);
>> +		goto out;
>> +	}
>> +
>> +	memset(&replace_rec, 0, sizeof(replace_rec));
>> +	replace_rec.e_cpos = cpu_to_le32(cpos);
>> +	replace_rec.e_leaf_clusters = cpu_to_le16(len);
>> +	replace_rec.e_blkno = cpu_to_le64(ocfs2_clusters_to_blocks(inode->i_sb,
>> +								   new_p_cpos));
>> +
>> +	path = ocfs2_new_path_from_et(&context->et);
>> +	if (!path) {
>> +		ret = -ENOMEM;
>> +		mlog_errno(ret);
>> +		goto out;
>> +	}
>> +
>> +	ret = ocfs2_find_path(INODE_CACHE(inode), path, cpos);
>> +	if (ret) {
>> +		mlog_errno(ret);
>> +		goto out;
>> +	}
>> +
>> +	el = path_leaf_el(path);
>> +
>> +	index = ocfs2_search_extent_list(el, cpos);
>> +	if (index == -1 || index >= le16_to_cpu(el->l_next_free_rec)) {
>> +		ocfs2_error(inode->i_sb,
>> +			    "Inode %llu has an extent at cpos %u which can no "
>> +			    "longer be found.\n",
>> +			    (unsigned long long)ino, cpos);
>> +		ret = -EROFS;
>> +		goto out;
>> +	}
>> +
>> +	rec = &el->l_recs[index];
>> +
>> +	BUG_ON(ext_flags != rec->e_flags);
>> +	/*
>> +	 * after moving/defraging to new location, the extent is not going
>> +	 * to be refcounted anymore.
>> +	 */
>> +	if (ext_flags & OCFS2_EXT_REFCOUNTED)
>> +		replace_rec.e_flags = ext_flags & ~OCFS2_EXT_REFCOUNTED;
>> +	else
>> +		replace_rec.e_flags = ext_flags;
>
> You can remove the if statement here and just leave it as:
>
> 	replace_rec.e_flags = ext_flags & ~OCFS2_EXT_REFCOUNTED;

Definitely, I love this optimization ;-)

>
> which will do the same thing.
> 	--Mark
>
> --
> Mark Fasheh


* [Ocfs2-devel] [PATCH 01/14] Ocfs2/refcounttree: Fix a bug for refcounttree to writeback clusters in a right number.
  2011-01-21 10:20 ` [Ocfs2-devel] [PATCH 01/14] Ocfs2/refcounttree: Fix a bug for refcounttree to writeback clusters in a right number Tristan Ye
  2011-01-27 23:29   ` Mark Fasheh
@ 2011-02-20 10:40   ` Joel Becker
  1 sibling, 0 replies; 22+ messages in thread
From: Joel Becker @ 2011-02-20 10:40 UTC (permalink / raw)
  To: ocfs2-devel

On Fri, Jan 21, 2011 at 06:20:18PM +0800, Tristan Ye wrote:
> Current refcounttree codes actually didn't writeback the new pages out in
> write-back mode, due to a bug of always passing a ZERO number of clusters
> to 'ocfs2_cow_sync_writeback', the patch tries to pass a proper one in.
> 
> Signed-off-by: Tristan Ye <tristan.ye@oracle.com>

	This patch is now in the fixes branch of ocfs2.git.

Joel

-- 

Life's Little Instruction Book #232

	"Keep your promises."

			http://www.jlbec.org/
			jlbec at evilplan.org


end of thread, other threads:[~2011-02-20 10:40 UTC | newest]

Thread overview: 22+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2011-01-21 10:20 [Ocfs2-devel] [PATCH 0/14] Ocfs2: Online defragmentaion V3 Tristan Ye
2011-01-21 10:20 ` [Ocfs2-devel] [PATCH 01/14] Ocfs2/refcounttree: Fix a bug for refcounttree to writeback clusters in a right number Tristan Ye
2011-01-27 23:29   ` Mark Fasheh
2011-01-28  1:43     ` Tristan Ye
2011-02-20 10:40   ` Joel Becker
2011-01-21 10:20 ` [Ocfs2-devel] [PATCH 02/14] Ocfs2/refcounttree: Publicate couple of funcs from refcounttree.c Tristan Ye
2011-01-28  0:05   ` Mark Fasheh
2011-01-28  1:46     ` Tristan Ye
2011-01-21 10:20 ` [Ocfs2-devel] [PATCH 03/14] Ocfs2/move_extents: Adding new ioctl code 'OCFS2_IOC_MOVE_EXT' to ocfs2 Tristan Ye
2011-01-21 10:20 ` [Ocfs2-devel] [PATCH 04/14] Ocfs2/move_extents: Add basic framework and source files for extent moving Tristan Ye
2011-01-21 10:20 ` [Ocfs2-devel] [PATCH 05/14] Ocfs2/move_extents: lock allocators and reserve metadata blocks and data clusters for extents moving Tristan Ye
2011-01-21 10:20 ` [Ocfs2-devel] [PATCH 06/14] Ocfs2/move_extents: move a range of extent Tristan Ye
2011-01-28  1:10   ` Mark Fasheh
2011-01-28  1:51     ` Tristan Ye
2011-01-21 10:20 ` [Ocfs2-devel] [PATCH 07/14] Ocfs2/move_extents: defrag " Tristan Ye
2011-01-21 10:20 ` [Ocfs2-devel] [PATCH 08/14] Ocfs2/move_extents: find the victim alloc group, where the given #blk fits Tristan Ye
2011-01-21 10:20 ` [Ocfs2-devel] [PATCH 09/14] Ocfs2/move_extents: helper to validate and adjust moving goal Tristan Ye
2011-01-21 10:20 ` [Ocfs2-devel] [PATCH 10/14] Ocfs2/move_extents: helper to probe a proper region to move in an alloc group Tristan Ye
2011-01-21 10:20 ` [Ocfs2-devel] [PATCH 11/14] Ocfs2/move_extents: helpers to update the group descriptor and global bitmap inode Tristan Ye
2011-01-21 10:20 ` [Ocfs2-devel] [PATCH 12/14] Ocfs2/move_extents: move entire/partial extent Tristan Ye
2011-01-21 10:20 ` [Ocfs2-devel] [PATCH 13/14] Ocfs2/move_extents: helper to calculate the defraging length in one run Tristan Ye
2011-01-21 10:20 ` [Ocfs2-devel] [PATCH 14/14] Ocfs2/move_extents: move/defrag extents within a certain range Tristan Ye
