* New resize interface implementation
@ 2011-08-11  3:28 Yongqiang Yang
  2011-08-11  3:28 ` [PATCH 01/13] ext4: add a function which extends a group without checking parameters Yongqiang Yang
                   ` (13 more replies)
  0 siblings, 14 replies; 25+ messages in thread
From: Yongqiang Yang @ 2011-08-11  3:28 UTC (permalink / raw)
  To: linux-ext4; +Cc: aedilger, tytso

Hi all,

This patch series adds a new online resize implementation to ext4.

-- What is the new resize implementation?
   It is a new online resize interface for ext4.  It is used via an
   ioctl, EXT4_IOC_RESIZE_FS, which takes a 64-bit integer giving the
   size of the resized fs in blocks.
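
   A minimal user-space sketch of how such an ioctl might be called.
   The request number below is an assumption (it matches the value
   _IOW('f', 16, __u64) that I believe fs/ext4/ext4.h uses upstream;
   this cover letter does not spell it out), and ext4_online_resize()
   is a hypothetical helper name, not part of this series.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdint.h>
#include <sys/ioctl.h>
#include <unistd.h>

/* Assumption: request number as I believe it is defined upstream. */
#ifndef EXT4_IOC_RESIZE_FS
#define EXT4_IOC_RESIZE_FS _IOW('f', 16, uint64_t)
#endif

/*
 * Hypothetical helper: ask the kernel to grow the ext4 fs mounted at
 * @mnt to @n_blocks blocks.  Returns 0 on success, or -1 with errno
 * set on failure.
 */
static int ext4_online_resize(const char *mnt, uint64_t n_blocks)
{
	int fd, ret, saved_errno;

	assert(mnt != 0);

	fd = open(mnt, O_RDONLY);
	if (fd < 0)
		return -1;

	/* The argument is a pointer to the new size in fs blocks. */
	ret = ioctl(fd, EXT4_IOC_RESIZE_FS, &n_blocks);

	/* close() may clobber errno, so preserve the ioctl's value. */
	saved_errno = errno;
	close(fd);
	errno = saved_errno;

	return ret < 0 ? -1 : 0;
}
```

   A caller would pass the mount point and the target size in blocks,
   for example ext4_online_resize("/mnt", new_size_in_blocks).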

-- Difference between current resize and new resize.
   New resize lets the kernel do all the work, such as allocating
   bitmaps and inode tables, and it supports the flex_bg and
   BLOCK_UNINIT features.  It is also much faster than current resize.

   Below are benchmarks I ran on my personal computer.  Filesystems
   with flex_bg size = 16 were resized to 230GB every time.  The first
   row shows the size the fs was resized from.  The data were
   collected with 'time resize2fs'.

                      new resize
                20GB          50GB      100GB
      real    0m3.558s     0m2.891s    0m0.394s
      user    0m0.004s     0m0.000s    0m0.394s
      sys     0m0.048s     0m0.048s    0m0.028s

                      current resize
                20GB          50GB      100GB
      real    5m2.770s     4m43.757s  3m14.840s
      user    0m0.040s     0m0.032s   0m0.024s
      sys     0m0.464s     0m0.432s   0m0.324s

   According to the data above, new resize is faster than current resize:
   real time drops from minutes to seconds, and sys time is roughly a
   tenth.  New resize performs well in sys time because it supports
   BLOCK_UNINIT and adds multiple groups each time.

-- About supporting new features.
   YES! New resize can easily support new features like bigalloc and
   exclude bitmap, because it lets the kernel do all the work.

[PATCH 01/13] ext4: add a function which extends a group without checking parameters
[PATCH 02/13] ext4: add a function which adds a new desc to a fs
[PATCH 03/13] ext4: add a function which sets up a new group desc
[PATCH 04/13] ext4: add a function which updates super block
[PATCH 05/13] ext4: add a structure which will be used by 64bit-resize interface
[PATCH 06/13] ext4: add a function which sets up group blocks of a flex groups
[PATCH 07/13] ext4: add a function which adds several group descriptors
[PATCH 08/13] ext4: add a function which sets up a flex groups each time
[PATCH 09/13] ext4: enable ext4_update_super() to handle a flex groups
[PATCH 10/13] ext4: pass verify_reserved_gdb() the number of group decriptors
[PATCH 11/13] ext4: add a new function which allocates bitmaps and
[PATCH 12/13] ext4: add a new function which adds a flex group to a
[PATCH 13/13] ext4: add new online resize interface



* [PATCH 01/13] ext4: add a function which extends a group without checking parameters
  2011-08-11  3:28 New resize interface implementation Yongqiang Yang
@ 2011-08-11  3:28 ` Yongqiang Yang
  2011-08-11  5:47   ` Andreas Dilger
  2011-08-11  3:28 ` [PATCH 02/13] ext4: add a function which adds a new desc to a fs Yongqiang Yang
                   ` (12 subsequent siblings)
  13 siblings, 1 reply; 25+ messages in thread
From: Yongqiang Yang @ 2011-08-11  3:28 UTC (permalink / raw)
  To: linux-ext4; +Cc: aedilger, tytso, Yongqiang Yang

This patch adds a function named __ext4_group_extend(), whose code
is copied from ext4_group_extend().  __ext4_group_extend() assumes
that its parameters are valid and have been checked by the caller.

__ext4_group_extend() will be used by the new resize implementation.
It could also be used by ext4_group_extend(), but this patch series
does not do that.

Signed-off-by: Yongqiang Yang <xiaoqiangnk@gmail.com>
---
 fs/ext4/resize.c |   53 +++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 files changed, 53 insertions(+), 0 deletions(-)

diff --git a/fs/ext4/resize.c b/fs/ext4/resize.c
index 707d3f1..6ffbdb6 100644
--- a/fs/ext4/resize.c
+++ b/fs/ext4/resize.c
@@ -969,6 +969,59 @@ exit_put:
 } /* ext4_group_add */
 
 /*
+ * Extend a group; the caller must already have checked the parameters.
+ */
+static int __ext4_group_extend(struct super_block *sb,
+			       ext4_fsblk_t o_blocks_count, ext4_grpblk_t add)
+{
+	struct ext4_super_block *es = EXT4_SB(sb)->s_es;
+	handle_t *handle;
+	int err = 0, err2;
+
+	/* We will update the superblock, one block bitmap, and
+	 * one group descriptor via ext4_group_add_blocks().
+	 */
+	handle = ext4_journal_start_sb(sb, 3);
+	if (IS_ERR(handle)) {
+		err = PTR_ERR(handle);
+		ext4_warning(sb, "error %d on journal start", err);
+		goto out;
+	}
+
+	err = ext4_journal_get_write_access(handle, EXT4_SB(sb)->s_sbh);
+	if (err) {
+		ext4_warning(sb, "error %d on journal write access", err);
+		ext4_journal_stop(handle);
+		goto out;
+	}
+
+	ext4_blocks_count_set(es, o_blocks_count + add);
+	ext4_debug("freeing blocks %llu through %llu\n", o_blocks_count,
+		   o_blocks_count + add);
+	/* We add the blocks to the bitmap and set the group need init bit */
+	err = ext4_group_add_blocks(handle, sb, o_blocks_count, add);
+	if (err)
+		goto exit_journal;
+	ext4_handle_dirty_super(handle, sb);
+	ext4_debug("freed blocks %llu through %llu\n", o_blocks_count,
+		   o_blocks_count + add);
+exit_journal:
+	err2 = ext4_journal_stop(handle);
+	if (err2 && !err)
+		err = err2;
+
+	if (!err) {
+		if (test_opt(sb, DEBUG))
+			printk(KERN_DEBUG "EXT4-fs: extended group to %llu "
+			       "blocks\n", ext4_blocks_count(es));
+		update_backups(sb, EXT4_SB(sb)->s_sbh->b_blocknr, (char *)es,
+			       sizeof(struct ext4_super_block));
+	}
+out:
+	return err;
+}
+
+/*
  * Extend the filesystem to the new number of blocks specified.  This entry
  * point is only used to extend the current filesystem to the end of the last
  * existing group.  It can be accessed via ioctl, or by "remount,resize=<size>"
-- 
1.7.5.1
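
The split described in this patch, an unchecked worker behind a checking
entry point, can be sketched in user space as follows.  This is a toy
model under stated assumptions: struct toy_fs, both function names, and
the error codes chosen are all hypothetical and are not taken from the
kernel code.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Toy stand-in for the bits of state the real code keeps in the sb. */
struct toy_fs {
	uint64_t blocks_count;	/* current fs size in blocks */
	uint64_t group_end;	/* first block past the last existing group */
};

/*
 * Unchecked worker, in the spirit of __ext4_group_extend(): it trusts
 * that the caller has validated @o_blocks_count and @add.
 */
static int toy_group_extend_unchecked(struct toy_fs *fs,
				      uint64_t o_blocks_count, uint64_t add)
{
	/* Debug-only sanity check; not a substitute for validation. */
	assert(o_blocks_count == fs->blocks_count);

	fs->blocks_count = o_blocks_count + add;
	return 0;
}

/* Checking entry point: validates, then delegates to the worker. */
static int toy_group_extend(struct toy_fs *fs, uint64_t n_blocks_count)
{
	if (n_blocks_count < fs->blocks_count)
		return -EINVAL;	/* toy choice: shrinking is rejected */
	if (n_blocks_count > fs->group_end)
		return -ENOSPC;	/* toy choice: cannot pass the group end */
	if (n_blocks_count == fs->blocks_count)
		return 0;	/* nothing to do */

	return toy_group_extend_unchecked(fs, fs->blocks_count,
					  n_blocks_count - fs->blocks_count);
}
```

A second caller that has already validated its input (as the new resize
path is said to do) can call the unchecked worker directly and skip the
redundant checks.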



* [PATCH 02/13] ext4: add a function which adds a new desc to a fs
  2011-08-11  3:28 New resize interface implementation Yongqiang Yang
  2011-08-11  3:28 ` [PATCH 01/13] ext4: add a function which extends a group without checking parameters Yongqiang Yang
@ 2011-08-11  3:28 ` Yongqiang Yang
  2011-08-11  5:49   ` Andreas Dilger
  2011-08-11  3:28 ` [PATCH 03/13] ext4: add a function which sets up a new group desc Yongqiang Yang
                   ` (11 subsequent siblings)
  13 siblings, 1 reply; 25+ messages in thread
From: Yongqiang Yang @ 2011-08-11  3:28 UTC (permalink / raw)
  To: linux-ext4; +Cc: aedilger, tytso, Yongqiang Yang

This patch adds a function named ext4_add_new_desc(), which adds
a new group descriptor to a fs.  Its code is copied from
ext4_group_add().

The function will be used by the new resize implementation.

Signed-off-by: Yongqiang Yang <xiaoqiangnk@gmail.com>
---
 fs/ext4/resize.c |   42 ++++++++++++++++++++++++++++++++++++++++++
 1 files changed, 42 insertions(+), 0 deletions(-)

diff --git a/fs/ext4/resize.c b/fs/ext4/resize.c
index 6ffbdb6..4fcd515 100644
--- a/fs/ext4/resize.c
+++ b/fs/ext4/resize.c
@@ -735,6 +735,48 @@ exit_err:
 	}
 }
 
+/*
+ * ext4_add_new_desc() adds the group descriptor of group @group
+ *
+ * @handle: journal handle
+ * @sb: super block
+ * @group: the group no. of the group desc to be added
+ * @resize_inode: the resize inode
+ */
+static int ext4_add_new_desc(handle_t *handle, struct super_block *sb,
+			     ext4_group_t group, struct inode *resize_inode)
+{
+	struct ext4_sb_info *sbi = EXT4_SB(sb);
+	struct ext4_super_block *es = sbi->s_es;
+	struct buffer_head *gdb_bh;
+	int gdb_off, gdb_num, err = 0;
+	int reserved_gdb = ext4_bg_has_super(sb, group) ?
+		le16_to_cpu(es->s_reserved_gdt_blocks) : 0;
+
+	gdb_off = group % EXT4_DESC_PER_BLOCK(sb);
+	gdb_num = group / EXT4_DESC_PER_BLOCK(sb);
+
+	/*
+	 * We will only either add reserved group blocks to a backup group
+	 * or remove reserved blocks for the first group in a new group block.
+	 * Doing both would mean more complex code, and sane people don't
+	 * use non-sparse filesystems anymore.  This is already checked above.
+	 */
+	if (gdb_off) {
+		gdb_bh = sbi->s_group_desc[gdb_num];
+		err = ext4_journal_get_write_access(handle, gdb_bh);
+		if (err)
+			goto out;
+
+		if (reserved_gdb && ext4_bg_num_gdb(sb, group))
+			err = reserve_backup_gdb(handle, resize_inode, group);
+	} else
+		err = add_new_gdb(handle, resize_inode, group);
+
+out:
+	return err;
+}
+
 /* Add group descriptor data to an existing or new group descriptor block.
  * Ensure we handle all possible error conditions _before_ we start modifying
  * the filesystem, because we cannot abort the transaction and not have it
-- 
1.7.5.1
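
The gdb_num/gdb_off arithmetic in ext4_add_new_desc() can be tried out
in isolation.  A minimal sketch, assuming the number of descriptors per
block is simply block size divided by descriptor size; the struct and
function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Where a group's descriptor lives in the group descriptor table. */
struct toy_gd_location {
	uint32_t gdb_num;	/* index of the descriptor block */
	uint32_t gdb_off;	/* descriptor slot within that block */
};

static struct toy_gd_location toy_locate_desc(uint32_t group,
					      uint32_t block_size,
					      uint32_t desc_size)
{
	struct toy_gd_location loc;
	uint32_t per_block;

	assert(desc_size != 0 && block_size >= desc_size);

	/* Plays the role of EXT4_DESC_PER_BLOCK(sb). */
	per_block = block_size / desc_size;
	loc.gdb_num = group / per_block;
	loc.gdb_off = group % per_block;
	return loc;
}

/*
 * Slot 0 means the group is the first one in a descriptor block that
 * does not exist yet, which is the case the patch hands to
 * add_new_gdb(); any other slot reuses an existing block.
 */
static int toy_needs_new_gdb(struct toy_gd_location loc)
{
	return loc.gdb_off == 0;
}
```

With 4096-byte blocks and 32-byte descriptors there are 128 descriptors
per block, so group 130 lands in descriptor block 1 at slot 2, while
group 128 is the first slot of a new descriptor block.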



* [PATCH 03/13] ext4: add a function which sets up a new group desc
  2011-08-11  3:28 New resize interface implementation Yongqiang Yang
  2011-08-11  3:28 ` [PATCH 01/13] ext4: add a function which extends a group without checking parameters Yongqiang Yang
  2011-08-11  3:28 ` [PATCH 02/13] ext4: add a function which adds a new desc to a fs Yongqiang Yang
@ 2011-08-11  3:28 ` Yongqiang Yang
  2011-08-11  6:42   ` Andreas Dilger
  2011-08-11  3:28 ` [PATCH 04/13] ext4: add a function which updates super block Yongqiang Yang
                   ` (10 subsequent siblings)
  13 siblings, 1 reply; 25+ messages in thread
From: Yongqiang Yang @ 2011-08-11  3:28 UTC (permalink / raw)
  To: linux-ext4; +Cc: aedilger, tytso, Yongqiang Yang

This patch adds a function named ext4_setup_new_desc(), which sets
up a new group descriptor.  Its code is copied from ext4_group_add().

The function will be used by the new resize implementation.

Signed-off-by: Yongqiang Yang <xiaoqiangnk@gmail.com>
---
 fs/ext4/resize.c |   54 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 files changed, 54 insertions(+), 0 deletions(-)

diff --git a/fs/ext4/resize.c b/fs/ext4/resize.c
index 4fcd515..6320baa 100644
--- a/fs/ext4/resize.c
+++ b/fs/ext4/resize.c
@@ -777,6 +777,60 @@ out:
 	return err;
 }
 
+/*
+ * ext4_setup_new_desc() sets up group descriptors specified by @input.
+ *
+ * @handle: journal handle
+ * @sb: super block
+ */
+static int ext4_setup_new_desc(handle_t *handle, struct super_block *sb,
+			       struct ext4_new_group_data *input)
+{
+	struct ext4_sb_info *sbi = EXT4_SB(sb);
+	ext4_group_t group;
+	struct ext4_group_desc *gdp;
+	struct buffer_head *gdb_bh;
+	int gdb_off, gdb_num, err = 0;
+
+	group = input->group;
+
+	gdb_off = group % EXT4_DESC_PER_BLOCK(sb);
+	gdb_num = group / EXT4_DESC_PER_BLOCK(sb);
+
+	/*
+	 * get_write_access() has been called on gdb_bh by ext4_add_new_desc().
+	 */
+	gdb_bh = sbi->s_group_desc[gdb_num];
+	/* Update group descriptor block for new group */
+	gdp = (struct ext4_group_desc *)((char *)gdb_bh->b_data +
+				 gdb_off * EXT4_DESC_SIZE(sb));
+
+	memset(gdp, 0, EXT4_DESC_SIZE(sb));
+	 /* LV FIXME */
+	memset(gdp, 0, EXT4_DESC_SIZE(sb));
+	ext4_block_bitmap_set(sb, gdp, input->block_bitmap); /* LV FIXME */
+	ext4_inode_bitmap_set(sb, gdp, input->inode_bitmap); /* LV FIXME */
+	ext4_inode_table_set(sb, gdp, input->inode_table); /* LV FIXME */
+	ext4_free_blks_set(sb, gdp, input->free_blocks_count);
+	ext4_free_inodes_set(sb, gdp, EXT4_INODES_PER_GROUP(sb));
+	gdp->bg_flags = cpu_to_le16(EXT4_BG_INODE_ZEROED);
+	gdp->bg_checksum = ext4_group_desc_csum(sbi, input->group, gdp);
+
+	err = ext4_handle_dirty_metadata(handle, NULL, gdb_bh);
+	if (unlikely(err)) {
+		ext4_std_error(sb, err);
+		return err;
+	}
+
+	/*
+	 * We can allocate memory for mb_alloc based on the new group
+	 * descriptor
+	 */
+	err = ext4_mb_add_groupinfo(sb, group, gdp);
+
+	return err;
+}
+
 /* Add group descriptor data to an existing or new group descriptor block.
  * Ensure we handle all possible error conditions _before_ we start modifying
  * the filesystem, because we cannot abort the transaction and not have it
-- 
1.7.5.1
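
ext4_setup_new_desc() finds the descriptor slot inside the descriptor
block by pointer arithmetic, clears it, and then stores little-endian
fields.  A minimal user-space sketch of that step follows.  The helper
names are hypothetical, and the byte offset used for the flags field is
an assumption made for the illustration, not something stated in this
patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TOY_BG_INODE_ZEROED	0x0004	/* mirrors EXT4_BG_INODE_ZEROED */
#define TOY_BG_FLAGS_OFFSET	18	/* assumed offset of the flags field */

/* Address of slot @gdb_off in a descriptor block, as in the patch. */
static unsigned char *toy_desc_slot(unsigned char *gdb_data,
				    uint32_t gdb_off, uint32_t desc_size)
{
	assert(desc_size != 0);
	return gdb_data + (size_t)gdb_off * desc_size;
}

/* Store a 16-bit value little-endian, the job of cpu_to_le16(). */
static void toy_put_le16(unsigned char *p, uint16_t v)
{
	p[0] = (unsigned char)(v & 0xff);
	p[1] = (unsigned char)(v >> 8);
}

/* Clear one descriptor slot and set its flags; neighbours stay intact. */
static void toy_setup_desc(unsigned char *gdb_data, uint32_t gdb_off,
			   uint32_t desc_size, uint16_t bg_flags)
{
	unsigned char *gdp = toy_desc_slot(gdb_data, gdb_off, desc_size);

	memset(gdp, 0, desc_size);
	toy_put_le16(gdp + TOY_BG_FLAGS_OFFSET, bg_flags);
}
```

Only the bytes of the selected slot are touched, which is why the
function can safely run on a descriptor block that already holds live
descriptors for earlier groups.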



* [PATCH 04/13] ext4: add a function which updates super block
  2011-08-11  3:28 New resize interface implementation Yongqiang Yang
                   ` (2 preceding siblings ...)
  2011-08-11  3:28 ` [PATCH 03/13] ext4: add a function which sets up a new group desc Yongqiang Yang
@ 2011-08-11  3:28 ` Yongqiang Yang
  2011-08-11  3:28 ` [PATCH 05/13] ext4: add a structure which will be used by 64bit-resize interface Yongqiang Yang
                   ` (9 subsequent siblings)
  13 siblings, 0 replies; 25+ messages in thread
From: Yongqiang Yang @ 2011-08-11  3:28 UTC (permalink / raw)
  To: linux-ext4; +Cc: aedilger, tytso, Yongqiang Yang

This patch adds a function named ext4_update_super() which updates
super block and whose code is copied from ext4_group_add().

The function will be used by new resize implementation.

Signed-off-by: Yongqiang Yang <xiaoqiangnk@gmail.com>
---
 fs/ext4/resize.c |   72 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 files changed, 72 insertions(+), 0 deletions(-)

diff --git a/fs/ext4/resize.c b/fs/ext4/resize.c
index 6320baa..14be865 100644
--- a/fs/ext4/resize.c
+++ b/fs/ext4/resize.c
@@ -831,6 +831,78 @@ static int ext4_setup_new_desc(handle_t *handle, struct super_block *sb,
 	return err;
 }
 
+/*
+ * ext4_update_super() updates super so that new the added group can be seen
+ *   by the filesystem.
+ *
+ * @sb: super block
+ */
+static void ext4_update_super(struct super_block *sb,
+			      struct ext4_new_group_data *input)
+{
+	struct ext4_sb_info *sbi = EXT4_SB(sb);
+	struct ext4_super_block *es = sbi->s_es;
+
+	/*
+	 * Make the new blocks and inodes valid next.  We do this before
+	 * increasing the group count so that once the group is enabled,
+	 * all of its blocks and inodes are already valid.
+	 *
+	 * We always allocate group-by-group, then block-by-block or
+	 * inode-by-inode within a group, so enabling these
+	 * blocks/inodes before the group is live won't actually let us
+	 * allocate the new space yet.
+	 */
+	ext4_blocks_count_set(es, ext4_blocks_count(es) +
+		input->blocks_count);
+	le32_add_cpu(&es->s_inodes_count, EXT4_INODES_PER_GROUP(sb));
+
+	/*
+	 * We need to protect s_groups_count against other CPUs seeing
+	 * inconsistent state in the superblock.
+	 *
+	 * The precise rules we use are:
+	 *
+	 * * Writers must perform a smp_wmb() after updating all dependent
+	 *   data and before modifying the groups count
+	 *
+	 * * Readers must perform an smp_rmb() after reading the groups count
+	 *   and before reading any dependent data.
+	 *
+	 * NB. These rules can be relaxed when checking the group count
+	 * while freeing data, as we can only allocate from a block
+	 * group after serialising against the group count, and we can
+	 * only then free after serialising in turn against that
+	 * allocation.
+	 */
+	smp_wmb();
+
+	/* Update the global fs size fields */
+	sbi->s_groups_count++;
+
+	/* Update the reserved block counts only once the new group is
+	 * active. */
+	ext4_r_blocks_count_set(es, ext4_r_blocks_count(es) +
+		input->reserved_blocks);
+
+	/* Update the free space counts */
+	percpu_counter_add(&sbi->s_freeblocks_counter,
+			   input->free_blocks_count);
+	percpu_counter_add(&sbi->s_freeinodes_counter,
+			   EXT4_INODES_PER_GROUP(sb));
+
+	if (EXT4_HAS_INCOMPAT_FEATURE(sb, EXT4_FEATURE_INCOMPAT_FLEX_BG) &&
+	    sbi->s_log_groups_per_flex) {
+		ext4_group_t flex_group;
+		flex_group = ext4_flex_group(sbi, input->group);
+		atomic_add(input->free_blocks_count,
+			   &sbi->s_flex_groups[flex_group].free_blocks);
+		atomic_add(EXT4_INODES_PER_GROUP(sb),
+			   &sbi->s_flex_groups[flex_group].free_inodes);
+	}
+
+}
+
 /* Add group descriptor data to an existing or new group descriptor block.
  * Ensure we handle all possible error conditions _before_ we start modifying
  * the filesystem, because we cannot abort the transaction and not have it
-- 
1.7.5.1
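
The ordering rule in the ext4_update_super() comment (update dependent
data, barrier, then bump the group count; readers do the reverse) can
be modelled in user space with C11 atomics.  This is only an analogue:
a release store and an acquire load stand in for the smp_wmb() and
smp_rmb() pairing, and struct toy_sb and the function names are
hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

struct toy_sb {
	_Atomic uint64_t blocks_count;	/* dependent data */
	_Atomic uint32_t groups_count;	/* published last */
};

static void toy_sb_init(struct toy_sb *sb, uint64_t blocks, uint32_t groups)
{
	atomic_init(&sb->blocks_count, blocks);
	atomic_init(&sb->groups_count, groups);
}

/* Writer side.  Assumes resizes are serialized (one writer at a time). */
static void toy_publish_group(struct toy_sb *sb, uint64_t new_blocks)
{
	uint32_t groups;

	assert(sb != NULL);

	/* 1. Update everything that depends on the group count. */
	atomic_fetch_add_explicit(&sb->blocks_count, new_blocks,
				  memory_order_relaxed);

	/* 2. Publish: the release store orders step 1 before the count. */
	groups = atomic_load_explicit(&sb->groups_count,
				      memory_order_relaxed);
	atomic_store_explicit(&sb->groups_count, groups + 1,
			      memory_order_release);
}

/* Reader side: read the count first, then the dependent data. */
static uint32_t toy_read_groups(struct toy_sb *sb, uint64_t *blocks)
{
	uint32_t groups;

	groups = atomic_load_explicit(&sb->groups_count,
				      memory_order_acquire);
	*blocks = atomic_load_explicit(&sb->blocks_count,
				       memory_order_relaxed);
	return groups;
}
```

A reader that observes the new group count is then guaranteed to see a
block count that already includes that group, which is the property the
comment in the patch asks writers and readers to preserve.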



* [PATCH 05/13] ext4: add a structure which will be used by 64bit-resize interface
  2011-08-11  3:28 New resize interface implementation Yongqiang Yang
                   ` (3 preceding siblings ...)
  2011-08-11  3:28 ` [PATCH 04/13] ext4: add a function which updates super block Yongqiang Yang
@ 2011-08-11  3:28 ` Yongqiang Yang
  2011-08-11 10:57   ` Steven Liu
  2011-08-11  3:28 ` [PATCH 06/13] ext4: add a function which sets up group blocks of a flex groups Yongqiang Yang
                   ` (8 subsequent siblings)
  13 siblings, 1 reply; 25+ messages in thread
From: Yongqiang Yang @ 2011-08-11  3:28 UTC (permalink / raw)
  To: linux-ext4; +Cc: aedilger, tytso, Yongqiang Yang

This patch adds a structure which will be used by the 64bit-resize
interface.  Two functions, which allocate and destroy the structure
respectively, are also added.

Signed-off-by: Yongqiang Yang <xiaoqiangnk@gmail.com>
---
 fs/ext4/resize.c |   56 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 files changed, 56 insertions(+), 0 deletions(-)

diff --git a/fs/ext4/resize.c b/fs/ext4/resize.c
index 14be865..c586e51 100644
--- a/fs/ext4/resize.c
+++ b/fs/ext4/resize.c
@@ -134,6 +134,62 @@ static int verify_group_input(struct super_block *sb,
 	return err;
 }
 
+/*
+ * ext4_new_flex_group_data is used by 64bit-resize interface to add a flex
+ * group each time.
+ */
+struct ext4_new_flex_group_data {
+	struct ext4_new_group_data *groups;	/* new_group_data for groups
+						   in the flex group */
+	__u16 *bg_flags;			/* block group flags of groups
+						   in @groups */
+	ext4_group_t count;			/* number of groups in @groups
+						 */
+};
+
+/*
+ * alloc_flex_gd() allocates an ext4_new_flex_group_data with size of
+ * @flexbg_size.
+ *
+ * Returns NULL on failure, otherwise the address of the allocated structure.
+ */
+static struct ext4_new_flex_group_data *alloc_flex_gd(unsigned long flexbg_size)
+{
+	struct ext4_new_flex_group_data *flex_gd;
+
+	flex_gd = kmalloc(sizeof(*flex_gd), GFP_NOFS);
+	if (flex_gd == NULL)
+		goto out3;
+
+	flex_gd->count = flexbg_size;
+
+	flex_gd->groups = kmalloc(sizeof(struct ext4_new_group_data) *
+				  flexbg_size, GFP_NOFS);
+	if (flex_gd->groups == NULL)
+		goto out2;
+
+	flex_gd->bg_flags = kmalloc(flexbg_size * sizeof(__u16), GFP_NOFS);
+	if (flex_gd->bg_flags == NULL)
+		goto out1;
+
+	return flex_gd;
+
+out1:
+	kfree(flex_gd->bg_flags);
+out2:
+	kfree(flex_gd->groups);
+out3:
+	kfree(flex_gd);
+	return NULL;
+}
+
+void free_flex_gd(struct ext4_new_flex_group_data *flex_gd)
+{
+	kfree(flex_gd->bg_flags);
+	kfree(flex_gd->groups);
+	kfree(flex_gd);
+}
+
 static struct buffer_head *bclean(handle_t *handle, struct super_block *sb,
 				  ext4_fsblk_t blk)
 {
-- 
1.7.5.1
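
For readers who want to experiment with the allocate/destroy pair
outside the kernel, here is a minimal user-space sketch of the same
shape.  It is not the patch's code: malloc/calloc replace kmalloc, the
struct fields are simplified, and all toy_* names are hypothetical.
Each error label frees only what was successfully allocated before it.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

struct toy_group_data {
	uint32_t group;
	uint64_t blocks_count;
};

struct toy_flex_gd {
	struct toy_group_data *groups;	/* one entry per group */
	uint16_t *bg_flags;		/* flags for each entry in @groups */
	uint32_t count;			/* number of entries */
};

/* Returns NULL on failure, otherwise the allocated structure. */
static struct toy_flex_gd *toy_alloc_flex_gd(uint32_t flexbg_size)
{
	struct toy_flex_gd *gd;

	if (flexbg_size == 0)
		return NULL;

	gd = malloc(sizeof(*gd));
	if (gd == NULL)
		return NULL;
	gd->count = flexbg_size;

	/* calloc() checks the count * size multiplication for overflow. */
	gd->groups = calloc(flexbg_size, sizeof(*gd->groups));
	if (gd->groups == NULL)
		goto err_gd;

	gd->bg_flags = calloc(flexbg_size, sizeof(*gd->bg_flags));
	if (gd->bg_flags == NULL)
		goto err_groups;

	return gd;

err_groups:
	free(gd->groups);
err_gd:
	free(gd);
	return NULL;
}

static void toy_free_flex_gd(struct toy_flex_gd *gd)
{
	if (gd == NULL)
		return;
	free(gd->bg_flags);
	free(gd->groups);
	free(gd);
}
```

The sketch zero-fills both arrays, which the kmalloc-based version in
the patch does not; that is a deliberate simplification so the toy can
be inspected right after allocation.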



* [PATCH 06/13] ext4: add a function which sets up group blocks of a flex groups
  2011-08-11  3:28 New resize interface implementation Yongqiang Yang
                   ` (4 preceding siblings ...)
  2011-08-11  3:28 ` [PATCH 05/13] ext4: add a structure which will be used by 64bit-resize interface Yongqiang Yang
@ 2011-08-11  3:28 ` Yongqiang Yang
  2011-08-11  3:28 ` [PATCH 07/13] ext4: add a function which adds several group descriptors Yongqiang Yang
                   ` (7 subsequent siblings)
  13 siblings, 0 replies; 25+ messages in thread
From: Yongqiang Yang @ 2011-08-11  3:28 UTC (permalink / raw)
  To: linux-ext4; +Cc: aedilger, tytso, Yongqiang Yang

This patch adds a function named setup_new_flex_group_blocks() which
sets up the group blocks of a flex group.

Signed-off-by: Yongqiang Yang <xiaoqiangnk@gmail.com>
---
 fs/ext4/ext4.h   |    8 ++
 fs/ext4/resize.c |  249 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 257 insertions(+), 0 deletions(-)

diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
index e717dfd..334525d 100644
--- a/fs/ext4/ext4.h
+++ b/fs/ext4/ext4.h
@@ -491,6 +491,14 @@ struct ext4_new_group_data {
 	__u32 free_blocks_count;
 };
 
+/* Indexes used to index group tables in ext4_new_group_data */
+enum {
+	BLOCK_BITMAP = 0,	/* block bitmap */
+	INODE_BITMAP,		/* inode bitmap */
+	INODE_TABLE,		/* inode tables */
+	GROUP_TABLE_COUNT,
+};
+
 /*
  * Flags used by ext4_map_blocks()
  */
diff --git a/fs/ext4/resize.c b/fs/ext4/resize.c
index c586e51..4acf7a8 100644
--- a/fs/ext4/resize.c
+++ b/fs/ext4/resize.c
@@ -235,6 +235,255 @@ static int extend_or_restart_transaction(handle_t *handle, int thresh)
 }
 
 /*
+ * set_flexbg_block_bitmap() marks @count blocks starting from @block as used.
+ *
+ * Helper function for setup_new_flex_group_blocks().
+ *
+ * @sb: super block
+ * @handle: journal handle
+ * @flex_gd: flex group data
+ */
+static int set_flexbg_block_bitmap(struct super_block *sb, handle_t *handle,
+			struct ext4_new_flex_group_data *flex_gd,
+			ext4_fsblk_t block, ext4_group_t count)
+{
+	ext4_group_t count2;
+
+	ext4_debug("mark blocks [%llu/%u] used\n", block, count);
+	for (count2 = count; count > 0; count -= count2, block += count2) {
+		ext4_fsblk_t start;
+		struct buffer_head *bh;
+		ext4_group_t group;
+		int err;
+
+		ext4_get_group_no_and_offset(sb, block, &group, NULL);
+		start = ext4_group_first_block_no(sb, group);
+		group -= flex_gd->groups[0].group;
+
+		count2 = sb->s_blocksize * 8 - (block - start);
+		if (count2 > count)
+			count2 = count;
+
+		if (flex_gd->bg_flags[group] & EXT4_BG_BLOCK_UNINIT) {
+			BUG_ON(flex_gd->count > 1);
+			continue;
+		}
+
+		err = extend_or_restart_transaction(handle, 1);
+		if (err)
+			return err;
+
+		bh = sb_getblk(sb, flex_gd->groups[group].block_bitmap);
+		if (!bh)
+			return -EIO;
+
+		err = ext4_journal_get_write_access(handle, bh);
+		if (err)
+			return err;
+		ext4_debug("mark block bitmap %#04llx (+%llu/%u)\n", block,
+			   block - start, count2);
+		ext4_set_bits(bh->b_data, block - start, count2);
+
+		err = ext4_handle_dirty_metadata(handle, NULL, bh);
+		if (unlikely(err))
+			return err;
+		brelse(bh);
+	}
+
+	return 0;
+}
+
+/*
+ * Set up the block and inode bitmaps, and the inode table for the new groups.
+ * This doesn't need to be part of the main transaction, since we are only
+ * changing blocks outside the actual filesystem.  We still do journaling to
+ * ensure the recovery is correct in case of a failure just after resize.
+ * If any part of this fails, we simply abort the resize.
+ *
+ * setup_new_flex_group_blocks() handles a flex group as follows:
+ *  1. copy super block and GDT, and initialize group tables if necessary.
+ *     In this step, we only set bits in block bitmaps for blocks taken by
+ *     super block and GDT.
+ *  2. allocate group tables in block bitmaps, that is, set bits in block
+ *     bitmap for blocks taken by group tables.
+ */
+static int setup_new_flex_group_blocks(struct super_block *sb,
+				struct ext4_new_flex_group_data *flex_gd)
+{
+	int group_table_count[] = {1, 1, EXT4_SB(sb)->s_itb_per_group};
+	ext4_fsblk_t start;
+	ext4_fsblk_t block;
+	struct ext4_sb_info *sbi = EXT4_SB(sb);
+	struct ext4_super_block *es = sbi->s_es;
+	struct ext4_new_group_data *group_data = flex_gd->groups;
+	__u16 *bg_flags = flex_gd->bg_flags;
+	handle_t *handle;
+	ext4_group_t group, count;
+	struct buffer_head *bh = NULL;
+	int reserved_gdb, i, j, err = 0, err2;
+
+	BUG_ON(!flex_gd->count || !group_data ||
+	       group_data[0].group != sbi->s_groups_count);
+
+	reserved_gdb = le16_to_cpu(es->s_reserved_gdt_blocks);
+
+	/* This transaction may be extended/restarted along the way */
+	handle = ext4_journal_start_sb(sb, EXT4_MAX_TRANS_DATA);
+	if (IS_ERR(handle))
+		return PTR_ERR(handle);
+
+	group = group_data[0].group;
+	for (i = 0; i < flex_gd->count; i++, group++) {
+		unsigned long gdblocks;
+
+		gdblocks = ext4_bg_num_gdb(sb, group);
+		start = ext4_group_first_block_no(sb, group);
+
+		/* Copy all of the GDT blocks into the backup in this group */
+		for (j = 0, block = start + 1; j < gdblocks; j++, block++) {
+			struct buffer_head *gdb;
+
+			ext4_debug("update backup group %#04llx\n", block);
+			err = extend_or_restart_transaction(handle, 1);
+			if (err)
+				goto out;
+
+			gdb = sb_getblk(sb, block);
+			if (!gdb) {
+				err = -EIO;
+				goto out;
+			}
+
+			err = ext4_journal_get_write_access(handle, gdb);
+			if (err) {
+				brelse(gdb);
+				goto out;
+			}
+			memcpy(gdb->b_data, sbi->s_group_desc[j]->b_data,
+			       gdb->b_size);
+			set_buffer_uptodate(gdb);
+
+			err = ext4_handle_dirty_metadata(handle, NULL, gdb);
+			if (unlikely(err)) {
+				brelse(gdb);
+				goto out;
+			}
+			brelse(gdb);
+		}
+
+		/* Zero out all of the reserved backup group descriptor
+		 * table blocks
+		 */
+		if (ext4_bg_has_super(sb, group)) {
+			err = sb_issue_zeroout(sb, gdblocks + start + 1,
+					reserved_gdb, GFP_NOFS);
+			if (err)
+				goto out;
+		}
+
+		/* Initialize group tables of the group @group */
+		if (!(bg_flags[i] & EXT4_BG_INODE_ZEROED))
+			goto handle_bb;
+
+		/* Zero out all of the inode table blocks */
+		block = group_data[i].inode_table;
+		ext4_debug("clear inode table blocks %#04llx -> %#04lx\n",
+			    block, sbi->s_itb_per_group);
+		err = sb_issue_zeroout(sb, block, sbi->s_itb_per_group,
+				       GFP_NOFS);
+		if (err)
+			goto out;
+
+handle_bb:
+		if (bg_flags[i] & EXT4_BG_BLOCK_UNINIT)
+			goto handle_ib;
+
+		/* Initialize block bitmap of the @group */
+		block = group_data[i].block_bitmap;
+		err = extend_or_restart_transaction(handle, 1);
+		if (err)
+			goto out;
+
+		bh = bclean(handle, sb, block);
+		if (IS_ERR(bh)) {
+			err = PTR_ERR(bh);
+			goto out;
+		}
+		if (ext4_bg_has_super(sb, group)) {
+			ext4_debug("mark backup superblock %#04llx (+0)\n",
+				   start);
+			ext4_set_bits(bh->b_data, 0, gdblocks + reserved_gdb +
+						     1);
+		}
+		ext4_mark_bitmap_end(group_data[0].blocks_count,
+				     sb->s_blocksize * 8, bh->b_data);
+		err = ext4_handle_dirty_metadata(handle, NULL, bh);
+		if (err)
+			goto out;
+		brelse(bh);
+
+handle_ib:
+		if (bg_flags[i] & EXT4_BG_INODE_UNINIT)
+			continue;
+
+		/* Initialize inode bitmap of the @group */
+		block = group_data[i].inode_bitmap;
+		err = extend_or_restart_transaction(handle, 1);
+		if (err)
+			goto out;
+		/* Mark unused entries in inode bitmap used */
+		bh = bclean(handle, sb, block);
+		if (IS_ERR(bh)) {
+			err = PTR_ERR(bh);
+			goto out;
+		}
+		ext4_mark_bitmap_end(EXT4_INODES_PER_GROUP(sb),
+				     sb->s_blocksize * 8, bh->b_data);
+		err = ext4_handle_dirty_metadata(handle, NULL, bh);
+		if (err)
+			goto out;
+		brelse(bh);
+	}
+	bh = NULL;
+
+	/* Mark group tables in block bitmap */
+	for (j = 0; j < GROUP_TABLE_COUNT; j++) {
+		count = group_table_count[j];
+		start = (&group_data[0].block_bitmap)[j];
+		block = start;
+		for (i = 1; i < flex_gd->count; i++) {
+			block += group_table_count[j];
+			if (block == (&group_data[i].block_bitmap)[j]) {
+				count += group_table_count[j];
+				continue;
+			}
+			err = set_flexbg_block_bitmap(sb, handle,
+						flex_gd, start, count);
+			if (err)
+				goto out;
+			count = group_table_count[j];
+			start = (&group_data[i].block_bitmap)[j];
+			block = start;
+		}
+
+		if (count) {
+			err = set_flexbg_block_bitmap(sb, handle,
+						flex_gd, start, count);
+			if (err)
+				goto out;
+		}
+	}
+
+out:
+	brelse(bh);
+	err2 = ext4_journal_stop(handle);
+	if (err2 && !err)
+		err = err2;
+
+	return err;
+}
+
+/*
  * Set up the block and inode bitmaps, and the inode table for the new group.
  * This doesn't need to be part of the main transaction, since we are only
  * changing blocks outside the actual filesystem.  We still do journaling to
-- 
1.7.5.1
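
set_flexbg_block_bitmap() walks a block range and cuts it at block group
boundaries, so that each piece touches a single block bitmap.  A minimal
user-space sketch of that splitting follows.  It assumes the first data
block is 0 and that a group holds blocks_per_group blocks (block size
times 8 in the patch); struct toy_chunk and toy_split_range() are
hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One piece of a range that lies entirely inside one block group. */
struct toy_chunk {
	uint32_t group;		/* block group number */
	uint32_t offset;	/* first bit to set in that group's bitmap */
	uint32_t count;		/* number of bits to set */
};

/*
 * Split [@block, @block + @count) at group boundaries.  Writes at most
 * @max_out chunks to @out and returns how many were written.
 */
static size_t toy_split_range(uint64_t block, uint32_t count,
			      uint32_t blocks_per_group,
			      struct toy_chunk *out, size_t max_out)
{
	size_t n = 0;

	assert(blocks_per_group != 0);

	while (count > 0 && n < max_out) {
		uint32_t offset = (uint32_t)(block % blocks_per_group);
		/* Blocks left in this group, like count2 in the patch. */
		uint32_t chunk = blocks_per_group - offset;

		if (chunk > count)
			chunk = count;

		out[n].group = (uint32_t)(block / blocks_per_group);
		out[n].offset = offset;
		out[n].count = chunk;
		n++;

		block += chunk;
		count -= chunk;
	}
	return n;
}
```

With 32768 blocks per group, a 20-block range starting at block 32760
is split into 8 blocks at the end of group 0 and 12 blocks at the start
of group 1.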



* [PATCH 07/13] ext4: add a function which adds several group descriptors
  2011-08-11  3:28 New resize interface implementation Yongqiang Yang
                   ` (5 preceding siblings ...)
  2011-08-11  3:28 ` [PATCH 06/13] ext4: add a function which sets up group blocks of a flex groups Yongqiang Yang
@ 2011-08-11  3:28 ` Yongqiang Yang
  2011-08-11  3:28 ` [PATCH 08/13] ext4: add a function which sets up a flex groups each time Yongqiang Yang
                   ` (6 subsequent siblings)
  13 siblings, 0 replies; 25+ messages in thread
From: Yongqiang Yang @ 2011-08-11  3:28 UTC (permalink / raw)
  To: linux-ext4; +Cc: aedilger, tytso, Yongqiang Yang

This patch adds a function named ext4_add_new_descs() which adds
several group descriptors each time.

Signed-off-by: Yongqiang Yang <xiaoqiangnk@gmail.com>
---
 fs/ext4/resize.c |   25 +++++++++++++++++++++++++
 1 files changed, 25 insertions(+), 0 deletions(-)

diff --git a/fs/ext4/resize.c b/fs/ext4/resize.c
index 4acf7a8..7e91d58 100644
--- a/fs/ext4/resize.c
+++ b/fs/ext4/resize.c
@@ -1083,6 +1083,31 @@ out:
 }
 
 /*
+ * ext4_add_new_descs() adds @count group descriptors of groups
+ * starting at @group
+ *
+ * @handle: journal handle
+ * @sb: super block
+ * @group: the group no. of the first group desc to be added
+ * @resize_inode: the resize inode
+ * @count: number of group descriptors to be added
+ */
+static int ext4_add_new_descs(handle_t *handle, struct super_block *sb,
+			ext4_group_t group, struct inode *resize_inode,
+			ext4_group_t count)
+{
+	int i, err = 0;
+
+	for (i = 0; i < count; i++) {
+		err = ext4_add_new_desc(handle, sb, group + i, resize_inode);
+		if (err)
+			return err;
+	}
+
+	return err;
+}
+
+/*
  * ext4_setup_new_desc() sets up group descriptors specified by @input.
  *
  * @handle: journal handle
-- 
1.7.5.1



* [PATCH 08/13] ext4: add a function which sets up a flex groups each time
  2011-08-11  3:28 New resize interface implementation Yongqiang Yang
                   ` (6 preceding siblings ...)
  2011-08-11  3:28 ` [PATCH 07/13] ext4: add a function which adds several group descriptors Yongqiang Yang
@ 2011-08-11  3:28 ` Yongqiang Yang
  2011-08-11  3:28 ` [PATCH 09/13] ext4: enable ext4_update_super() to handle a flex groups Yongqiang Yang
                   ` (5 subsequent siblings)
  13 siblings, 0 replies; 25+ messages in thread
From: Yongqiang Yang @ 2011-08-11  3:28 UTC (permalink / raw)
  To: linux-ext4; +Cc: aedilger, tytso, Yongqiang Yang

This patch adds a function named ext4_setup_new_descs() which sets up
the group descriptors of a flex group each time.

Signed-off-by: Yongqiang Yang <xiaoqiangnk@gmail.com>
---
 fs/ext4/resize.c |   25 +++++++++++++++++++++++--
 1 files changed, 23 insertions(+), 2 deletions(-)

diff --git a/fs/ext4/resize.c b/fs/ext4/resize.c
index 7e91d58..5939b62 100644
--- a/fs/ext4/resize.c
+++ b/fs/ext4/resize.c
@@ -1114,7 +1114,8 @@ static int ext4_add_new_descs(handle_t *handle, struct super_block *sb,
  * @sb: super block
  */
 static int ext4_setup_new_desc(handle_t *handle, struct super_block *sb,
-			       struct ext4_new_group_data *input)
+			       struct ext4_new_group_data *input,
+			       __u16 bg_flags)
 {
 	struct ext4_sb_info *sbi = EXT4_SB(sb);
 	ext4_group_t group;
@@ -1143,7 +1144,7 @@ static int ext4_setup_new_desc(handle_t *handle, struct super_block *sb,
 	ext4_inode_table_set(sb, gdp, input->inode_table); /* LV FIXME */
 	ext4_free_blks_set(sb, gdp, input->free_blocks_count);
 	ext4_free_inodes_set(sb, gdp, EXT4_INODES_PER_GROUP(sb));
-	gdp->bg_flags = cpu_to_le16(EXT4_BG_INODE_ZEROED);
+	gdp->bg_flags = cpu_to_le16(bg_flags);
 	gdp->bg_checksum = ext4_group_desc_csum(sbi, input->group, gdp);
 
 	err = ext4_handle_dirty_metadata(handle, NULL, gdb_bh);
@@ -1162,6 +1163,26 @@ static int ext4_setup_new_desc(handle_t *handle, struct super_block *sb,
 }
 
 /*
+ * ext4_setup_new_descs() sets up the group descriptors of a flex group
+ */
+static int ext4_setup_new_descs(handle_t *handle, struct super_block *sb,
+				struct ext4_new_flex_group_data *flex_gd)
+{
+	struct ext4_new_group_data *group_data = flex_gd->groups;
+	__u16 *bg_flags = flex_gd->bg_flags;
+	int i, err = 0;
+
+	for (i = 0; i < flex_gd->count; i++) {
+		err = ext4_setup_new_desc(handle, sb, group_data + i,
+					  bg_flags[i]);
+		if (err)
+			return err;
+	}
+
+	return err;
+}
+
+/*
  * ext4_update_super() updates super so that new the added group can be seen
  *   by the filesystem.
  *
-- 
1.7.5.1



* [PATCH 09/13] ext4: enable ext4_update_super() to handle a flex groups
  2011-08-11  3:28 New resize interface implementation Yongqiang Yang
                   ` (7 preceding siblings ...)
  2011-08-11  3:28 ` [PATCH 08/13] ext4: add a function which sets up a flex groups each time Yongqiang Yang
@ 2011-08-11  3:28 ` Yongqiang Yang
  2011-08-11  3:28 ` [PATCH 10/13] ext4: pass verify_reserved_gdb() the number of group decriptors Yongqiang Yang
                   ` (4 subsequent siblings)
  13 siblings, 0 replies; 25+ messages in thread
From: Yongqiang Yang @ 2011-08-11  3:28 UTC (permalink / raw)
  To: linux-ext4; +Cc: aedilger, tytso, Yongqiang Yang

This patch enables ext4_update_super() to handle a flex group, i.e. to
account for several newly added groups in a single call.
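
For illustration only (not part of the patch): a minimal user-space sketch of
the accounting done for a flex group, summing the per-group counts and scaling
the reserved blocks by the existing reserved percentage.  The toy_* names are
hypothetical; the arithmetic follows the integer math in the diff below.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical per-group input, mirroring the fields the patch sums. */
struct toy_group {
	uint64_t blocks_count;
	uint64_t free_blocks_count;
};

struct toy_totals {
	uint64_t blocks;
	uint64_t free_blocks;
	uint64_t reserved;
};

/*
 * Sum the new groups and derive the extra reserved blocks from the
 * current reserved ratio, rounded down to a whole percent.
 */
struct toy_totals toy_sum_flex_group(const struct toy_group *g, unsigned count,
				     uint64_t old_blocks, uint64_t old_reserved)
{
	struct toy_totals t = { 0, 0, 0 };
	uint64_t percent;
	unsigned i;

	assert(count > 0 && old_blocks > 0);
	for (i = 0; i < count; i++) {
		t.blocks += g[i].blocks_count;
		t.free_blocks += g[i].free_blocks_count;
	}
	percent = old_reserved * 100 / old_blocks;
	t.reserved = percent * t.blocks / 100;
	return t;
}
```

Because the percentage is truncated to an integer first, a filesystem with,
say, 5.9% reserved would grow its reserve by only 5% of the added blocks.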

Signed-off-by: Yongqiang Yang <xiaoqiangnk@gmail.com>
---
 fs/ext4/resize.c |   58 +++++++++++++++++++++++++++++++++++++----------------
 1 files changed, 40 insertions(+), 18 deletions(-)

diff --git a/fs/ext4/resize.c b/fs/ext4/resize.c
index 5939b62..606de3a 100644
--- a/fs/ext4/resize.c
+++ b/fs/ext4/resize.c
@@ -1183,17 +1183,24 @@ static int ext4_setup_new_descs(handle_t *handle, struct super_block *sb,
 }
 
 /*
- * ext4_update_super() updates super so that new the added group can be seen
- *   by the filesystem.
+ * ext4_update_super() updates the super block so that newly added groups can
+ * be seen by the filesystem.
  *
  * @sb: super block
+ * @flex_gd: new added groups
  */
 static void ext4_update_super(struct super_block *sb,
-			      struct ext4_new_group_data *input)
+			     struct ext4_new_flex_group_data *flex_gd)
 {
+	ext4_fsblk_t blocks_count = 0;
+	ext4_fsblk_t free_blocks = 0;
+	ext4_fsblk_t reserved_blocks = 0;
+	struct ext4_new_group_data *group_data = flex_gd->groups;
 	struct ext4_sb_info *sbi = EXT4_SB(sb);
 	struct ext4_super_block *es = sbi->s_es;
+	int i;
 
+	BUG_ON(flex_gd->count == 0 || group_data == NULL);
 	/*
 	 * Make the new blocks and inodes valid next.  We do this before
 	 * increasing the group count so that once the group is enabled,
@@ -1204,9 +1211,19 @@ static void ext4_update_super(struct super_block *sb,
 	 * blocks/inodes before the group is live won't actually let us
 	 * allocate the new space yet.
 	 */
-	ext4_blocks_count_set(es, ext4_blocks_count(es) +
-		input->blocks_count);
-	le32_add_cpu(&es->s_inodes_count, EXT4_INODES_PER_GROUP(sb));
+	for (i = 0; i < flex_gd->count; i++) {
+		blocks_count += group_data[i].blocks_count;
+		free_blocks += group_data[i].free_blocks_count;
+	}
+
+	reserved_blocks = ext4_r_blocks_count(es) * 100;
+	do_div(reserved_blocks, ext4_blocks_count(es));
+	reserved_blocks *= blocks_count;
+	do_div(reserved_blocks, 100);
+
+	ext4_blocks_count_set(es, ext4_blocks_count(es) + blocks_count);
+	le32_add_cpu(&es->s_inodes_count, EXT4_INODES_PER_GROUP(sb) *
+		     flex_gd->count);
 
 	/*
 	 * We need to protect s_groups_count against other CPUs seeing
@@ -1214,11 +1231,11 @@ static void ext4_update_super(struct super_block *sb,
 	 *
 	 * The precise rules we use are:
 	 *
-	 * * Writers must perform a smp_wmb() after updating all dependent
-	 *   data and before modifying the groups count
+	 * * Writers must perform a smp_wmb() after updating all
+	 *   dependent data and before modifying the groups count
 	 *
-	 * * Readers must perform an smp_rmb() after reading the groups count
-	 *   and before reading any dependent data.
+	 * * Readers must perform an smp_rmb() after reading the groups
+	 *   count and before reading any dependent data.
 	 *
 	 * NB. These rules can be relaxed when checking the group count
 	 * while freeing data, as we can only allocate from a block
@@ -1229,29 +1246,34 @@ static void ext4_update_super(struct super_block *sb,
 	smp_wmb();
 
 	/* Update the global fs size fields */
-	sbi->s_groups_count++;
+	sbi->s_groups_count += flex_gd->count;
 
 	/* Update the reserved block counts only once the new group is
 	 * active. */
 	ext4_r_blocks_count_set(es, ext4_r_blocks_count(es) +
-		input->reserved_blocks);
+				reserved_blocks);
 
 	/* Update the free space counts */
 	percpu_counter_add(&sbi->s_freeblocks_counter,
-			   input->free_blocks_count);
+			   free_blocks);
 	percpu_counter_add(&sbi->s_freeinodes_counter,
-			   EXT4_INODES_PER_GROUP(sb));
+			   EXT4_INODES_PER_GROUP(sb) * flex_gd->count);
 
-	if (EXT4_HAS_INCOMPAT_FEATURE(sb, EXT4_FEATURE_INCOMPAT_FLEX_BG) &&
+	if (EXT4_HAS_INCOMPAT_FEATURE(sb,
+				      EXT4_FEATURE_INCOMPAT_FLEX_BG) &&
 	    sbi->s_log_groups_per_flex) {
 		ext4_group_t flex_group;
-		flex_group = ext4_flex_group(sbi, input->group);
-		atomic_add(input->free_blocks_count,
+		flex_group = ext4_flex_group(sbi, group_data[0].group);
+		atomic_add(free_blocks,
 			   &sbi->s_flex_groups[flex_group].free_blocks);
-		atomic_add(EXT4_INODES_PER_GROUP(sb),
+		atomic_add(EXT4_INODES_PER_GROUP(sb) * flex_gd->count,
 			   &sbi->s_flex_groups[flex_group].free_inodes);
 	}
 
+	if (test_opt(sb, DEBUG))
+		printk(KERN_DEBUG "EXT4-fs: added %u groups: "
+		       "%llu blocks (%llu free, %llu reserved)\n", flex_gd->count,
+		       blocks_count, free_blocks, reserved_blocks);
 }
 
 /* Add group descriptor data to an existing or new group descriptor block.
-- 
1.7.5.1



* [PATCH 10/13] ext4: pass verify_reserved_gdb() the number of group decriptors
  2011-08-11  3:28 New resize interface implementation Yongqiang Yang
                   ` (8 preceding siblings ...)
  2011-08-11  3:28 ` [PATCH 09/13] ext4: enable ext4_update_super() to handle a flex groups Yongqiang Yang
@ 2011-08-11  3:28 ` Yongqiang Yang
  2011-08-11  3:28 ` [PATCH 11/13] ext4: add a new function which allocates bitmaps and inode tables Yongqiang Yang
                   ` (3 subsequent siblings)
  13 siblings, 0 replies; 25+ messages in thread
From: Yongqiang Yang @ 2011-08-11  3:28 UTC (permalink / raw)
  To: linux-ext4; +Cc: aedilger, tytso, Yongqiang Yang

The 64-bit resizer adds a whole flex group at a time, so verify_reserved_gdb()
cannot use s_groups_count directly; it must instead be passed the number of
group descriptors that precede the group being added.
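
For illustration only (not part of the patch): a minimal user-space sketch of
why the upper bound matters.  It assumes a sparse_super layout, where backups
live in groups 1 and powers of 3, 5 and 7, and counts how many backup groups
lie below a given end.  The toy_* names are hypothetical.

```c
#include <assert.h>

/*
 * Return the next backup group number and advance whichever of the
 * three power sequences (3^n, 5^n, 7^n) produced it.
 */
unsigned toy_next_backup(unsigned *three, unsigned *five, unsigned *seven)
{
	unsigned *min = three;
	unsigned mult = 3;
	unsigned ret;

	if (*five < *min) {
		min = five;
		mult = 5;
	}
	if (*seven < *min) {
		min = seven;
		mult = 7;
	}
	ret = *min;
	*min *= mult;
	return ret;
}

/* Count the backup groups with a group number below end. */
unsigned toy_count_backups(unsigned end)
{
	unsigned three = 1, five = 5, seven = 7;
	unsigned n = 0;

	assert(end < 1000000);	/* keep the toy sequences far from overflow */
	while (toy_next_backup(&three, &five, &seven) < end)
		n++;
	return n;
}
```

The result depends directly on end, so checking against a stale or too-large
group count would look at the wrong set of backup groups.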

Signed-off-by: Yongqiang Yang <xiaoqiangnk@gmail.com>
---
 fs/ext4/resize.c |    7 ++++---
 1 files changed, 4 insertions(+), 3 deletions(-)

diff --git a/fs/ext4/resize.c b/fs/ext4/resize.c
index 606de3a..86edf19 100644
--- a/fs/ext4/resize.c
+++ b/fs/ext4/resize.c
@@ -656,10 +656,10 @@ static unsigned ext4_list_backups(struct super_block *sb, unsigned *three,
  * groups in current filesystem that have BACKUPS, or -ve error code.
  */
 static int verify_reserved_gdb(struct super_block *sb,
+			       ext4_group_t end,
 			       struct buffer_head *primary)
 {
 	const ext4_fsblk_t blk = primary->b_blocknr;
-	const ext4_group_t end = EXT4_SB(sb)->s_groups_count;
 	unsigned three = 1;
 	unsigned five = 5;
 	unsigned seven = 7;
@@ -734,7 +734,7 @@ static int add_new_gdb(handle_t *handle, struct inode *inode,
 	if (!gdb_bh)
 		return -EIO;
 
-	gdbackups = verify_reserved_gdb(sb, gdb_bh);
+	gdbackups = verify_reserved_gdb(sb, group, gdb_bh);
 	if (gdbackups < 0) {
 		err = gdbackups;
 		goto exit_bh;
@@ -897,7 +897,8 @@ static int reserve_backup_gdb(handle_t *handle, struct inode *inode,
 			err = -EIO;
 			goto exit_bh;
 		}
-		if ((gdbackups = verify_reserved_gdb(sb, primary[res])) < 0) {
+		gdbackups = verify_reserved_gdb(sb, group, primary[res]);
+		if (gdbackups < 0) {
 			brelse(primary[res]);
 			err = gdbackups;
 			goto exit_bh;
-- 
1.7.5.1



* [PATCH 11/13] ext4: add a new function which allocates bitmaps and inode tables
  2011-08-11  3:28 New resize interface implementation Yongqiang Yang
                   ` (9 preceding siblings ...)
  2011-08-11  3:28 ` [PATCH 10/13] ext4: pass verify_reserved_gdb() the number of group decriptors Yongqiang Yang
@ 2011-08-11  3:28 ` Yongqiang Yang
  2011-08-11  3:28 ` [PATCH 12/13] ext4: add a new function which adds a flex group to a fs Yongqiang Yang
                   ` (2 subsequent siblings)
  13 siblings, 0 replies; 25+ messages in thread
From: Yongqiang Yang @ 2011-08-11  3:28 UTC (permalink / raw)
  To: linux-ext4; +Cc: aedilger, tytso, Yongqiang Yang

This patch adds a new function named ext4_alloc_group_tables(), which
allocates block bitmaps, inode bitmaps and inode tables for a flex group and
is used by the resize code.
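
For illustration only (not part of the patch): a simplified user-space sketch
of the packing order, with all block bitmaps first, then all inode bitmaps,
then all inode tables, laid out contiguously.  It assumes the whole run fits
in one contiguous range and ignores superblock/GDT backups in later groups,
which the real function handles by moving on to the next group.  The toy_*
names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical per-group table locations. */
struct toy_group_tables {
	uint64_t block_bitmap;
	uint64_t inode_bitmap;
	uint64_t inode_table;
};

/*
 * Pack the tables of count groups starting at start_blk and return the
 * first block after the packed area.
 */
uint64_t toy_pack_group_tables(struct toy_group_tables *t, unsigned count,
			       uint64_t start_blk, unsigned itb_per_group)
{
	unsigned i;

	assert(t != 0 && count > 0);
	for (i = 0; i < count; i++)
		t[i].block_bitmap = start_blk++;
	for (i = 0; i < count; i++)
		t[i].inode_bitmap = start_blk++;
	for (i = 0; i < count; i++) {
		t[i].inode_table = start_blk;
		start_blk += itb_per_group;
	}
	return start_blk;
}
```

With 4 groups, a start block of 1000 and 512 inode table blocks per group,
the bitmaps occupy blocks 1000-1007 and the inode tables start at 1008.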

Signed-off-by: Yongqiang Yang <xiaoqiangnk@gmail.com>
---
 fs/ext4/resize.c |  111 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 files changed, 111 insertions(+), 0 deletions(-)

diff --git a/fs/ext4/resize.c b/fs/ext4/resize.c
index 86edf19..d4b892f 100644
--- a/fs/ext4/resize.c
+++ b/fs/ext4/resize.c
@@ -190,6 +190,117 @@ void free_flex_gd(struct ext4_new_flex_group_data *flex_gd)
 	kfree(flex_gd);
 }
 
+/*
+ * ext4_alloc_group_tables() allocates block bitmaps, inode bitmaps and
+ * inode tables for a flex group.
+ *
+ * This function is used by 64bit-resize.  Note that this function allocates
+ * group tables from the 1st group of the groups contained in @flex_gd, which
+ * may be only part of a flex group.
+ *
+ * @sb: super block of the fs to which the groups belong
+ */
+static void ext4_alloc_group_tables(struct super_block *sb,
+				struct ext4_new_flex_group_data *flex_gd,
+				int flexbg_size)
+{
+	struct ext4_new_group_data *group_data = flex_gd->groups;
+	struct ext4_super_block *es = EXT4_SB(sb)->s_es;
+	ext4_fsblk_t start_blk;
+	ext4_fsblk_t last_blk;
+	ext4_group_t src_group;
+	ext4_group_t bb_index = 0;
+	ext4_group_t ib_index = 0;
+	ext4_group_t it_index = 0;
+	ext4_group_t group;
+	ext4_group_t last_group;
+	unsigned overhead;
+
+	BUG_ON(flex_gd->count == 0 || group_data == NULL);
+
+	src_group = group_data[0].group;
+	last_group  = src_group + flex_gd->count - 1;
+
+	BUG_ON((flexbg_size > 1) && ((src_group & ~(flexbg_size - 1)) !=
+	       (last_group & ~(flexbg_size - 1))));
+next_group:
+	group = group_data[0].group;
+	start_blk = ext4_group_first_block_no(sb, src_group);
+	last_blk = start_blk + group_data[src_group - group].blocks_count;
+
+	overhead = ext4_bg_has_super(sb, src_group) ?
+		   (1 + ext4_bg_num_gdb(sb, src_group) +
+		    le16_to_cpu(es->s_reserved_gdt_blocks)) : 0;
+
+	start_blk += overhead;
+
+	BUG_ON(src_group >= group_data[0].group + flex_gd->count);
+	/* We collect contiguous blocks as much as possible. */
+	src_group++;
+	for (; src_group <= last_group; src_group++)
+		if (!ext4_bg_has_super(sb, src_group))
+			last_blk += group_data[src_group - group].blocks_count;
+		else
+			break;
+
+	/* Allocate block bitmaps */
+	for (; bb_index < flex_gd->count; bb_index++) {
+		if (start_blk >= last_blk)
+			goto next_group;
+		group_data[bb_index].block_bitmap = start_blk++;
+		ext4_get_group_no_and_offset(sb, start_blk - 1, &group, NULL);
+		group -= group_data[0].group;
+		group_data[group].free_blocks_count--;
+		if (flexbg_size > 1)
+			flex_gd->bg_flags[group] &= ~EXT4_BG_BLOCK_UNINIT;
+	}
+
+	/* Allocate inode bitmaps */
+	for (; ib_index < flex_gd->count; ib_index++) {
+		if (start_blk >= last_blk)
+			goto next_group;
+		group_data[ib_index].inode_bitmap = start_blk++;
+		ext4_get_group_no_and_offset(sb, start_blk - 1, &group, NULL);
+		group -= group_data[0].group;
+		group_data[group].free_blocks_count--;
+		if (flexbg_size > 1)
+			flex_gd->bg_flags[group] &= ~EXT4_BG_BLOCK_UNINIT;
+	}
+
+	/* Allocate inode tables */
+	for (; it_index < flex_gd->count; it_index++) {
+		if (start_blk + EXT4_SB(sb)->s_itb_per_group > last_blk)
+			goto next_group;
+		group_data[it_index].inode_table = start_blk;
+		ext4_get_group_no_and_offset(sb, start_blk, &group, NULL);
+		group -= group_data[0].group;
+		group_data[group].free_blocks_count -=
+					EXT4_SB(sb)->s_itb_per_group;
+		if (flexbg_size > 1)
+			flex_gd->bg_flags[group] &= ~EXT4_BG_BLOCK_UNINIT;
+
+		start_blk += EXT4_SB(sb)->s_itb_per_group;
+	}
+
+	if (test_opt(sb, DEBUG)) {
+		int i;
+		group = group_data[0].group;
+
+		printk(KERN_DEBUG "EXT4-fs: adding a flex group with "
+		       "%d groups, flexbg size is %d:\n", flex_gd->count,
+		       flexbg_size);
+
+		for (i = 0; i < flex_gd->count; i++) {
+			printk(KERN_DEBUG "adding %s group %u: %u "
+			       "blocks (%d free)\n",
+			       ext4_bg_has_super(sb, group + i) ? "normal" :
+			       "no-super", group + i,
+			       group_data[i].blocks_count,
+			       group_data[i].free_blocks_count);
+		}
+	}
+}
+
 static struct buffer_head *bclean(handle_t *handle, struct super_block *sb,
 				  ext4_fsblk_t blk)
 {
-- 
1.7.5.1



* [PATCH 12/13] ext4: add a new function which adds a flex group to a fs
  2011-08-11  3:28 New resize interface implementation Yongqiang Yang
                   ` (10 preceding siblings ...)
  2011-08-11  3:28 ` [PATCH 11/13] ext4: add a new function which allocates bitmaps and inode tables Yongqiang Yang
@ 2011-08-11  3:28 ` Yongqiang Yang
  2011-08-11  3:28 ` [PATCH 13/13] ext4: add new online resize interface Yongqiang Yang
       [not found] ` <CAKgsxVQAwEetdyGcOciA8+wi_eLA7Fmq60kfDvEK1WzgYxdUMQ@mail.gmail.com>
  13 siblings, 0 replies; 25+ messages in thread
From: Yongqiang Yang @ 2011-08-11  3:28 UTC (permalink / raw)
  To: linux-ext4; +Cc: aedilger, tytso, Yongqiang Yang

This patch adds a new function named ext4_flex_group_add() which adds a
flex group to a fs.  The function is used by 64bit-resize interface.

Signed-off-by: Yongqiang Yang <xiaoqiangnk@gmail.com>
---
 fs/ext4/resize.c |   82 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 files changed, 82 insertions(+), 0 deletions(-)

diff --git a/fs/ext4/resize.c b/fs/ext4/resize.c
index d4b892f..14082c0 100644
--- a/fs/ext4/resize.c
+++ b/fs/ext4/resize.c
@@ -1787,3 +1787,85 @@ int ext4_group_extend(struct super_block *sb, struct ext4_super_block *es,
 exit_put:
 	return err;
 } /* ext4_group_extend */
+
+/* Add a flex group to an fs. Ensure we handle all possible error conditions
+ * _before_ we start modifying the filesystem, because we cannot abort the
+ * transaction and not have it write the data to disk.
+ */
+static int ext4_flex_group_add(struct super_block *sb,
+			       struct inode *resize_inode,
+			       struct ext4_new_flex_group_data *flex_gd)
+{
+	struct ext4_sb_info *sbi = EXT4_SB(sb);
+	struct ext4_super_block *es = sbi->s_es;
+	ext4_fsblk_t o_blocks_count;
+	ext4_grpblk_t last;
+	ext4_group_t group;
+	handle_t *handle;
+	unsigned reserved_gdb;
+	int err = 0, err2 = 0, credit;
+
+	BUG_ON(!flex_gd->count || !flex_gd->groups || !flex_gd->bg_flags);
+
+	reserved_gdb = le16_to_cpu(es->s_reserved_gdt_blocks);
+	o_blocks_count = ext4_blocks_count(es);
+	ext4_get_group_no_and_offset(sb, o_blocks_count, &group, &last);
+	BUG_ON(last);
+
+	err = setup_new_flex_group_blocks(sb, flex_gd);
+	if (err)
+		goto exit;
+	/*
+	 * We will always be modifying at least the superblock and  GDT
+	 * block.  If we are adding a group past the last current GDT block,
+	 * we will also modify the inode and the dindirect block.  If we
+	 * are adding a group with superblock/GDT backups  we will also
+	 * modify each of the reserved GDT dindirect blocks.
+	 */
+	credit = flex_gd->count * 4 + reserved_gdb;
+	handle = ext4_journal_start_sb(sb, credit);
+	if (IS_ERR(handle)) {
+		err = PTR_ERR(handle);
+		goto exit;
+	}
+
+	err = ext4_journal_get_write_access(handle, sbi->s_sbh);
+	if (err)
+		goto exit_journal;
+
+	group = flex_gd->groups[0].group;
+	BUG_ON(group != EXT4_SB(sb)->s_groups_count);
+	err = ext4_add_new_descs(handle, sb, group,
+				resize_inode, flex_gd->count);
+	if (err)
+		goto exit_journal;
+
+	err = ext4_setup_new_descs(handle, sb, flex_gd);
+	if (err)
+		goto exit_journal;
+
+	ext4_update_super(sb, flex_gd);
+
+	err = ext4_handle_dirty_super(handle, sb);
+
+exit_journal:
+	err2 = ext4_journal_stop(handle);
+	if (!err)
+		err = err2;
+
+	if (!err) {
+		int i;
+		update_backups(sb, sbi->s_sbh->b_blocknr, (char *)es,
+			       sizeof(struct ext4_super_block));
+		for (i = 0; i < flex_gd->count; i++, group++) {
+			struct buffer_head *gdb_bh;
+			int gdb_num;
+			gdb_num = group / EXT4_DESC_PER_BLOCK(sb);
+			gdb_bh = sbi->s_group_desc[gdb_num];
+			update_backups(sb, gdb_bh->b_blocknr, gdb_bh->b_data,
+				       gdb_bh->b_size);
+		}
+	}
+exit:
+	return err;
+}
-- 
1.7.5.1



* [PATCH 13/13] ext4: add new online resize interface
  2011-08-11  3:28 New resize interface implementation Yongqiang Yang
                   ` (11 preceding siblings ...)
  2011-08-11  3:28 ` [PATCH 12/13] ext4: add a new function which adds a flex group to a fs Yongqiang Yang
@ 2011-08-11  3:28 ` Yongqiang Yang
       [not found] ` <CAKgsxVQAwEetdyGcOciA8+wi_eLA7Fmq60kfDvEK1WzgYxdUMQ@mail.gmail.com>
  13 siblings, 0 replies; 25+ messages in thread
From: Yongqiang Yang @ 2011-08-11  3:28 UTC (permalink / raw)
  To: linux-ext4; +Cc: aedilger, tytso, Yongqiang Yang

This patch adds a new online resize interface whose input argument is a
64-bit integer giving the number of blocks in the resized fs.

In the new resize implementation, all work, such as allocating group tables,
is done by the kernel, so the new resize interface can support the flex_bg
feature and prepares the ground for supporting resize with features like
bigalloc and exclude bitmap.  User-space tools only pass in the new number of
blocks.
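
For illustration only (not part of the patch): a minimal sketch of how a
user-space tool might call the new ioctl.  The request number is copied from
the ext4.h hunk below; the toy_* helper names are hypothetical, and the file
descriptor is assumed to refer to a file or directory on the mounted
filesystem.

```c
#include <assert.h>
#include <stdint.h>
#include <sys/ioctl.h>
#include <linux/types.h>

/* Same definition as added to fs/ext4/ext4.h by this patch. */
#ifndef EXT4_IOC_RESIZE_FS
#define EXT4_IOC_RESIZE_FS	_IOW('f', 16, __u64)
#endif

/* Convert a target size in bytes to filesystem blocks, rounding down. */
__u64 toy_bytes_to_blocks(__u64 bytes, unsigned block_size)
{
	assert(block_size != 0);
	return bytes / block_size;
}

/*
 * Ask the kernel to grow the filesystem behind fd to n_blocks blocks.
 * Returns 0 on success, or -1 with errno set.
 */
int toy_resize_fs(int fd, __u64 n_blocks)
{
	return ioctl(fd, EXT4_IOC_RESIZE_FS, &n_blocks);
}
```

A caller would open() the mount point, compute the block count from the new
device size and the filesystem block size, and then call toy_resize_fs().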

We delay initializing the bitmaps and inode tables of added groups where
possible and add multiple groups (a flex group) at a time, so the new resize
is very fast, much like mkfs.
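
For illustration only (not part of the patch): a minimal sketch of how each
batch of groups is bounded.  A batch runs from the first new group to the end
of its flex group, clamped to the last group needed for the requested size.
The toy_* name is hypothetical; flexbg_size is assumed to be a power of two.

```c
#include <assert.h>

/*
 * Return how many groups to add in the next batch, given the first new
 * group, the last group required by the new size, and the flex group size.
 */
unsigned toy_next_batch(unsigned first_group, unsigned last_wanted,
			unsigned flexbg_size)
{
	unsigned last_group;

	assert(flexbg_size != 0 && (flexbg_size & (flexbg_size - 1)) == 0);
	assert(first_group <= last_wanted);

	/* Last group of the flex group that contains first_group. */
	last_group = first_group | (flexbg_size - 1);
	if (last_group > last_wanted)
		last_group = last_wanted;

	return last_group - first_group + 1;
}
```

With a flex group size of 16, starting at group 165 gives a first batch of 11
groups (165 to 175); later batches are 16 groups each until the final, possibly
shorter, one.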

Signed-off-by: Yongqiang Yang <xiaoqiangnk@gmail.com>
---
 Documentation/filesystems/ext4.txt |    7 ++
 fs/ext4/ext4.h                     |    2 +
 fs/ext4/ioctl.c                    |   31 +++++++
 fs/ext4/resize.c                   |  171 ++++++++++++++++++++++++++++++++++++
 4 files changed, 211 insertions(+), 0 deletions(-)

diff --git a/Documentation/filesystems/ext4.txt b/Documentation/filesystems/ext4.txt
index 232a575..d1548ab 100644
--- a/Documentation/filesystems/ext4.txt
+++ b/Documentation/filesystems/ext4.txt
@@ -590,6 +590,13 @@ Table of Ext4 specific ioctls
 			      behaviour may change in the future as it is
 			      not necessary and has been done this way only
 			      for sake of simplicity.
+
+ EXT4_IOC_RESIZE_FS	      Resize the filesystem to a new size.  The number
+			      of blocks of resized filesystem is passed in via
+			      64 bit integer argument.  The kernel allocates
+			      bitmaps and inode table, the userspace tool thus
+			      just passes the new number of blocks.
+
 ..............................................................................
 
 References
diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
index 334525d..4c36b92 100644
--- a/fs/ext4/ext4.h
+++ b/fs/ext4/ext4.h
@@ -557,6 +557,7 @@ enum {
  /* note ioctl 11 reserved for filesystem-independent FIEMAP ioctl */
 #define EXT4_IOC_ALLOC_DA_BLKS		_IO('f', 12)
 #define EXT4_IOC_MOVE_EXT		_IOWR('f', 15, struct move_extent)
+#define EXT4_IOC_RESIZE_FS		_IOW('f', 16, __u64)
 
 #if defined(__KERNEL__) && defined(CONFIG_COMPAT)
 /*
@@ -1880,6 +1881,7 @@ extern int ext4_group_add(struct super_block *sb,
 extern int ext4_group_extend(struct super_block *sb,
 				struct ext4_super_block *es,
 				ext4_fsblk_t n_blocks_count);
+extern int ext4_resize_fs(struct super_block *sb, ext4_fsblk_t n_blocks_count);
 
 /* super.c */
 extern void *ext4_kvmalloc(size_t size, gfp_t flags);
diff --git a/fs/ext4/ioctl.c b/fs/ext4/ioctl.c
index f18bfe3..eeaa1a4 100644
--- a/fs/ext4/ioctl.c
+++ b/fs/ext4/ioctl.c
@@ -335,6 +335,37 @@ mext_out:
 		return err;
 	}
 
+	case EXT4_IOC_RESIZE_FS: {
+		ext4_fsblk_t n_blocks_count;
+		struct super_block *sb = inode->i_sb;
+		int err = 0, err2 = 0;
+
+		err = ext4_resize_begin(sb);
+		if (err)
+			return err;
+
+		if (copy_from_user(&n_blocks_count, (__u64 __user *)arg,
+				   sizeof(__u64)))
+			return -EFAULT;
+
+		err = mnt_want_write(filp->f_path.mnt);
+		if (err)
+			return err;
+
+		err = ext4_resize_fs(sb, n_blocks_count);
+		if (EXT4_SB(sb)->s_journal) {
+			jbd2_journal_lock_updates(EXT4_SB(sb)->s_journal);
+			err2 = jbd2_journal_flush(EXT4_SB(sb)->s_journal);
+			jbd2_journal_unlock_updates(EXT4_SB(sb)->s_journal);
+		}
+		if (err == 0)
+			err = err2;
+		mnt_drop_write(filp->f_path.mnt);
+		ext4_resize_end(sb);
+
+		return err;
+	}
+
 	case FITRIM:
 	{
 		struct super_block *sb = inode->i_sb;
diff --git a/fs/ext4/resize.c b/fs/ext4/resize.c
index 14082c0..84e5ea3 100644
--- a/fs/ext4/resize.c
+++ b/fs/ext4/resize.c
@@ -1869,3 +1869,174 @@ exit_journal:
 exit:
 	return err;
 }
+
+static int ext4_setup_next_flex_gd(struct super_block *sb,
+				    struct ext4_new_flex_group_data *flex_gd,
+				    ext4_fsblk_t n_blocks_count,
+				    unsigned long flexbg_size)
+{
+	struct ext4_super_block *es = EXT4_SB(sb)->s_es;
+	struct ext4_new_group_data *group_data = flex_gd->groups;
+	ext4_fsblk_t o_blocks_count;
+	ext4_group_t n_group;
+	ext4_group_t group;
+	ext4_group_t last_group;
+	ext4_grpblk_t last;
+	ext4_grpblk_t blocks_per_group;
+	unsigned long i;
+
+	blocks_per_group = EXT4_BLOCKS_PER_GROUP(sb);
+
+	o_blocks_count = ext4_blocks_count(es);
+
+	if (o_blocks_count == n_blocks_count)
+		return 0;
+
+	ext4_get_group_no_and_offset(sb, o_blocks_count, &group, &last);
+	BUG_ON(last);
+	ext4_get_group_no_and_offset(sb, n_blocks_count - 1, &n_group, &last);
+
+	last_group = group | (flexbg_size - 1);
+	if (last_group > n_group)
+		last_group = n_group;
+
+	flex_gd->count = last_group - group + 1;
+
+	for (i = 0; i < flex_gd->count; i++) {
+		int overhead;
+
+		group_data[i].group = group + i;
+		group_data[i].blocks_count = blocks_per_group;
+		overhead = ext4_bg_has_super(sb, group + i) ?
+			   (1 + ext4_bg_num_gdb(sb, group + i) +
+			    le16_to_cpu(es->s_reserved_gdt_blocks)) : 0;
+		group_data[i].free_blocks_count = blocks_per_group - overhead;
+		if (EXT4_HAS_RO_COMPAT_FEATURE(sb,
+					       EXT4_FEATURE_RO_COMPAT_GDT_CSUM))
+			flex_gd->bg_flags[i] = EXT4_BG_BLOCK_UNINIT |
+					       EXT4_BG_INODE_UNINIT;
+		else
+			flex_gd->bg_flags[i] = EXT4_BG_INODE_ZEROED;
+	}
+
+	if ((last_group == n_group) && (last != blocks_per_group - 1)) {
+		group_data[i - 1].blocks_count = last + 1;
+		group_data[i - 1].free_blocks_count -= blocks_per_group-
+					last - 1;
+	}
+
+	return 1;
+}
+
+/*
+ * ext4_resize_fs() resizes a fs to new size specified by @n_blocks_count
+ *
+ * @sb: super block of the fs to be resized
+ * @n_blocks_count: the number of blocks in the resized fs
+ */
+int ext4_resize_fs(struct super_block *sb, ext4_fsblk_t n_blocks_count)
+{
+	struct ext4_new_flex_group_data *flex_gd = NULL;
+	struct ext4_sb_info *sbi = EXT4_SB(sb);
+	struct ext4_super_block *es = sbi->s_es;
+	struct buffer_head *bh;
+	struct inode *resize_inode;
+	ext4_fsblk_t o_blocks_count;
+	ext4_group_t o_group;
+	ext4_group_t n_group;
+	ext4_grpblk_t offset;
+	unsigned long n_desc_blocks;
+	unsigned long o_desc_blocks;
+	unsigned long desc_blocks;
+	int err = 0, flexbg_size = 1;
+
+	o_blocks_count = ext4_blocks_count(es);
+
+	if (test_opt(sb, DEBUG))
+		printk(KERN_DEBUG "EXT4-fs: resizing filesystem from %llu "
+		       "upto %llu blocks\n", o_blocks_count, n_blocks_count);
+
+	if (n_blocks_count < o_blocks_count) {
+		/* On-line shrinking not supported */
+		ext4_warning(sb, "can't shrink FS - resize aborted");
+		return -EINVAL;
+	}
+
+	if (n_blocks_count == o_blocks_count)
+		/* Nothing to do */
+		return 0;
+
+	ext4_get_group_no_and_offset(sb, n_blocks_count - 1, &n_group, &offset);
+	ext4_get_group_no_and_offset(sb, o_blocks_count, &o_group, &offset);
+
+	n_desc_blocks = (n_group + EXT4_DESC_PER_BLOCK(sb)) /
+			EXT4_DESC_PER_BLOCK(sb);
+	o_desc_blocks = (sbi->s_groups_count + EXT4_DESC_PER_BLOCK(sb) - 1) /
+			EXT4_DESC_PER_BLOCK(sb);
+	desc_blocks = n_desc_blocks - o_desc_blocks;
+
+	if (desc_blocks &&
+	    (!EXT4_HAS_COMPAT_FEATURE(sb, EXT4_FEATURE_COMPAT_RESIZE_INODE) ||
+	     le16_to_cpu(es->s_reserved_gdt_blocks) < desc_blocks)) {
+		ext4_warning(sb, "No reserved GDT blocks, can't resize");
+		return -EPERM;
+	}
+
+	resize_inode = ext4_iget(sb, EXT4_RESIZE_INO);
+	if (IS_ERR(resize_inode)) {
+		ext4_warning(sb, "Error opening resize inode");
+		return PTR_ERR(resize_inode);
+	}
+
+	/* See if the device is actually as big as what was requested */
+	bh = sb_bread(sb, n_blocks_count - 1);
+	if (!bh) {
+		ext4_warning(sb, "can't read last block, resize aborted");
+		return -ENOSPC;
+	}
+	brelse(bh);
+
+	if (offset != 0) {
+		/* extend the last group */
+		ext4_grpblk_t add;
+		add = EXT4_BLOCKS_PER_GROUP(sb) - offset;
+		err = __ext4_group_extend(sb, o_blocks_count, add);
+		if (err)
+			goto out;
+	}
+
+	if (EXT4_HAS_INCOMPAT_FEATURE(sb, EXT4_FEATURE_INCOMPAT_FLEX_BG) &&
+	    es->s_log_groups_per_flex)
+		flexbg_size = 1 << es->s_log_groups_per_flex;
+
+	o_blocks_count = ext4_blocks_count(es);
+	if (o_blocks_count == n_blocks_count)
+		goto out;
+
+	flex_gd = alloc_flex_gd(flexbg_size);
+	if (flex_gd == NULL) {
+		err = -ENOMEM;
+		goto out;
+	}
+
+	/* Add flex groups. Note that a regular group is a
+	 * flex group with 1 group.
+	 */
+	while (ext4_setup_next_flex_gd(sb, flex_gd, n_blocks_count,
+					      flexbg_size)) {
+		ext4_alloc_group_tables(sb, flex_gd, flexbg_size);
+		err = ext4_flex_group_add(sb, resize_inode, flex_gd);
+		if (unlikely(err))
+			break;
+	}
+
+out:
+	if (flex_gd)
+		free_flex_gd(flex_gd);
+
+	iput(resize_inode);
+	if (test_opt(sb, DEBUG))
+		printk(KERN_DEBUG "EXT4-fs: resized filesystem from %llu "
+		       "upto %llu blocks\n", o_blocks_count, n_blocks_count);
+	return err;
+}
-- 
1.7.5.1



* Re: [PATCH 01/13] ext4: add a function which extends a group without checking parameters
  2011-08-11  3:28 ` [PATCH 01/13] ext4: add a function which extends a group without checking parameters Yongqiang Yang
@ 2011-08-11  5:47   ` Andreas Dilger
  2011-08-11  6:27     ` Yongqiang Yang
  0 siblings, 1 reply; 25+ messages in thread
From: Andreas Dilger @ 2011-08-11  5:47 UTC (permalink / raw)
  To: Yongqiang Yang; +Cc: linux-ext4 List, Theodore Ts'o

On 2011-08-10, at 9:28 PM, Yongqiang Yang wrote:
> This patch added a function named __ext4_group_extend() whose code
> is copied from ext4_group_extend().  __ext4_group_extend() assumes
> the parameter is valid and has been checked by caller.
> 
> __ext4_group_extend() will be used by new resize implementation. It
> can also be used by ext4_group_extend(), but this patch series does
> not do this.

Since this is duplicating a lot of code from ext4_group_extend(), this
patch should be written in such a way that this new function is added,
and the duplicate code is removed from ext4_group_extend() and calls
the new function instead.

It looks like all of these patches are adding a completely duplicate set
of functions for doing the resizing, even though they are largely the
same as the existing code, and it will mean duplicate efforts to maintain
both copies of the code.
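
A minimal sketch of the suggested structure, using toy types rather than the
real ext4 ones: the existing entry point keeps the parameter checks and then
calls the new unchecked helper, so the worker logic exists only once.  All
toy_* names are hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for the filesystem state. */
struct toy_fs {
	unsigned long long blocks_count;	/* current size in blocks */
	unsigned long long device_blocks;	/* size of the device */
};

/* Worker: assumes the caller has already validated add. */
int toy_group_extend_unchecked(struct toy_fs *fs, unsigned long long add)
{
	assert(fs != NULL);
	fs->blocks_count += add;
	return 0;
}

/* Checked entry point: validate, then delegate to the worker. */
int toy_group_extend(struct toy_fs *fs, unsigned long long n_blocks_count)
{
	if (n_blocks_count < fs->blocks_count)
		return -EINVAL;		/* shrinking is not supported */
	if (n_blocks_count > fs->device_blocks)
		return -ENOSPC;		/* device is too small */
	if (n_blocks_count == fs->blocks_count)
		return 0;		/* nothing to do */

	return toy_group_extend_unchecked(fs,
					  n_blocks_count - fs->blocks_count);
}
```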

> Signed-off-by: Yongqiang Yang <xiaoqiangnk@gmail.com>
> ---
> fs/ext4/resize.c |   53 +++++++++++++++++++++++++++++++++++++++++++++++++++++
> 1 files changed, 53 insertions(+), 0 deletions(-)
> 
> diff --git a/fs/ext4/resize.c b/fs/ext4/resize.c
> index 707d3f1..6ffbdb6 100644
> --- a/fs/ext4/resize.c
> +++ b/fs/ext4/resize.c
> @@ -969,6 +969,59 @@ exit_put:
> } /* ext4_group_add */
> 
> /*
> + * extend a group without checking assuming that checking has been done.
> + */
> +static int __ext4_group_extend(struct super_block *sb,
> +			       ext4_fsblk_t o_blocks_count, ext4_grpblk_t add)
> +{
> +	struct ext4_super_block *es = EXT4_SB(sb)->s_es;
> +	handle_t *handle;
> +	int err = 0, err2;
> +
> +	/* We will update the superblock, one block bitmap, and
> +	 * one group descriptor via ext4_ext4_group_add_blocks().

Typo here: "ext4_ext4"

> +	 */
> +	handle = ext4_journal_start_sb(sb, 3);
> +	if (IS_ERR(handle)) {
> +		err = PTR_ERR(handle);
> +		ext4_warning(sb, "error %d on journal start", err);
> +		goto out;
> +	}
> +
> +	err = ext4_journal_get_write_access(handle, EXT4_SB(sb)->s_sbh);
> +	if (err) {
> +		ext4_warning(sb, "error %d on journal write access", err);
> +		ext4_journal_stop(handle);
> +		goto out;
> +	}
> +
> +	ext4_blocks_count_set(es, o_blocks_count + add);
> +	ext4_debug("freeing blocks %llu through %llu\n", o_blocks_count,
> +		   o_blocks_count + add);
> +	/* We add the blocks to the bitmap and set the group need init bit */
> +	err = ext4_group_add_blocks(handle, sb, o_blocks_count, add);
> +	if (err)
> +		goto exit_journal;
> +	ext4_handle_dirty_super(handle, sb);
> +	ext4_debug("freed blocks %llu through %llu\n", o_blocks_count,
> +		   o_blocks_count + add);
> +exit_journal:
> +	err2 = ext4_journal_stop(handle);
> +	if (err2 && !err)
> +		err = err2;
> +
> +	if (!err) {
> +		if (test_opt(sb, DEBUG))
> +			printk(KERN_DEBUG "EXT4-fs: extended group to %llu "
> +			       "blocks\n", ext4_blocks_count(es));
> +		update_backups(sb, EXT4_SB(sb)->s_sbh->b_blocknr, (char *)es,
> +			       sizeof(struct ext4_super_block));
> +	}
> +out:
> +	return err;
> +}
> +
> +/*
>  * Extend the filesystem to the new number of blocks specified.  This entry
>  * point is only used to extend the current filesystem to the end of the last
>  * existing group.  It can be accessed via ioctl, or by "remount,resize=<size>"
> -- 
> 1.7.5.1
> 


Cheers, Andreas







* Re: [PATCH 02/13] ext4: add a function which adds a new desc to a fs
  2011-08-11  3:28 ` [PATCH 02/13] ext4: add a function which adds a new desc to a fs Yongqiang Yang
@ 2011-08-11  5:49   ` Andreas Dilger
  0 siblings, 0 replies; 25+ messages in thread
From: Andreas Dilger @ 2011-08-11  5:49 UTC (permalink / raw)
  To: Yongqiang Yang; +Cc: linux-ext4 List, Theodore Ts'o

On 2011-08-10, at 9:28 PM, Yongqiang Yang wrote:
> This patch adds a function named ext4_add_new_desc() which adds
> a new desc to a fs and whose code is copied from ext4_group_add().
> 
> The function will be used by new resize implementation.

Similarly, this is duplicating a hunk of code from the middle of
ext4_group_add(), and instead of just adding a second copy of that
code, ext4_group_add() should be changed to call this new function
to avoid the code duplication.

> Signed-off-by: Yongqiang Yang <xiaoqiangnk@gmail.com>
> ---
> fs/ext4/resize.c |   42 ++++++++++++++++++++++++++++++++++++++++++
> 1 files changed, 42 insertions(+), 0 deletions(-)
> 
> diff --git a/fs/ext4/resize.c b/fs/ext4/resize.c
> index 6ffbdb6..4fcd515 100644
> --- a/fs/ext4/resize.c
> +++ b/fs/ext4/resize.c
> @@ -735,6 +735,48 @@ exit_err:
> 	}
> }
> 
> +/*
> + * ext4_add_new_desc() adds group descriptor of group @group
> + *
> + * @handle: journal handle
> + * @sb; super block
> + * @group: the group no. of the first group desc to be added
> + * @resize_inode: the resize inode
> + */
> +static int ext4_add_new_desc(handle_t *handle, struct super_block *sb,
> +			     ext4_group_t group, struct inode *resize_inode)
> +{
> +	struct ext4_sb_info *sbi = EXT4_SB(sb);
> +	struct ext4_super_block *es = sbi->s_es;
> +	struct buffer_head *gdb_bh;
> +	int gdb_off, gdb_num, err = 0;
> +	int reserved_gdb = ext4_bg_has_super(sb, group) ?
> +		le16_to_cpu(es->s_reserved_gdt_blocks) : 0;
> +
> +	gdb_off = group % EXT4_DESC_PER_BLOCK(sb);
> +	gdb_num = group / EXT4_DESC_PER_BLOCK(sb);
> +
> +	/*
> +	 * We will only either add reserved group blocks to a backup group
> +	 * or remove reserved blocks for the first group in a new group block.
> +	 * Doing both would mean more complex code, and sane people don't
> +	 * use non-sparse filesystems anymore.  This is already checked above.
> +	 */
> +	if (gdb_off) {
> +		gdb_bh = sbi->s_group_desc[gdb_num];
> +		err = ext4_journal_get_write_access(handle, gdb_bh);
> +		if (err)
> +			goto out;
> +
> +		if (reserved_gdb && ext4_bg_num_gdb(sb, group))
> +			err = reserve_backup_gdb(handle, resize_inode, group);
> +	} else
> +		err = add_new_gdb(handle, resize_inode, group);
> +
> +out:
> +	return err;
> +}
> +
> /* Add group descriptor data to an existing or new group descriptor block.
>  * Ensure we handle all possible error conditions _before_ we start modifying
>  * the filesystem, because we cannot abort the transaction and not have it
> -- 
> 1.7.5.1
> 


Cheers, Andreas






^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 01/13] ext4: add a function which extends a group without checking parameters
  2011-08-11  5:47   ` Andreas Dilger
@ 2011-08-11  6:27     ` Yongqiang Yang
  2011-08-21 17:07       ` Ted Ts'o
  0 siblings, 1 reply; 25+ messages in thread
From: Yongqiang Yang @ 2011-08-11  6:27 UTC (permalink / raw)
  To: Andreas Dilger; +Cc: linux-ext4 List, Theodore Ts'o

On Thu, Aug 11, 2011 at 1:47 PM, Andreas Dilger <adilger@dilger.ca> wrote:
> On 2011-08-10, at 9:28 PM, Yongqiang Yang wrote:
>> This patch added a function named __ext4_group_extend() whose code
>> is copied from ext4_group_extend().  __ext4_group_extend() assumes
>> the parameter is valid and has been checked by caller.
>>
>> __ext4_group_extend() will be used by new resize implementation. It
>> can also be used by ext4_group_extend(), but this patch series does
>> not do this.
>
> Since this is duplicating a lot of code from ext4_group_extend(), this
> patch should be written in such a way that this new function is added,
> and the duplicate code is removed from ext4_group_extend() and calls
> the new function instead.
>
> It looks like all of these patches are adding a completely duplicate set
> of functions for doing the resizing, even though they are largely the
> same as the existing code, and it will mean duplicate efforts to maintain
> both copies of the code.
Yes, this needs some feedback.  I thought the old resize interface
would be removed once the new resize interface went upstream, so I
did not touch the old interface.  If we are going to keep the old
interface, I can write patches that make the old resize use the
common code instead.

Thank you for your review.
Yongqiang.
>
>> Signed-off-by: Yongqiang Yang <xiaoqiangnk@gmail.com>
>> ---
>> fs/ext4/resize.c |   53 +++++++++++++++++++++++++++++++++++++++++++++++++++++
>> 1 files changed, 53 insertions(+), 0 deletions(-)
>>
>> diff --git a/fs/ext4/resize.c b/fs/ext4/resize.c
>> index 707d3f1..6ffbdb6 100644
>> --- a/fs/ext4/resize.c
>> +++ b/fs/ext4/resize.c
>> @@ -969,6 +969,59 @@ exit_put:
>> } /* ext4_group_add */
>>
>> /*
>> + * extend a group without checking assuming that checking has been done.
>> + */
>> +static int __ext4_group_extend(struct super_block *sb,
>> +                            ext4_fsblk_t o_blocks_count, ext4_grpblk_t add)
>> +{
>> +     struct ext4_super_block *es = EXT4_SB(sb)->s_es;
>> +     handle_t *handle;
>> +     int err = 0, err2;
>> +
>> +     /* We will update the superblock, one block bitmap, and
>> +      * one group descriptor via ext4_ext4_group_add_blocks().
>
> Typo here: "ext4_ext4"
>
>> +      */
>> +     handle = ext4_journal_start_sb(sb, 3);
>> +     if (IS_ERR(handle)) {
>> +             err = PTR_ERR(handle);
>> +             ext4_warning(sb, "error %d on journal start", err);
>> +             goto out;
>> +     }
>> +
>> +     err = ext4_journal_get_write_access(handle, EXT4_SB(sb)->s_sbh);
>> +     if (err) {
>> +             ext4_warning(sb, "error %d on journal write access", err);
>> +             ext4_journal_stop(handle);
>> +             goto out;
>> +     }
>> +
>> +     ext4_blocks_count_set(es, o_blocks_count + add);
>> +     ext4_debug("freeing blocks %llu through %llu\n", o_blocks_count,
>> +                o_blocks_count + add);
>> +     /* We add the blocks to the bitmap and set the group need init bit */
>> +     err = ext4_group_add_blocks(handle, sb, o_blocks_count, add);
>> +     if (err)
>> +             goto exit_journal;
>> +     ext4_handle_dirty_super(handle, sb);
>> +     ext4_debug("freed blocks %llu through %llu\n", o_blocks_count,
>> +                o_blocks_count + add);
>> +exit_journal:
>> +     err2 = ext4_journal_stop(handle);
>> +     if (err2 && !err)
>> +             err = err2;
>> +
>> +     if (!err) {
>> +             if (test_opt(sb, DEBUG))
>> +                     printk(KERN_DEBUG "EXT4-fs: extended group to %llu "
>> +                            "blocks\n", ext4_blocks_count(es));
>> +             update_backups(sb, EXT4_SB(sb)->s_sbh->b_blocknr, (char *)es,
>> +                            sizeof(struct ext4_super_block));
>> +     }
>> +out:
>> +     return err;
>> +}
>> +
>> +/*
>>  * Extend the filesystem to the new number of blocks specified.  This entry
>>  * point is only used to extend the current filesystem to the end of the last
>>  * existing group.  It can be accessed via ioctl, or by "remount,resize=<size>"
>> --
>> 1.7.5.1
>>
>
>
> Cheers, Andreas
>
>
>
>
>
>



-- 
Best Wishes
Yongqiang Yang
--
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 03/13] ext4: add a function which sets up a new group desc
  2011-08-11  3:28 ` [PATCH 03/13] ext4: add a function which sets up a new group desc Yongqiang Yang
@ 2011-08-11  6:42   ` Andreas Dilger
  2011-08-12  8:49     ` Yongqiang Yang
  0 siblings, 1 reply; 25+ messages in thread
From: Andreas Dilger @ 2011-08-11  6:42 UTC (permalink / raw)
  To: Yongqiang Yang; +Cc: linux-ext4 List, Theodore Ts'o

On 2011-08-10, at 9:28 PM, Yongqiang Yang wrote:
> This patch adds a function named ext4_setup_new_desc() which sets
> up a new group descriptor and whose code is copied from ext4_group_add().
> 
> The function will be used by new resize implementation.

Again, duplicating a big hunk of ext4_group_add().  Similar comments apply.

Another question is whether this new resize code is safe from crashes?
One of the original design goals of the resize code is that it would never
leave a filesystem inconsistent if it crashed in the middle.

The way that these patches are looking, it seems that they may not be safe
in this regard, and possibly leave the filesystem in an inconsistent state
if they crash in the middle.  Maybe I'm missing something?

> Signed-off-by: Yongqiang Yang <xiaoqiangnk@gmail.com>
> ---
> fs/ext4/resize.c |   54 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
> 1 files changed, 54 insertions(+), 0 deletions(-)
> 
> diff --git a/fs/ext4/resize.c b/fs/ext4/resize.c
> index 4fcd515..6320baa 100644
> --- a/fs/ext4/resize.c
> +++ b/fs/ext4/resize.c
> @@ -777,6 +777,60 @@ out:
> 	return err;
> }
> 
> +/*
> + * ext4_setup_new_desc() sets up group descriptors specified by @input.
> + *
> + * @handle: journal handle
> + * @sb: super block
> + */
> +static int ext4_setup_new_desc(handle_t *handle, struct super_block *sb,
> +			       struct ext4_new_group_data *input)
> +{
> +	struct ext4_sb_info *sbi = EXT4_SB(sb);
> +	ext4_group_t group;
> +	struct ext4_group_desc *gdp;
> +	struct buffer_head *gdb_bh;
> +	int gdb_off, gdb_num, err = 0;
> +
> +	group = input->group;
> +
> +	gdb_off = group % EXT4_DESC_PER_BLOCK(sb);
> +	gdb_num = group / EXT4_DESC_PER_BLOCK(sb);
> +
> +	/*
> +	 * get_write_access() has been called on gdb_bh by ext4_add_new_desc().
> +	 */
> +	gdb_bh = sbi->s_group_desc[gdb_num];
> +	/* Update group descriptor block for new group */
> +	gdp = (struct ext4_group_desc *)((char *)gdb_bh->b_data +
> +				 gdb_off * EXT4_DESC_SIZE(sb));
> +
> +	memset(gdp, 0, EXT4_DESC_SIZE(sb));
> +	 /* LV FIXME */
> +	memset(gdp, 0, EXT4_DESC_SIZE(sb));
> +	ext4_block_bitmap_set(sb, gdp, input->block_bitmap); /* LV FIXME */
> +	ext4_inode_bitmap_set(sb, gdp, input->inode_bitmap); /* LV FIXME */
> +	ext4_inode_table_set(sb, gdp, input->inode_table); /* LV FIXME */
> +	ext4_free_blks_set(sb, gdp, input->free_blocks_count);
> +	ext4_free_inodes_set(sb, gdp, EXT4_INODES_PER_GROUP(sb));
> +	gdp->bg_flags = cpu_to_le16(EXT4_BG_INODE_ZEROED);
> +	gdp->bg_checksum = ext4_group_desc_csum(sbi, input->group, gdp);
> +
> +	err = ext4_handle_dirty_metadata(handle, NULL, gdb_bh);
> +	if (unlikely(err)) {
> +		ext4_std_error(sb, err);
> +		return err;
> +	}
> +
> +	/*
> +	 * We can allocate memory for mb_alloc based on the new group
> +	 * descriptor
> +	 */
> +	err = ext4_mb_add_groupinfo(sb, group, gdp);
> +
> +	return err;
> +}
> +
> /* Add group descriptor data to an existing or new group descriptor block.
>  * Ensure we handle all possible error conditions _before_ we start modifying
>  * the filesystem, because we cannot abort the transaction and not have it
> -- 
> 1.7.5.1
> 


Cheers, Andreas






^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 05/13] ext4: add a structure which will be used by 64bit-resize interface
  2011-08-11  3:28 ` [PATCH 05/13] ext4: add a structure which will be used by 64bit-resize interface Yongqiang Yang
@ 2011-08-11 10:57   ` Steven Liu
  0 siblings, 0 replies; 25+ messages in thread
From: Steven Liu @ 2011-08-11 10:57 UTC (permalink / raw)
  To: Yongqiang Yang; +Cc: linux-ext4, aedilger, tytso

TO: Yongqiang

2011/8/11 Yongqiang Yang <xiaoqiangnk@gmail.com>:
> This patch adds a structure which will be used by 64bit-resize interface.
> Two functions which allocate and destroy the structure respectively are
> added.
>
> Signed-off-by: Yongqiang Yang <xiaoqiangnk@gmail.com>
> ---
>  fs/ext4/resize.c |   56 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
>  1 files changed, 56 insertions(+), 0 deletions(-)
>
> diff --git a/fs/ext4/resize.c b/fs/ext4/resize.c
> index 14be865..c586e51 100644
> --- a/fs/ext4/resize.c
> +++ b/fs/ext4/resize.c
> @@ -134,6 +134,62 @@ static int verify_group_input(struct super_block *sb,
>        return err;
>  }
>
> +/*
> + * ext4_new_flex_group_data is used by 64bit-resize interface to add a flex
> + * group each time.
> + */
> +struct ext4_new_flex_group_data {
> +       struct ext4_new_group_data *groups;     /* new_group_data for groups
> +                                                  in the flex group */
> +       __u16 *bg_flags;                        /* block group flags of groups
> +                                                  in @groups */
> +       ext4_group_t count;                     /* number of groups in @groups
> +                                                */
> +};
> +
> +/*
> + * alloc_flex_gd() allocates a ext4_new_flex_group_data with size of
> + * @flexbg_size.
> + *
> + * Returns NULL on failure otherwise address of the allocated structure.
> + */
> +static struct ext4_new_flex_group_data *alloc_flex_gd(unsigned long flexbg_size)
> +{
> +       struct ext4_new_flex_group_data *flex_gd;
> +
> +       flex_gd = kmalloc(sizeof(*flex_gd), GFP_NOFS);
> +       if (flex_gd == NULL) {
                printk(KERN_WARNING "not enough memory for flex_gd\n");
> +               goto out3;
             }
> +
> +       flex_gd->count = flexbg_size;
> +
> +       flex_gd->groups = kmalloc(sizeof(struct ext4_new_group_data) *
> +                                 flexbg_size, GFP_NOFS);
> +       if (flex_gd->groups == NULL) {
                printk(KERN_WARNING "not enough memory for flex_gd->groups\n");
> +               goto out2;
            }
> +
> +       flex_gd->bg_flags = kmalloc(flexbg_size * sizeof(__u16), GFP_NOFS);
> +       if (flex_gd->bg_flags == NULL) {
                printk(KERN_WARNING "not enough memory for flex_gd->bg_flags\n");
> +               goto out1;
             }
> +
> +       return flex_gd;
> +
out1:
     kfree(flex_gd->groups);
> +out2:
> +       kfree(flex_gd);
 out3:
> +       return NULL;
> +}
> +
> +void free_flex_gd(struct ext4_new_flex_group_data *flex_gd)
> +{
> +       kfree(flex_gd->bg_flags);
> +       kfree(flex_gd->groups);
> +       kfree(flex_gd);
> +}
> +
>  static struct buffer_head *bclean(handle_t *handle, struct super_block *sb,
>                                  ext4_fsblk_t blk)
>  {



What about adding a message for each kmalloc failure, and arranging the
goto labels as shown above?
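
The staged-allocation pattern suggested above can be tried outside the kernel. The following is a minimal userspace sketch, not the patch itself: it assumes `malloc`/`calloc`/`free` in place of `kmalloc`/`kfree`, `fprintf(stderr, ...)` in place of `printk`, and a hypothetical `struct flex_gd_sketch` in place of `struct ext4_new_flex_group_data`.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical userspace stand-in for ext4_new_flex_group_data. */
struct flex_gd_sketch {
	unsigned long *groups;		/* placeholder for per-group data */
	unsigned short *bg_flags;	/* one flags word per group */
	unsigned long count;		/* number of entries in both arrays */
};

/* Returns NULL on failure, otherwise the allocated structure. */
static struct flex_gd_sketch *alloc_flex_gd_sketch(unsigned long count)
{
	struct flex_gd_sketch *gd;

	gd = malloc(sizeof(*gd));
	if (gd == NULL) {
		fprintf(stderr, "not enough memory for flex_gd\n");
		goto out3;
	}
	gd->count = count;

	gd->groups = calloc(count, sizeof(*gd->groups));
	if (gd->groups == NULL) {
		fprintf(stderr, "not enough memory for flex_gd->groups\n");
		goto out2;
	}

	gd->bg_flags = calloc(count, sizeof(*gd->bg_flags));
	if (gd->bg_flags == NULL) {
		fprintf(stderr, "not enough memory for flex_gd->bg_flags\n");
		goto out1;
	}

	return gd;

	/* Each label undoes only what succeeded before the failing step. */
out1:
	free(gd->groups);
out2:
	free(gd);
out3:
	return NULL;
}

/* Releases everything allocated by alloc_flex_gd_sketch(). */
static void free_flex_gd_sketch(struct flex_gd_sketch *gd)
{
	if (gd == NULL)
		return;
	free(gd->bg_flags);
	free(gd->groups);
	free(gd);
}
```

The labels are ordered so that a later failure falls through the cleanup of every earlier allocation, which keeps a single exit path for all error cases.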

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 03/13] ext4: add a function which sets up a new group desc
  2011-08-11  6:42   ` Andreas Dilger
@ 2011-08-12  8:49     ` Yongqiang Yang
  0 siblings, 0 replies; 25+ messages in thread
From: Yongqiang Yang @ 2011-08-12  8:49 UTC (permalink / raw)
  To: Andreas Dilger; +Cc: linux-ext4 List, Theodore Ts'o

On Thu, Aug 11, 2011 at 2:42 PM, Andreas Dilger <adilger@dilger.ca> wrote:
> On 2011-08-10, at 9:28 PM, Yongqiang Yang wrote:
>> This patch adds a function named ext4_setup_new_desc() which sets
>> up a new group descriptor and whose code is copied from ext4_group_add().
>>
>> The function will be used by new resize implementation.
>
> Again, duplicating a big hunk of ext4_group_add().  Similar comments apply.
>
> Another question is whether this new resize code is safe from crashes?
> One of the original design goals of the resize code is that it would never
> leave a filesystem inconsistent if it crashed in the middle.
>
> The way that these patches are looking, it seems that they may not be safe
> in this regard, and possibly leave the filesystem in an inconsistent state
> if they crash in the middle.  Maybe I'm missing something?
If a journal is used, it can bring the crashed fs back to a consistent
state, just as with the old resize.  The logic of the new resize is the
same as the old resize, except that it adds multiple groups each time.

I will check it further.

Thanks!
Yongqiang.
>
>> Signed-off-by: Yongqiang Yang <xiaoqiangnk@gmail.com>
>> ---
>> fs/ext4/resize.c |   54 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
>> 1 files changed, 54 insertions(+), 0 deletions(-)
>>
>> diff --git a/fs/ext4/resize.c b/fs/ext4/resize.c
>> index 4fcd515..6320baa 100644
>> --- a/fs/ext4/resize.c
>> +++ b/fs/ext4/resize.c
>> @@ -777,6 +777,60 @@ out:
>>       return err;
>> }
>>
>> +/*
>> + * ext4_setup_new_desc() sets up group descriptors specified by @input.
>> + *
>> + * @handle: journal handle
>> + * @sb: super block
>> + */
>> +static int ext4_setup_new_desc(handle_t *handle, struct super_block *sb,
>> +                            struct ext4_new_group_data *input)
>> +{
>> +     struct ext4_sb_info *sbi = EXT4_SB(sb);
>> +     ext4_group_t group;
>> +     struct ext4_group_desc *gdp;
>> +     struct buffer_head *gdb_bh;
>> +     int gdb_off, gdb_num, err = 0;
>> +
>> +     group = input->group;
>> +
>> +     gdb_off = group % EXT4_DESC_PER_BLOCK(sb);
>> +     gdb_num = group / EXT4_DESC_PER_BLOCK(sb);
>> +
>> +     /*
>> +      * get_write_access() has been called on gdb_bh by ext4_add_new_desc().
>> +      */
>> +     gdb_bh = sbi->s_group_desc[gdb_num];
>> +     /* Update group descriptor block for new group */
>> +     gdp = (struct ext4_group_desc *)((char *)gdb_bh->b_data +
>> +                              gdb_off * EXT4_DESC_SIZE(sb));
>> +
>> +     memset(gdp, 0, EXT4_DESC_SIZE(sb));
>> +      /* LV FIXME */
>> +     memset(gdp, 0, EXT4_DESC_SIZE(sb));
>> +     ext4_block_bitmap_set(sb, gdp, input->block_bitmap); /* LV FIXME */
>> +     ext4_inode_bitmap_set(sb, gdp, input->inode_bitmap); /* LV FIXME */
>> +     ext4_inode_table_set(sb, gdp, input->inode_table); /* LV FIXME */
>> +     ext4_free_blks_set(sb, gdp, input->free_blocks_count);
>> +     ext4_free_inodes_set(sb, gdp, EXT4_INODES_PER_GROUP(sb));
>> +     gdp->bg_flags = cpu_to_le16(EXT4_BG_INODE_ZEROED);
>> +     gdp->bg_checksum = ext4_group_desc_csum(sbi, input->group, gdp);
>> +
>> +     err = ext4_handle_dirty_metadata(handle, NULL, gdb_bh);
>> +     if (unlikely(err)) {
>> +             ext4_std_error(sb, err);
>> +             return err;
>> +     }
>> +
>> +     /*
>> +      * We can allocate memory for mb_alloc based on the new group
>> +      * descriptor
>> +      */
>> +     err = ext4_mb_add_groupinfo(sb, group, gdp);
>> +
>> +     return err;
>> +}
>> +
>> /* Add group descriptor data to an existing or new group descriptor block.
>>  * Ensure we handle all possible error conditions _before_ we start modifying
>>  * the filesystem, because we cannot abort the transaction and not have it
>> --
>> 1.7.5.1
>>
>
>
> Cheers, Andreas
>
>
>
>
>
>



-- 
Best Wishes
Yongqiang Yang

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: New resize interface implementation
       [not found] ` <CAKgsxVQAwEetdyGcOciA8+wi_eLA7Fmq60kfDvEK1WzgYxdUMQ@mail.gmail.com>
@ 2011-08-17  7:28   ` Yongqiang Yang
       [not found]     ` <CAKgsxVSfrCt6tyxDdczLYFu0YeGXfuEAA=o_5T4TOkeNXSG8iA@mail.gmail.com>
  0 siblings, 1 reply; 25+ messages in thread
From: Yongqiang Yang @ 2011-08-17  7:28 UTC (permalink / raw)
  To: Justin Maggard; +Cc: Ext4 Developers List

On Tue, Aug 16, 2011 at 8:22 AM, Justin Maggard <jmaggard10@gmail.com> wrote:
> Hi,
Hi Justin,
> I saw your patch, and I am excited to see online resize support of very
> large filesystems.  I was hoping you could answer a few additional questions
> for me.
My pleasure.
> Does this patch set combined with your e2fsprogs patch add 64-bit resize
> support now, or does it just make it easier to add later?
YES. The e2fsprogs patch is ready too.

> If I am making a 64-bit ext4 filesystem today (20TB), and hoping to resize
> it next year to 30TB what features should I set?  In my searching it sounded
> like maybe I would need meta_bg, but it is not compatible with the default
> resize_inode.
You can read about meta_bg at http://linuxsoftware.co.nz/wiki/ext4.
For now, ext4 with meta_bg does not support resize; that is on ext4's
TODO list.  The feature you should set is resize_inode.

> Also, if I am making a <16TB filesystem today, should I turn on the 64-bit
> flag in order to expand to >16TB in the future?
Yes.  You should turn on the 64-bit feature.  If block numbers are
32 bits wide, the largest size the fs can support is
2^32 * 2^(log2 blocksize).  Taking a 4K blocksize as an example, the
maximum size of a filesystem is 2^32 * 2^12 = 2^44 = 16TB.
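
The arithmetic above can be checked with a few lines of C. This is a sketch only; `max_fs_bytes()` is a hypothetical helper, and "TB" is taken to mean 2^40 bytes, as the 2^44 figure implies.

```c
#include <stdint.h>

/*
 * Largest filesystem size in bytes when block numbers are
 * @block_nr_bits wide and the block size is 2^@log_block_size bytes.
 * Returns 0 if the result would not fit in 64 bits.
 */
static uint64_t max_fs_bytes(unsigned int block_nr_bits,
			     unsigned int log_block_size)
{
	if (block_nr_bits + log_block_size >= 64)
		return 0;
	return (uint64_t)1 << (block_nr_bits + log_block_size);
}
```

For 32-bit block numbers and 4K blocks, `max_fs_bytes(32, 12)` gives 2^44 bytes, i.e. 16TB; with 1K blocks the limit drops to 4TB.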

> Thank you for your time,
You are welcome.
> -Justin



-- 
Best Wishes
Yongqiang Yang

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 01/13] ext4: add a function which extends a group without checking parameters
  2011-08-11  6:27     ` Yongqiang Yang
@ 2011-08-21 17:07       ` Ted Ts'o
  2011-08-22  2:51         ` Andreas Dilger
  0 siblings, 1 reply; 25+ messages in thread
From: Ted Ts'o @ 2011-08-21 17:07 UTC (permalink / raw)
  To: Yongqiang Yang; +Cc: Andreas Dilger, linux-ext4 List

On Thu, Aug 11, 2011 at 02:27:41PM +0800, Yongqiang Yang wrote:
> Yes, this needs some feedback.  I thought the old resize interface
> would be removed once the new resize interface went upstream, so I
> did not touch the old interface.  If we are going to keep the old
> interface, I can write patches that make the old resize use the
> common code instead.

Yes, the backwards compatibility requirements for the kernel mean
that we have to keep the old ioctls working for at least 2-3 years...

   	   	    			    - Ted


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 01/13] ext4: add a function which extends a group without checking parameters
  2011-08-21 17:07       ` Ted Ts'o
@ 2011-08-22  2:51         ` Andreas Dilger
  0 siblings, 0 replies; 25+ messages in thread
From: Andreas Dilger @ 2011-08-22  2:51 UTC (permalink / raw)
  To: Ted Ts'o; +Cc: Yongqiang Yang, Andreas Dilger, linux-ext4 List

On 2011-08-21, at 11:07 AM, Ted Ts'o <tytso@mit.edu> wrote:
> On Thu, Aug 11, 2011 at 02:27:41PM +0800, Yongqiang Yang wrote:
>> Yes, this needs some feedback.  I thought the old resize interface
>> would be removed once the new resize interface went upstream, so I
>> did not touch the old interface.  If we are going to keep the old
>> interface, I can write patches that make the old resize use the
>> common code instead.
> 
> Yes, the backwards compatibility requirements for the kernel mean
> that we have to keep the old ioctls working for at least 2-3 years...

I'd be happy with just wiring up the old ioctls to the new code, and
having the kernel ignore the passed parameters and just do its own
thing to resize to the end of the group.

Cheers, Andreas

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: New resize interface implementation
       [not found]     ` <CAKgsxVSfrCt6tyxDdczLYFu0YeGXfuEAA=o_5T4TOkeNXSG8iA@mail.gmail.com>
@ 2011-08-26 20:06       ` Justin Maggard
  2011-09-01 12:30         ` Amir Goldstein
  0 siblings, 1 reply; 25+ messages in thread
From: Justin Maggard @ 2011-08-26 20:06 UTC (permalink / raw)
  To: ext4 development

On Wed, Aug 17, 2011 at 12:28 AM, Yongqiang Yang <xiaoqiangnk@gmail.com> wrote:
> On Tue, Aug 16, 2011 at 8:22 AM, Justin Maggard <jmaggard10@gmail.com> wrote:
> > Does this patch set combined with your e2fsprogs patch add 64-bit resize
> > support now, or does it just make it easier to add later?
> YES. The e2fsprogs patch is ready too.

So I finally got around to gathering the hardware and patching all the
software components to try out this 64-bit expansion code.  The first
thing I noticed is that there is still a check to make sure the block
count fits in 32 bits.  However, I can get around it by specifying a size
string (something like "20T") rather than a block count, in which case
it will actually try the expansion.
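
To make the 32-bit check concrete, here is a small sketch of the block-count arithmetic.  It assumes a 4K block size and reads "T" as 2^40 bytes; neither is stated in this message, and both helper names are hypothetical.

```c
#include <stdint.h>

/* Number of whole blocks in @size_bytes for a block size of
 * 2^@log_block_size bytes. */
static uint64_t blocks_for_size(uint64_t size_bytes,
				unsigned int log_block_size)
{
	return size_bytes >> log_block_size;
}

/* Nonzero when @blocks can be stored in a 32-bit block count. */
static int fits_in_32_bits(uint64_t blocks)
{
	return blocks <= UINT32_MAX;
}
```

Under those assumptions a 20T target is 5368709120 blocks, which is above the 32-bit limit of 4294967295, so a 32-bit block-count check has to reject it.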

> > If I am making a 64-bit ext4 filesystem today (20TB), and hoping to resize
> > it next year to 30TB what features should I set?  In my searching it sounded
> > like maybe I would need meta_bg, but it is not compatible with the default
> > resize_inode.
> You can read about meta_bg at http://linuxsoftware.co.nz/wiki/ext4.
> For now, ext4 with meta_bg does not support resize; that is on ext4's
> TODO list.  The feature you should set is resize_inode.
>
> > Also, if I am making a <16TB filesystem today, should I turn on the 64-bit
> > flag in order to expand to >16TB in the future?
> Yes.  You should turn on the 64-bit feature.  If block numbers are
> 32 bits wide, the largest size the fs can support is
> 2^32 * 2^(log2 blocksize).  Taking a 4K blocksize as an example, the
> maximum size of a filesystem is 2^32 * 2^12 = 2^44 = 16TB.

I think this is where the real problem is with this 64-bit resize
support.  With the 64-bit flag set, the most I can expand online is
just 8TB over the life of the filesystem, because my reserved GDT
blocks get used up twice as fast as with a 32-bit filesystem.  Is
there any way around this?

-Justin

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: New resize interface implementation
  2011-08-26 20:06       ` Justin Maggard
@ 2011-09-01 12:30         ` Amir Goldstein
  0 siblings, 0 replies; 25+ messages in thread
From: Amir Goldstein @ 2011-09-01 12:30 UTC (permalink / raw)
  To: Justin Maggard; +Cc: ext4 development, Yongqiang Yang

On Fri, Aug 26, 2011 at 10:06 PM, Justin Maggard <jmaggard10@gmail.com> wrote:
> On Wed, Aug 17, 2011 at 12:28 AM, Yongqiang Yang <xiaoqiangnk@gmail.com> wrote:
>> On Tue, Aug 16, 2011 at 8:22 AM, Justin Maggard <jmaggard10@gmail.com> wrote:
>> > Does this patch set combined with your e2fsprogs patch add 64-bit resize
>> > support now, or does it just make it easier to add later?
>> YES. The e2fsprogs patch is ready too.
>
> So I finally got around to gathering the hardware and patching all the
> software components to try out this 64-bit expansion code.  The first
> thing I noticed is that there is still a check to make sure the block
> count fits in 32 bits.  However, I can get around it by specifying a size
> string (something like "20T") rather than a block count, in which case
> it will actually try the expansion.

The fact that you can get around this check is a bug.
As you have observed, things won't be pretty if you try to resize over 16TB
using resize inode and I don't think it is intended to work.


>
>> > If I am making a 64-bit ext4 filesystem today (20TB), and hoping to resize
>> > it next year to 30TB what features should I set?  In my searching it sounded
>> > like maybe I would need meta_bg, but it is not compatible with the default
>> > resize_inode.
>> You can read about meta_bg at http://linuxsoftware.co.nz/wiki/ext4.
>> For now, ext4 with meta_bg does not support resize; that is on ext4's
>> TODO list.  The feature you should set is resize_inode.
>>
>> > Also, if I am making a <16TB filesystem today, should I turn on the 64-bit
>> > flag in order to expand to >16TB in the future?
>> Yes.  You should turn on the 64-bit feature.  If block numbers are
>> 32 bits wide, the largest size the fs can support is
>> 2^32 * 2^(log2 blocksize).  Taking a 4K blocksize as an example, the
>> maximum size of a filesystem is 2^32 * 2^12 = 2^44 = 16TB.
>
> I think this is where the real problem is with this 64-bit resize
> support.  With the 64-bit flag set, the most I can expand online is
> just 8TB over the life of the filesystem, because my reserved GDT
> blocks get used up twice as fast as with a 32-bit filesystem.  Is
> there any way around this?

The maximum number of reserved GDT blocks is EXT2_ADDR_PER_BLOCK(sb),
which is 1024 by default, just enough for expanding the 64-bit fs by 8TB,
as you have observed.
But also, the resize inode cannot store 64-bit block addresses of GDT backups
beyond 16TB, so your fs (the resize inode in particular) is most likely corrupted.
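
The 8TB figure can be reproduced with a short calculation.  The sketch below assumes the usual defaults, none of which are spelled out in this message: 4K blocks, one block bitmap block per group (so 8 * blocksize blocks per group), and group descriptors of 64 bytes with the 64-bit feature versus 32 bytes without it.  `max_growth_bytes()` is a hypothetical helper.

```c
#include <stdint.h>

/*
 * Upper bound on how far a filesystem can grow using only its reserved
 * GDT blocks: every reserved block holds block_size / desc_size group
 * descriptors, and every descriptor covers one block group.
 */
static uint64_t max_growth_bytes(uint32_t block_size, uint32_t desc_size,
				 uint32_t reserved_gdt_blocks)
{
	uint64_t descs_per_block = block_size / desc_size;
	/* One bitmap block tracks 8 * block_size blocks. */
	uint64_t blocks_per_group = (uint64_t)block_size * 8;
	uint64_t group_bytes = blocks_per_group * block_size;

	return reserved_gdt_blocks * descs_per_block * group_bytes;
}
```

With 1024 reserved GDT blocks, 64-byte descriptors give 1024 * 64 groups of 128MB each, i.e. 8TB, while 32-byte descriptors give 16TB, which matches the "twice as fast" observation quoted above.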

There is no point in getting around these issues.
We should get on top of them and implement online resize of meta_bg.
If your intention is to create a 20TB fs today and resize it in the
(not very near) future, then you should probably use meta_bg instead
of resize_inode.

Amir.

^ permalink raw reply	[flat|nested] 25+ messages in thread

end of thread, other threads:[~2011-09-01 12:30 UTC | newest]

Thread overview: 25+ messages
2011-08-11  3:28 New resize interface implementation Yongqiang Yang
2011-08-11  3:28 ` [PATCH 01/13] ext4: add a function which extends a group without checking parameters Yongqiang Yang
2011-08-11  5:47   ` Andreas Dilger
2011-08-11  6:27     ` Yongqiang Yang
2011-08-21 17:07       ` Ted Ts'o
2011-08-22  2:51         ` Andreas Dilger
2011-08-11  3:28 ` [PATCH 02/13] ext4: add a function which adds a new desc to a fs Yongqiang Yang
2011-08-11  5:49   ` Andreas Dilger
2011-08-11  3:28 ` [PATCH 03/13] ext4: add a function which sets up a new group desc Yongqiang Yang
2011-08-11  6:42   ` Andreas Dilger
2011-08-12  8:49     ` Yongqiang Yang
2011-08-11  3:28 ` [PATCH 04/13] ext4: add a function which updates super block Yongqiang Yang
2011-08-11  3:28 ` [PATCH 05/13] ext4: add a structure which will be used by 64bit-resize interface Yongqiang Yang
2011-08-11 10:57   ` Steven Liu
2011-08-11  3:28 ` [PATCH 06/13] ext4: add a function which sets up group blocks of a flex groups Yongqiang Yang
2011-08-11  3:28 ` [PATCH 07/13] ext4: add a function which adds several group descriptors Yongqiang Yang
2011-08-11  3:28 ` [PATCH 08/13] ext4: add a function which sets up a flex groups each time Yongqiang Yang
2011-08-11  3:28 ` [PATCH 09/13] ext4: enable ext4_update_super() to handle a flex groups Yongqiang Yang
2011-08-11  3:28 ` [PATCH 10/13] ext4: pass verify_reserved_gdb() the number of group decriptors Yongqiang Yang
2011-08-11  3:28 ` [PATCH 11/13] ext4: add a new function which allocates bitmaps and inode tables Yongqiang Yang
2011-08-11  3:28 ` [PATCH 12/13] ext4: add a new function which adds a flex group to a fs Yongqiang Yang
2011-08-11  3:28 ` [PATCH 13/13] ext4: add new online resize interface Yongqiang Yang
     [not found] ` <CAKgsxVQAwEetdyGcOciA8+wi_eLA7Fmq60kfDvEK1WzgYxdUMQ@mail.gmail.com>
2011-08-17  7:28   ` New resize interface implementation Yongqiang Yang
     [not found]     ` <CAKgsxVSfrCt6tyxDdczLYFu0YeGXfuEAA=o_5T4TOkeNXSG8iA@mail.gmail.com>
2011-08-26 20:06       ` Justin Maggard
2011-09-01 12:30         ` Amir Goldstein
