All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 1/2] e2fsprogs: add EXT2_FLAG_BG_WAS_TRIMMED to optimize fstrim
@ 2020-05-27 14:08 Wang Shilong
  2020-05-27 14:08 ` [PATCH 2/2] tune2fs: add clear_was_trimmed option Wang Shilong
  2020-05-27 14:08 ` [PATCH v2] ext4: introduce EXT4_BG_WAS_TRIMMED to optimize trim Wang Shilong
  0 siblings, 2 replies; 4+ messages in thread
From: Wang Shilong @ 2020-05-27 14:08 UTC (permalink / raw)
  To: linux-ext4
  Cc: Wang Shilong, Shuichi Ihara, Andreas Dilger, Wang Shilong, Lukas Czerner

From: Wang Shilong <wshilong@ddn.com>

1)mkfs will set this flag if discard is triggered.
2)whenver we free blocks from block group, this flag will be cleared.
3)make dumpe2fs aware of new flag.

Cc: Shuichi Ihara <sihara@ddn.com>
Cc: Andreas Dilger <adilger@dilger.ca>
Cc: Wang Shilong <wangshilong1991@gmail.com>
Cc: Lukas Czerner <lczerner@redhat.com>
Signed-off-by: Wang Shilong <wshilong@ddn.com>
---
 lib/ext2fs/alloc_stats.c  | 8 ++++++--
 lib/ext2fs/alloc_tables.c | 2 ++
 lib/ext2fs/ext2_fs.h      | 1 +
 lib/ext2fs/ext2fs.h       | 1 +
 misc/dumpe2fs.c           | 2 ++
 misc/mke2fs.c             | 3 +++
 6 files changed, 15 insertions(+), 2 deletions(-)

diff --git a/lib/ext2fs/alloc_stats.c b/lib/ext2fs/alloc_stats.c
index 3949f618..2a6e0557 100644
--- a/lib/ext2fs/alloc_stats.c
+++ b/lib/ext2fs/alloc_stats.c
@@ -69,10 +69,12 @@ void ext2fs_block_alloc_stats2(ext2_filsys fs, blk64_t blk, int inuse)
 #endif
 		return;
 	}
-	if (inuse > 0)
+	if (inuse > 0) {
 		ext2fs_mark_block_bitmap2(fs->block_map, blk);
-	else
+	} else {
 		ext2fs_unmark_block_bitmap2(fs->block_map, blk);
+		ext2fs_bg_flags_clear(fs, group, EXT2_BG_WAS_TRIMMED);
+	}
 	ext2fs_bg_free_blocks_count_set(fs, group, ext2fs_bg_free_blocks_count(fs, group) - inuse);
 	ext2fs_bg_flags_clear(fs, group, EXT2_BG_BLOCK_UNINIT);
 	ext2fs_group_desc_csum_set(fs, group);
@@ -138,6 +140,8 @@ void ext2fs_block_alloc_stats_range(ext2_filsys fs, blk64_t blk,
 			ext2fs_bg_free_blocks_count(fs, group) -
 			inuse*n/EXT2FS_CLUSTER_RATIO(fs));
 		ext2fs_bg_flags_clear(fs, group, EXT2_BG_BLOCK_UNINIT);
+		if (inuse < 0)
+			ext2fs_bg_flags_clear(fs, group, EXT2_BG_WAS_TRIMMED);
 		ext2fs_group_desc_csum_set(fs, group);
 		ext2fs_free_blocks_count_add(fs->super, -inuse * (blk64_t) n);
 		blk += n;
diff --git a/lib/ext2fs/alloc_tables.c b/lib/ext2fs/alloc_tables.c
index 971a6ceb..17087890 100644
--- a/lib/ext2fs/alloc_tables.c
+++ b/lib/ext2fs/alloc_tables.c
@@ -250,6 +250,8 @@ errcode_t ext2fs_allocate_group_table(ext2_filsys fs, dgrp_t group,
 		}
 		ext2fs_inode_table_loc_set(fs, group, new_blk);
 	}
+	if (fs->flags & EXT2_FLAG_BG_WAS_TRIMMED)
+		ext2fs_bg_flags_set(fs, group, EXT2_BG_WAS_TRIMMED);
 	ext2fs_group_desc_csum_set(fs, group);
 	return 0;
 }
diff --git a/lib/ext2fs/ext2_fs.h b/lib/ext2fs/ext2_fs.h
index 6c20ea77..86d6f438 100644
--- a/lib/ext2fs/ext2_fs.h
+++ b/lib/ext2fs/ext2_fs.h
@@ -211,6 +211,7 @@ struct ext4_group_desc
 #define EXT2_BG_INODE_UNINIT	0x0001 /* Inode table/bitmap not initialized */
 #define EXT2_BG_BLOCK_UNINIT	0x0002 /* Block bitmap not initialized */
 #define EXT2_BG_INODE_ZEROED	0x0004 /* On-disk itable initialized to zero */
+#define EXT2_BG_WAS_TRIMMED	0x0008	/* Block group was trimmed */
 
 /*
  * Data structures used by the directory indexing feature
diff --git a/lib/ext2fs/ext2fs.h b/lib/ext2fs/ext2fs.h
index 69c8a3ff..3dd99430 100644
--- a/lib/ext2fs/ext2fs.h
+++ b/lib/ext2fs/ext2fs.h
@@ -206,6 +206,7 @@ typedef struct ext2_file *ext2_file_t;
 #define EXT2_FLAG_IGNORE_SB_ERRORS	0x800000
 #define EXT2_FLAG_BBITMAP_TAIL_PROBLEM	0x1000000
 #define EXT2_FLAG_IBITMAP_TAIL_PROBLEM	0x2000000
+#define EXT2_FLAG_BG_WAS_TRIMMED	0x4000000
 
 /*
  * Special flag in the ext2 inode i_flag field that means that this is
diff --git a/misc/dumpe2fs.c b/misc/dumpe2fs.c
index d295ba4d..a305ec9b 100644
--- a/misc/dumpe2fs.c
+++ b/misc/dumpe2fs.c
@@ -131,6 +131,8 @@ static void print_bg_opts(ext2_filsys fs, dgrp_t i)
  		     &first);
 	print_bg_opt(bg_flags, EXT2_BG_INODE_ZEROED, "ITABLE_ZEROED",
  		     &first);
+	print_bg_opt(bg_flags, EXT2_BG_WAS_TRIMMED, "WAS_TRIMMED",
+		     &first);
 	if (!first)
 		fputc(']', stdout);
 	fputc('\n', stdout);
diff --git a/misc/mke2fs.c b/misc/mke2fs.c
index c90dcf0e..07ee620e 100644
--- a/misc/mke2fs.c
+++ b/misc/mke2fs.c
@@ -403,6 +403,7 @@ static errcode_t packed_allocate_tables(ext2_filsys fs)
 		ext2fs_block_alloc_stats_range(fs, goal,
 					       fs->inode_blocks_per_group, +1);
 		ext2fs_inode_table_loc_set(fs, i, goal);
+		ext2fs_bg_flags_set(fs, i, EXT2_BG_WAS_TRIMMED);
 		ext2fs_group_desc_csum_set(fs, i);
 	}
 	return 0;
@@ -3037,6 +3038,8 @@ int main (int argc, char *argv[])
 	/* Can't undo discard ... */
 	if (!noaction && discard && dev_size && (io_ptr != undo_io_manager)) {
 		retval = mke2fs_discard_device(fs);
+		if (!retval)
+			fs->flags |= EXT2_FLAG_BG_WAS_TRIMMED;
 		if (!retval && io_channel_discard_zeroes_data(fs->io)) {
 			if (verbose)
 				printf("%s",
-- 
2.25.4


^ permalink raw reply related	[flat|nested] 4+ messages in thread

* [PATCH 2/2] tune2fs: add clear_was_trimmed option
  2020-05-27 14:08 [PATCH 1/2] e2fsprogs: add EXT2_FLAG_BG_WAS_TRIMMED to optimize fstrim Wang Shilong
@ 2020-05-27 14:08 ` Wang Shilong
  2020-05-27 14:08 ` [PATCH v2] ext4: introduce EXT4_BG_WAS_TRIMMED to optimize trim Wang Shilong
  1 sibling, 0 replies; 4+ messages in thread
From: Wang Shilong @ 2020-05-27 14:08 UTC (permalink / raw)
  To: linux-ext4
  Cc: Wang Shilong, Shuichi Ihara, Andreas Dilger, Wang Shilong, Lukas Czerner

From: Wang Shilong <wshilong@ddn.com>

It might be possible that admin want an option for clear
existed WAS_TRIMMED flag to force fstrim next run time.

Cc: Shuichi Ihara <sihara@ddn.com>
Cc: Andreas Dilger <adilger@dilger.ca>
Cc: Wang Shilong <wangshilong1991@gmail.com>
Cc: Lukas Czerner <lczerner@redhat.com>
Signed-off-by: Wang Shilong <wshilong@ddn.com>
---
 misc/tune2fs.8.in |  5 +++++
 misc/tune2fs.c    | 30 +++++++++++++++++++++++++++++-
 2 files changed, 34 insertions(+), 1 deletion(-)

diff --git a/misc/tune2fs.8.in b/misc/tune2fs.8.in
index 3cf1f5ed..b8025949 100644
--- a/misc/tune2fs.8.in
+++ b/misc/tune2fs.8.in
@@ -249,6 +249,11 @@ mounted using experimental kernel code, such as the ext4dev filesystem.
 .B ^test_fs
 Clear the test_fs flag, indicating the filesystem should only be mounted
 using production-level filesystem code.
+.TP
+.B clear_was_trimmed
+Clear block groups' WAS_TRIMMED flag, this will force fstrim to run every
+block group next time.
+.TP
 .RE
 .TP
 .B \-f
diff --git a/misc/tune2fs.c b/misc/tune2fs.c
index 314cc0d0..3e9814e3 100644
--- a/misc/tune2fs.c
+++ b/misc/tune2fs.c
@@ -101,6 +101,7 @@ static int rewrite_checksums;
 static int feature_64bit;
 static int fsck_requested;
 static char *undo_file;
+static int clear_was_trimmed;
 
 int journal_size, journal_flags;
 char *journal_device;
@@ -949,6 +950,26 @@ static void rewrite_metadata_checksums(ext2_filsys fs, unsigned int flags)
 	ext2fs_mark_super_dirty(fs);
 }
 
+static void clear_bg_was_trimmed(ext2_filsys fs)
+{
+	dgrp_t i;
+	int dirty = 0;
+
+	for (i = 0; i < fs->group_desc_count; i++) {
+		if (ext2fs_bg_flags_test(fs, i, EXT2_BG_WAS_TRIMMED)) {
+			ext2fs_bg_flags_clear(fs, i, EXT2_BG_WAS_TRIMMED);
+			ext2fs_group_desc_csum_set(fs, i);
+			dirty = 1;
+		}
+	}
+
+	if (dirty) {
+		fs->flags &= ~EXT2_FLAG_SUPER_ONLY;
+		ext2fs_mark_bb_dirty(fs);
+		ext2fs_mark_super_dirty(fs);
+	}
+}
+
 static void enable_uninit_bg(ext2_filsys fs)
 {
 	struct ext2_group_desc *gd;
@@ -2207,6 +2228,9 @@ static int parse_extended_opts(ext2_filsys fs, const char *opts)
 				continue;
 			}
 			ext_mount_opts = strdup(arg);
+		}  else if (!strcmp(token, "clear_was_trimmed") ||
+			    !strcmp(token, "clear_was-trimmed")) {
+			clear_was_trimmed = 1;
 		} else
 			r_usage++;
 	}
@@ -2224,7 +2248,8 @@ static int parse_extended_opts(ext2_filsys fs, const char *opts)
 			"\tstripe_width=<RAID stride*data disks in blocks>\n"
 			"\tforce_fsck\n"
 			"\ttest_fs\n"
-			"\t^test_fs\n"));
+			"\t^test_fs\n"
+			"\tclear_was_trimmed\n"));
 		free(buf);
 		return 1;
 	}
@@ -3361,6 +3386,9 @@ _("Warning: The journal is dirty. You may wish to replay the journal like:\n\n"
 		}
 	}
 
+	if (clear_was_trimmed)
+		clear_bg_was_trimmed(fs);
+
 	if (rewrite_checksums)
 		rewrite_metadata_checksums(fs, rewrite_checksums);
 
-- 
2.25.4


^ permalink raw reply related	[flat|nested] 4+ messages in thread

* [PATCH v2] ext4: introduce EXT4_BG_WAS_TRIMMED to optimize trim
  2020-05-27 14:08 [PATCH 1/2] e2fsprogs: add EXT2_FLAG_BG_WAS_TRIMMED to optimize fstrim Wang Shilong
  2020-05-27 14:08 ` [PATCH 2/2] tune2fs: add clear_was_trimmed option Wang Shilong
@ 2020-05-27 14:08 ` Wang Shilong
  2020-05-27 19:29   ` Andreas Dilger
  1 sibling, 1 reply; 4+ messages in thread
From: Wang Shilong @ 2020-05-27 14:08 UTC (permalink / raw)
  To: linux-ext4
  Cc: Wang Shilong, Shuichi Ihara, Andreas Dilger, Wang Shilong, Lukas Czerner

From: Wang Shilong <wshilong@ddn.com>

Currently WAS_TRIMMED flag is not persistent, whenever filesystem was
remounted, fstrim need walk all block groups again, the problem with
this is FSTRIM could be slow on very large LUN SSD based filesystem.

To avoid this kind of problem, we introduce a block group flag
EXT4_BG_WAS_TRIMMED, the side effect of this is we need introduce
extra one block group dirty write after trimming block group.

And When clearing TRIMMED flag, block group will be journalled
anyway, so it won't introduce any overhead.

Cc: Shuichi Ihara <sihara@ddn.com>
Cc: Andreas Dilger <adilger@dilger.ca>
Cc: Wang Shilong <wangshilong1991@gmail.com>
Cc: Lukas Czerner <lczerner@redhat.com>
Signed-off-by: Wang Shilong <wshilong@ddn.com>
---
v1->v2:
call ext4_journal_get_write_access() before modify buffer.
---
 fs/ext4/ext4.h      | 18 +++++++-------
 fs/ext4/ext4_jbd2.h |  3 ++-
 fs/ext4/mballoc.c   | 59 +++++++++++++++++++++++++++++++++------------
 3 files changed, 55 insertions(+), 25 deletions(-)

diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
index ad2dbf6e4924..23c2dc529a28 100644
--- a/fs/ext4/ext4.h
+++ b/fs/ext4/ext4.h
@@ -357,6 +357,7 @@ struct flex_groups {
 #define EXT4_BG_INODE_UNINIT	0x0001 /* Inode table/bitmap not in use */
 #define EXT4_BG_BLOCK_UNINIT	0x0002 /* Block bitmap not in use */
 #define EXT4_BG_INODE_ZEROED	0x0004 /* On-disk itable initialized to zero */
+#define EXT4_BG_WAS_TRIMMED	0x0008 /* block group was trimmed */
 
 /*
  * Macro-instructions used to manage group descriptors
@@ -3112,9 +3113,8 @@ struct ext4_group_info {
 };
 
 #define EXT4_GROUP_INFO_NEED_INIT_BIT		0
-#define EXT4_GROUP_INFO_WAS_TRIMMED_BIT		1
-#define EXT4_GROUP_INFO_BBITMAP_CORRUPT_BIT	2
-#define EXT4_GROUP_INFO_IBITMAP_CORRUPT_BIT	3
+#define EXT4_GROUP_INFO_BBITMAP_CORRUPT_BIT	1
+#define EXT4_GROUP_INFO_IBITMAP_CORRUPT_BIT	2
 #define EXT4_GROUP_INFO_BBITMAP_CORRUPT		\
 	(1 << EXT4_GROUP_INFO_BBITMAP_CORRUPT_BIT)
 #define EXT4_GROUP_INFO_IBITMAP_CORRUPT		\
@@ -3127,12 +3127,12 @@ struct ext4_group_info {
 #define EXT4_MB_GRP_IBITMAP_CORRUPT(grp)	\
 	(test_bit(EXT4_GROUP_INFO_IBITMAP_CORRUPT_BIT, &((grp)->bb_state)))
 
-#define EXT4_MB_GRP_WAS_TRIMMED(grp)	\
-	(test_bit(EXT4_GROUP_INFO_WAS_TRIMMED_BIT, &((grp)->bb_state)))
-#define EXT4_MB_GRP_SET_TRIMMED(grp)	\
-	(set_bit(EXT4_GROUP_INFO_WAS_TRIMMED_BIT, &((grp)->bb_state)))
-#define EXT4_MB_GRP_CLEAR_TRIMMED(grp)	\
-	(clear_bit(EXT4_GROUP_INFO_WAS_TRIMMED_BIT, &((grp)->bb_state)))
+#define EXT4_MB_GDP_WAS_TRIMMED(gdp)	\
+	(gdp->bg_flags & cpu_to_le16(EXT4_BG_WAS_TRIMMED))
+#define EXT4_MB_GDP_SET_TRIMMED(gdp)	\
+	(gdp->bg_flags |= cpu_to_le16(EXT4_BG_WAS_TRIMMED))
+#define EXT4_MB_GDP_CLEAR_TRIMMED(gdp)	\
+	(gdp->bg_flags &= ~cpu_to_le16(EXT4_BG_WAS_TRIMMED))
 
 #define EXT4_MAX_CONTENTION		8
 #define EXT4_CONTENTION_THRESHOLD	2
diff --git a/fs/ext4/ext4_jbd2.h b/fs/ext4/ext4_jbd2.h
index 4b9002f0e84c..4094a5b247f7 100644
--- a/fs/ext4/ext4_jbd2.h
+++ b/fs/ext4/ext4_jbd2.h
@@ -123,7 +123,8 @@
 #define EXT4_HT_MOVE_EXTENTS     9
 #define EXT4_HT_XATTR           10
 #define EXT4_HT_EXT_CONVERT     11
-#define EXT4_HT_MAX             12
+#define EXT4_HT_FS_TRIM		12
+#define EXT4_HT_MAX             13
 
 /**
  *   struct ext4_journal_cb_entry - Base structure for callback information.
diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c
index 30d5d97548c4..a2b78a96da16 100644
--- a/fs/ext4/mballoc.c
+++ b/fs/ext4/mballoc.c
@@ -2829,15 +2829,6 @@ static void ext4_free_data_in_buddy(struct super_block *sb,
 	rb_erase(&entry->efd_node, &(db->bb_free_root));
 	mb_free_blocks(NULL, &e4b, entry->efd_start_cluster, entry->efd_count);
 
-	/*
-	 * Clear the trimmed flag for the group so that the next
-	 * ext4_trim_fs can trim it.
-	 * If the volume is mounted with -o discard, online discard
-	 * is supported and the free blocks will be trimmed online.
-	 */
-	if (!test_opt(sb, DISCARD))
-		EXT4_MB_GRP_CLEAR_TRIMMED(db);
-
 	if (!db->bb_free_root.rb_node) {
 		/* No more items in the per group rb tree
 		 * balance refcounts from ext4_mb_free_metadata()
@@ -4928,8 +4919,7 @@ void ext4_free_blocks(handle_t *handle, struct inode *inode,
 					 " group:%d block:%d count:%lu failed"
 					 " with %d", block_group, bit, count,
 					 err);
-		} else
-			EXT4_MB_GRP_CLEAR_TRIMMED(e4b.bd_info);
+		}
 
 		ext4_lock_group(sb, block_group);
 		mb_clear_bits(bitmap_bh->b_data, bit, count_clusters);
@@ -4939,6 +4929,14 @@ void ext4_free_blocks(handle_t *handle, struct inode *inode,
 	ret = ext4_free_group_clusters(sb, gdp) + count_clusters;
 	ext4_free_group_clusters_set(sb, gdp, ret);
 	ext4_block_bitmap_csum_set(sb, block_group, gdp, bitmap_bh);
+	/*
+	 * Clear the trimmed flag for the group so that the next
+	 * ext4_trim_fs can trim it.
+	 * If the volume is mounted with -o discard, online discard
+	 * is supported and the free blocks will be trimmed online.
+	 */
+	if (!test_opt(sb, DISCARD))
+		EXT4_MB_GDP_CLEAR_TRIMMED(gdp);
 	ext4_group_desc_csum_set(sb, block_group, gdp);
 	ext4_unlock_group(sb, block_group);
 
@@ -5192,8 +5190,15 @@ ext4_trim_all_free(struct super_block *sb, ext4_group_t group,
 	ext4_grpblk_t next, count = 0, free_count = 0;
 	struct ext4_buddy e4b;
 	int ret = 0;
+	struct ext4_group_desc *gdp;
+	struct buffer_head *gdp_bh;
 
 	trace_ext4_trim_all_free(sb, group, start, max);
+	gdp = ext4_get_group_desc(sb, group, &gdp_bh);
+	if (!gdp) {
+		ret = -EIO;
+		return ret;
+	}
 
 	ret = ext4_mb_load_buddy(sb, group, &e4b);
 	if (ret) {
@@ -5204,7 +5209,7 @@ ext4_trim_all_free(struct super_block *sb, ext4_group_t group,
 	bitmap = e4b.bd_bitmap;
 
 	ext4_lock_group(sb, group);
-	if (EXT4_MB_GRP_WAS_TRIMMED(e4b.bd_info) &&
+	if (EXT4_MB_GDP_WAS_TRIMMED(gdp) &&
 	    minblocks >= atomic_read(&EXT4_SB(sb)->s_last_trim_minblks))
 		goto out;
 
@@ -5243,14 +5248,38 @@ ext4_trim_all_free(struct super_block *sb, ext4_group_t group,
 			break;
 	}
 
-	if (!ret) {
+	if (!ret)
 		ret = count;
-		EXT4_MB_GRP_SET_TRIMMED(e4b.bd_info);
-	}
 out:
 	ext4_unlock_group(sb, group);
 	ext4_mb_unload_buddy(&e4b);
+	if (ret > 0) {
+		int err;
+		handle_t *handle;
 
+		handle = ext4_journal_start_sb(sb, EXT4_HT_FS_TRIM, 1);
+		if (IS_ERR(handle)) {
+			ret = PTR_ERR(handle);
+			goto out_return;
+		}
+		err = ext4_journal_get_write_access(handle, gdp_bh);
+		if (err) {
+			ret = err;
+			goto out_journal;
+		}
+		ext4_lock_group(sb, group);
+		EXT4_MB_GDP_SET_TRIMMED(gdp);
+		ext4_group_desc_csum_set(sb, group, gdp);
+		ext4_unlock_group(sb, group);
+		err = ext4_handle_dirty_metadata(handle, NULL, gdp_bh);
+		if (err)
+			ret = err;
+out_journal:
+		err = ext4_journal_stop(handle);
+		if (err)
+			ret = err;
+	}
+out_return:
 	ext4_debug("trimmed %d blocks in the group %d\n",
 		count, group);
 
-- 
2.25.4


^ permalink raw reply related	[flat|nested] 4+ messages in thread

* Re: [PATCH v2] ext4: introduce EXT4_BG_WAS_TRIMMED to optimize trim
  2020-05-27 14:08 ` [PATCH v2] ext4: introduce EXT4_BG_WAS_TRIMMED to optimize trim Wang Shilong
@ 2020-05-27 19:29   ` Andreas Dilger
  0 siblings, 0 replies; 4+ messages in thread
From: Andreas Dilger @ 2020-05-27 19:29 UTC (permalink / raw)
  To: Wang Shilong
  Cc: Ext4 Developers List, Wang Shilong, Shuichi Ihara, Lukas Czerner

[-- Attachment #1: Type: text/plain, Size: 2912 bytes --]

On May 27, 2020, at 8:08 AM, Wang Shilong <wangshilong1991@gmail.com> wrote:
> 
> From: Wang Shilong <wshilong@ddn.com>
> 
> Currently WAS_TRIMMED flag is not persistent, whenever filesystem was
> remounted, fstrim need walk all block groups again, the problem with
> this is FSTRIM could be slow on very large LUN SSD based filesystem.
> 
> To avoid this kind of problem, we introduce a block group flag
> EXT4_BG_WAS_TRIMMED, the side effect of this is we need introduce
> extra one block group dirty write after trimming block group.
> 
> And When clearing TRIMMED flag, block group will be journalled
> anyway, so it won't introduce any overhead.

> diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
> index ad2dbf6e4924..23c2dc529a28 100644
> --- a/fs/ext4/ext4.h
> +++ b/fs/ext4/ext4.h
> @@ -357,6 +357,7 @@ struct flex_groups {
> #define EXT4_BG_INODE_UNINIT	0x0001 /* Inode table/bitmap not in use */
> #define EXT4_BG_BLOCK_UNINIT	0x0002 /* Block bitmap not in use */
> #define EXT4_BG_INODE_ZEROED	0x0004 /* On-disk itable initialized to zero */
> +#define EXT4_BG_WAS_TRIMMED	0x0008 /* block group was trimmed */
> 
> /*
>  * Macro-instructions used to manage group descriptors
> @@ -3112,9 +3113,8 @@ struct ext4_group_info {
> };
> 
> #define EXT4_GROUP_INFO_NEED_INIT_BIT		0
> -#define EXT4_GROUP_INFO_WAS_TRIMMED_BIT		1
> -#define EXT4_GROUP_INFO_BBITMAP_CORRUPT_BIT	2
> -#define EXT4_GROUP_INFO_IBITMAP_CORRUPT_BIT	3
> +#define EXT4_GROUP_INFO_BBITMAP_CORRUPT_BIT	1
> +#define EXT4_GROUP_INFO_IBITMAP_CORRUPT_BIT	2

(minor) I don't think you need to renumber these bits, just remove the
WAS_TRIMMED_BIT and leave the others as-is.  Not a big deal either way.

> diff --git a/fs/ext4/ext4_jbd2.h b/fs/ext4/ext4_jbd2.h
> index 4b9002f0e84c..4094a5b247f7 100644
> --- a/fs/ext4/ext4_jbd2.h
> +++ b/fs/ext4/ext4_jbd2.h
> 
> @@ -4939,6 +4929,14 @@ void ext4_free_blocks(handle_t *handle, struct inode *inode,
> 	ret = ext4_free_group_clusters(sb, gdp) + count_clusters;
> 	ext4_free_group_clusters_set(sb, gdp, ret);
> 	ext4_block_bitmap_csum_set(sb, block_group, gdp, bitmap_bh);
> +	/*
> +	 * Clear the trimmed flag for the group so that the next
> +	 * ext4_trim_fs can trim it.
> +	 * If the volume is mounted with -o discard, online discard
> +	 * is supported and the free blocks will be trimmed online.
> +	 */
> +	if (!test_opt(sb, DISCARD))
> +		EXT4_MB_GDP_CLEAR_TRIMMED(gdp);

As discussed in my other email, I think a follow-on patch should track (in
ext4_group_info in-memory counter) the number of blocks freed in each group,
and only clear the WAS_TRIMMED flag if there are several blocks freed, or if
the group becomes totally "empty" (i.e. free < num_itable_blocks + 2).  That
will avoid overly-aggressive full group trim operations (i.e. we don't want
to trim if only a single block was freed).

Cheers, Andreas






[-- Attachment #2: Message signed with OpenPGP --]
[-- Type: application/pgp-signature, Size: 873 bytes --]

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2020-05-27 19:29 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-05-27 14:08 [PATCH 1/2] e2fsprogs: add EXT2_FLAG_BG_WAS_TRIMMED to optimize fstrim Wang Shilong
2020-05-27 14:08 ` [PATCH 2/2] tune2fs: add clear_was_trimmed option Wang Shilong
2020-05-27 14:08 ` [PATCH v2] ext4: introduce EXT4_BG_WAS_TRIMMED to optimize trim Wang Shilong
2020-05-27 19:29   ` Andreas Dilger

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.