linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v2 v2 0/4] do not use s_dirt in FAT FS
@ 2012-04-13 14:19 Artem Bityutskiy
  2012-04-13 14:19 ` [PATCH v2 1/4] fat: introduce special inode for managing the FSINFO block Artem Bityutskiy
                   ` (3 more replies)
  0 siblings, 4 replies; 18+ messages in thread
From: Artem Bityutskiy @ 2012-04-13 14:19 UTC (permalink / raw)
  To: OGAWA Hirofumi, Andrew Morton
  Cc: Linux Kernel Maling List, Linux FS Maling List

This is version 2 of the patch-set which makes FAT file-system stop using
the VFS '->write_super()' method for writing out the FSINFO block. The fist
version can be found here: http://lkml.org/lkml/2012/4/11/147

Comparing to v1, this patch takes a completely different approach. Instead of
using a delayed job, we introduce a special inode for the FSINFO block, mark
it as dirty when needed, and use generic inode write-back mechanisms to write
the FSINFO block via '->write_inode()'. I think this is much cleaner.

Let me recap why I am doing this, and the current status of this exercises.

The final goal is to get rid of the 'sync_supers()' kernel thread. This kernel
thread wakes up every 5 seconds (by default) and calls '->write_super()' for
all mounted file-systems. And the bad thing is that this is done even if all
the superblocks are clean. Moreover, some file-systems do not even need this
end they do not register the '->write_super()' method at all (e.g., btrfs).

So 'sync_supers()' most often just generates useless wake-ups and wastes power.
I am trying to make all file-systems independent of '->write_super()' and plan
to remove 'sync_supers()' and '->write_super' completely once there are no more
users.

The '->write_supers()' method is mostly used by baroque file-systems like hfs,
udf, etc. Modern file-systems like btrfs and xfs do not use it. This justifies
removing this stuff from VFS completely and make every FS self-manage own
superblock.

Tested with xfstests.

Note: in the past I was trying to upstream patches which optimized 'sync_super()',
but Al Viro wanted me to kill it completely instead, which I am trying to do
now, see http://lkml.org/lkml/2010/7/22/96

======
Overall status:

1. ext4: patches submitted, waiting for reply from Ted Ts'o:
   https://lkml.org/lkml/2012/4/2/111
2. ext2: patches are in the ext2 tree maintained by Jan Kara:
   git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs.git for_next
   However, one patch is still not there:
   http://www.spinics.net/lists/linux-ext4/msg31492.html

TODO: affs, exofs, hfs, hfsplus, jffs2, reiserfs, sysv, udf, ufs
======

 fs/fat/fat.h             |    1 +
 fs/fat/fatent.c          |   22 +++++++++++++-----
 fs/fat/inode.c           |   54 ++++++++++++++++++++-------------------------
 include/linux/msdos_fs.h |    3 +-
 4 files changed, 43 insertions(+), 37 deletions(-)

Thanks,
Artem.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* [PATCH v2 1/4] fat: introduce special inode for managing the FSINFO block
  2012-04-13 14:19 [PATCH v2 v2 0/4] do not use s_dirt in FAT FS Artem Bityutskiy
@ 2012-04-13 14:19 ` Artem Bityutskiy
  2012-04-13 14:19 ` [PATCH v2 2/4] fat: introduce mark_fsinfo_dirty helper Artem Bityutskiy
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 18+ messages in thread
From: Artem Bityutskiy @ 2012-04-13 14:19 UTC (permalink / raw)
  To: OGAWA Hirofumi, Andrew Morton
  Cc: Linux Kernel Maling List, Linux FS Maling List, Artem Bityutskiy

From: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>

This patch is just a preparation for further changes. It introduces a special
inode ('fsinfo_inode') in FAT file-system which we'll later use for managing
the FSINFO block. Note, this there is already one special inode ('fat_inode')
which is used for managing the FAT tables.

Introduce new 'MSDOS_FSINFO_INO' constant for this special inode. It is safe to
do because FAT file-system does not store inode numbers on the media but
generates them run-time.

I've also cleaned up the comment to existing 'MSDOS_ROOT_INO' constant, while
on it.

Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
---
 fs/fat/fat.h             |    1 +
 fs/fat/inode.c           |   12 ++++++++++++
 include/linux/msdos_fs.h |    3 ++-
 3 files changed, 15 insertions(+), 1 deletions(-)

diff --git a/fs/fat/fat.h b/fs/fat/fat.h
index 66994f3..951d12b 100644
--- a/fs/fat/fat.h
+++ b/fs/fat/fat.h
@@ -82,6 +82,7 @@ struct msdos_sb_info {
 	int fatent_shift;
 	struct fatent_operations *fatent_ops;
 	struct inode *fat_inode;
+	struct inode *fsinfo_inode;
 
 	struct ratelimit_state ratelimit;
 
diff --git a/fs/fat/inode.c b/fs/fat/inode.c
index 21687e3..91e9a8a 100644
--- a/fs/fat/inode.c
+++ b/fs/fat/inode.c
@@ -490,6 +490,7 @@ static void fat_put_super(struct super_block *sb)
 	if (sb->s_dirt)
 		fat_write_super(sb);
 
+	iput(sbi->fsinfo_inode);
 	iput(sbi->fat_inode);
 
 	unload_nls(sbi->nls_disk);
@@ -1244,6 +1245,7 @@ int fat_fill_super(struct super_block *sb, void *data, int silent, int isvfat,
 		   void (*setup)(struct super_block *))
 {
 	struct inode *root_inode = NULL, *fat_inode = NULL;
+	struct inode *fsinfo_inode = NULL;
 	struct buffer_head *bh;
 	struct fat_boot_sector *b;
 	struct msdos_sb_info *sbi;
@@ -1490,6 +1492,14 @@ int fat_fill_super(struct super_block *sb, void *data, int silent, int isvfat,
 		goto out_fail;
 	MSDOS_I(fat_inode)->i_pos = 0;
 	sbi->fat_inode = fat_inode;
+
+	fsinfo_inode = new_inode(sb);
+	if (!fsinfo_inode)
+		goto out_fail;
+	fsinfo_inode->i_ino = MSDOS_FSINFO_INO;
+	sbi->fsinfo_inode = fsinfo_inode;
+	insert_inode_hash(fsinfo_inode);
+
 	root_inode = new_inode(sb);
 	if (!root_inode)
 		goto out_fail;
@@ -1516,6 +1526,8 @@ out_invalid:
 		fat_msg(sb, KERN_INFO, "Can't find a valid FAT filesystem");
 
 out_fail:
+	if (fsinfo_inode)
+		iput(fsinfo_inode);
 	if (fat_inode)
 		iput(fat_inode);
 	unload_nls(sbi->nls_io);
diff --git a/include/linux/msdos_fs.h b/include/linux/msdos_fs.h
index 34066e6..11cc2ac 100644
--- a/include/linux/msdos_fs.h
+++ b/include/linux/msdos_fs.h
@@ -21,8 +21,9 @@
 #define CT_LE_W(v)	cpu_to_le16(v)
 #define CT_LE_L(v)	cpu_to_le32(v)
 
+#define MSDOS_ROOT_INO	 1	/* The root inode number */
+#define MSDOS_FSINFO_INO 2	/* Used for managing the FSINFO block */
 
-#define MSDOS_ROOT_INO	1	/* == MINIX_ROOT_INO */
 #define MSDOS_DIR_BITS	5	/* log2(sizeof(struct msdos_dir_entry)) */
 
 /* directory limit */
-- 
1.7.7.6


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH v2 2/4] fat: introduce mark_fsinfo_dirty helper
  2012-04-13 14:19 [PATCH v2 v2 0/4] do not use s_dirt in FAT FS Artem Bityutskiy
  2012-04-13 14:19 ` [PATCH v2 1/4] fat: introduce special inode for managing the FSINFO block Artem Bityutskiy
@ 2012-04-13 14:19 ` Artem Bityutskiy
  2012-04-13 14:19 ` [PATCH v2 3/4] fat: mark superblock as dirty less often Artem Bityutskiy
  2012-04-13 14:19 ` [PATCH v2 4/4] fat: switch to fsinfo_inode Artem Bityutskiy
  3 siblings, 0 replies; 18+ messages in thread
From: Artem Bityutskiy @ 2012-04-13 14:19 UTC (permalink / raw)
  To: OGAWA Hirofumi, Andrew Morton
  Cc: Linux Kernel Maling List, Linux FS Maling List, Artem Bityutskiy

From: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>

This is a preparation patch which introduces a 'mark_fsinfo_dirty()'
helper function which just sets the 's_dirt' flag to 1 so far. I'll add more
code to this helper later, so I do not mark it as 'inline'.

Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
---
 fs/fat/fatent.c |   13 +++++++++----
 1 files changed, 9 insertions(+), 4 deletions(-)

diff --git a/fs/fat/fatent.c b/fs/fat/fatent.c
index 2e81ac0..e49d274 100644
--- a/fs/fat/fatent.c
+++ b/fs/fat/fatent.c
@@ -308,6 +308,11 @@ void fat_ent_access_init(struct super_block *sb)
 	}
 }
 
+static void mark_fsinfo_dirty(struct super_block *sb)
+{
+	sb->s_dirt = 1;
+}
+
 static inline int fat_ent_update_ptr(struct super_block *sb,
 				     struct fat_entry *fatent,
 				     int offset, sector_t blocknr)
@@ -498,7 +503,7 @@ int fat_alloc_clusters(struct inode *inode, int *cluster, int nr_cluster)
 				sbi->prev_free = entry;
 				if (sbi->free_clusters != -1)
 					sbi->free_clusters--;
-				sb->s_dirt = 1;
+				mark_fsinfo_dirty(sb);
 
 				cluster[idx_clus] = entry;
 				idx_clus++;
@@ -520,7 +525,7 @@ int fat_alloc_clusters(struct inode *inode, int *cluster, int nr_cluster)
 	/* Couldn't allocate the free entries */
 	sbi->free_clusters = 0;
 	sbi->free_clus_valid = 1;
-	sb->s_dirt = 1;
+	mark_fsinfo_dirty(sb);
 	err = -ENOSPC;
 
 out:
@@ -587,7 +592,7 @@ int fat_free_clusters(struct inode *inode, int cluster)
 		ops->ent_put(&fatent, FAT_ENT_FREE);
 		if (sbi->free_clusters != -1) {
 			sbi->free_clusters++;
-			sb->s_dirt = 1;
+			mark_fsinfo_dirty(sb);
 		}
 
 		if (nr_bhs + fatent.nr_bhs > MAX_BUF_PER_PAGE) {
@@ -677,7 +682,7 @@ int fat_count_free_clusters(struct super_block *sb)
 	}
 	sbi->free_clusters = free;
 	sbi->free_clus_valid = 1;
-	sb->s_dirt = 1;
+	mark_fsinfo_dirty(sb);
 	fatent_brelse(&fatent);
 out:
 	unlock_fat(sbi);
-- 
1.7.7.6


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH v2 3/4] fat: mark superblock as dirty less often
  2012-04-13 14:19 [PATCH v2 v2 0/4] do not use s_dirt in FAT FS Artem Bityutskiy
  2012-04-13 14:19 ` [PATCH v2 1/4] fat: introduce special inode for managing the FSINFO block Artem Bityutskiy
  2012-04-13 14:19 ` [PATCH v2 2/4] fat: introduce mark_fsinfo_dirty helper Artem Bityutskiy
@ 2012-04-13 14:19 ` Artem Bityutskiy
  2012-04-14  9:17   ` OGAWA Hirofumi
  2012-04-13 14:19 ` [PATCH v2 4/4] fat: switch to fsinfo_inode Artem Bityutskiy
  3 siblings, 1 reply; 18+ messages in thread
From: Artem Bityutskiy @ 2012-04-13 14:19 UTC (permalink / raw)
  To: OGAWA Hirofumi, Andrew Morton
  Cc: Linux Kernel Maling List, Linux FS Maling List, Artem Bityutskiy

From: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>

This patch is a preparation for further changes. It touches few functions
in fatent.c and prevents them from marking the superblock as dirty
unnecessarily often. Namely, instead of marking it as dirty in the internal
tight loops - do it only once at the end of the functions. And instead of
marking it as dirty while holding the FAT table lock, do it outside the lock.

The reason for this patch is that marking the superblock as dirty will soon
become a little bit heavier operation, so it is cleaner to do this only when it
is necessary.

Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
---
 fs/fat/fatent.c |   12 ++++++------
 1 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/fs/fat/fatent.c b/fs/fat/fatent.c
index e49d274..85a1ec4 100644
--- a/fs/fat/fatent.c
+++ b/fs/fat/fatent.c
@@ -503,7 +503,6 @@ int fat_alloc_clusters(struct inode *inode, int *cluster, int nr_cluster)
 				sbi->prev_free = entry;
 				if (sbi->free_clusters != -1)
 					sbi->free_clusters--;
-				mark_fsinfo_dirty(sb);
 
 				cluster[idx_clus] = entry;
 				idx_clus++;
@@ -525,11 +524,11 @@ int fat_alloc_clusters(struct inode *inode, int *cluster, int nr_cluster)
 	/* Couldn't allocate the free entries */
 	sbi->free_clusters = 0;
 	sbi->free_clus_valid = 1;
-	mark_fsinfo_dirty(sb);
 	err = -ENOSPC;
 
 out:
 	unlock_fat(sbi);
+	mark_fsinfo_dirty(sb);
 	fatent_brelse(&fatent);
 	if (!err) {
 		if (inode_needs_sync(inode))
@@ -590,10 +589,8 @@ int fat_free_clusters(struct inode *inode, int cluster)
 		}
 
 		ops->ent_put(&fatent, FAT_ENT_FREE);
-		if (sbi->free_clusters != -1) {
+		if (sbi->free_clusters != -1)
 			sbi->free_clusters++;
-			mark_fsinfo_dirty(sb);
-		}
 
 		if (nr_bhs + fatent.nr_bhs > MAX_BUF_PER_PAGE) {
 			if (sb->s_flags & MS_SYNCHRONOUS) {
@@ -622,6 +619,8 @@ error:
 	for (i = 0; i < nr_bhs; i++)
 		brelse(bhs[i]);
 	unlock_fat(sbi);
+	if (!err)
+		mark_fsinfo_dirty(sb);
 
 	return err;
 }
@@ -682,9 +681,10 @@ int fat_count_free_clusters(struct super_block *sb)
 	}
 	sbi->free_clusters = free;
 	sbi->free_clus_valid = 1;
-	mark_fsinfo_dirty(sb);
 	fatent_brelse(&fatent);
 out:
 	unlock_fat(sbi);
+	if (!err)
+		mark_fsinfo_dirty(sb);
 	return err;
 }
-- 
1.7.7.6


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH v2 4/4] fat: switch to fsinfo_inode
  2012-04-13 14:19 [PATCH v2 v2 0/4] do not use s_dirt in FAT FS Artem Bityutskiy
                   ` (2 preceding siblings ...)
  2012-04-13 14:19 ` [PATCH v2 3/4] fat: mark superblock as dirty less often Artem Bityutskiy
@ 2012-04-13 14:19 ` Artem Bityutskiy
  2012-04-14 10:19   ` OGAWA Hirofumi
  3 siblings, 1 reply; 18+ messages in thread
From: Artem Bityutskiy @ 2012-04-13 14:19 UTC (permalink / raw)
  To: OGAWA Hirofumi, Andrew Morton
  Cc: Linux Kernel Maling List, Linux FS Maling List, Artem Bityutskiy

From: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>

Currently FAT file-system maps the VFS "superblock" abstraction to the FSINFO
block. The FSINFO block contains non-essential data about the amount of free
clusters and the next free cluster. FAT file-system can always find out this
information by scanning the FAT table, but having it in the FSINFO block may
speed things up sometimes. So FAT file-system relies on the VFS superblock
write-out services to make sure the FSINFO block is written out to the media
from time to time.

The whole "superblock write-out" VFS infrastructure is served by the
'sync_supers()' kernel thread, which wakes up every 5 (by default) seconds and
writes out all dirty superblock using the '->write_super()' call-back. But the
problem with this thread is that it wastes power by waking up the system every
5 seconds no matter what. So we want to kill it completely and thus, we need to
make file-systems to stop using the '->write_super' VFS service, and then
remove it together with the kernel thread.

This patch switches the FAT FSINFO block management from
'->write_super()'/'->s_dirt' to 'fsinfo_inode'/'->write_inode'. Now, instead of
setting the 's_dirt' flag, we just mark the special 'fsinfo_inode' inode as
dirty and let VFS invoke the '->write_inode' call-back when needed, where we
write-out the FSINFO block.

This patch also makes sure we do not mark the 'fsinfo_inode' inode as dirty if
we are not FAT32 (FAT16 and FAT12 do not have the FSINFO block) or if we are in
R/O mode.

As a bonus, we can also remove the '->sync_fs()' and '->write_super()' FAT
call-back function because they become unneeded.

Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
---
 fs/fat/fatent.c |    7 ++++++-
 fs/fat/inode.c  |   42 ++++++++++++------------------------------
 2 files changed, 18 insertions(+), 31 deletions(-)

diff --git a/fs/fat/fatent.c b/fs/fat/fatent.c
index 85a1ec4..efa1000 100644
--- a/fs/fat/fatent.c
+++ b/fs/fat/fatent.c
@@ -310,7 +310,12 @@ void fat_ent_access_init(struct super_block *sb)
 
 static void mark_fsinfo_dirty(struct super_block *sb)
 {
-	sb->s_dirt = 1;
+	struct msdos_sb_info *sbi = MSDOS_SB(sb);
+
+	if (sb->s_flags & MS_RDONLY || sbi->fat_bits != 32)
+		return;
+
+	__mark_inode_dirty(sbi->fsinfo_inode, I_DIRTY_SYNC);
 }
 
 static inline int fat_ent_update_ptr(struct super_block *sb,
diff --git a/fs/fat/inode.c b/fs/fat/inode.c
index 91e9a8a..9a87ec4 100644
--- a/fs/fat/inode.c
+++ b/fs/fat/inode.c
@@ -459,37 +459,10 @@ static void fat_evict_inode(struct inode *inode)
 	fat_detach(inode);
 }
 
-static void fat_write_super(struct super_block *sb)
-{
-	lock_super(sb);
-	sb->s_dirt = 0;
-
-	if (!(sb->s_flags & MS_RDONLY))
-		fat_clusters_flush(sb);
-	unlock_super(sb);
-}
-
-static int fat_sync_fs(struct super_block *sb, int wait)
-{
-	int err = 0;
-
-	if (sb->s_dirt) {
-		lock_super(sb);
-		sb->s_dirt = 0;
-		err = fat_clusters_flush(sb);
-		unlock_super(sb);
-	}
-
-	return err;
-}
-
 static void fat_put_super(struct super_block *sb)
 {
 	struct msdos_sb_info *sbi = MSDOS_SB(sb);
 
-	if (sb->s_dirt)
-		fat_write_super(sb);
-
 	iput(sbi->fsinfo_inode);
 	iput(sbi->fat_inode);
 
@@ -662,7 +635,18 @@ retry:
 
 static int fat_write_inode(struct inode *inode, struct writeback_control *wbc)
 {
-	return __fat_write_inode(inode, wbc->sync_mode == WB_SYNC_ALL);
+	int err;
+
+	if (inode->i_ino == MSDOS_FSINFO_INO) {
+		struct super_block *sb = inode->i_sb;
+
+		lock_super(sb);
+		err = fat_clusters_flush(sb);
+		unlock_super(sb);
+	} else
+		err = __fat_write_inode(inode, wbc->sync_mode == WB_SYNC_ALL);
+
+	return err;
 }
 
 int fat_sync_inode(struct inode *inode)
@@ -679,8 +663,6 @@ static const struct super_operations fat_sops = {
 	.write_inode	= fat_write_inode,
 	.evict_inode	= fat_evict_inode,
 	.put_super	= fat_put_super,
-	.write_super	= fat_write_super,
-	.sync_fs	= fat_sync_fs,
 	.statfs		= fat_statfs,
 	.remount_fs	= fat_remount,
 
-- 
1.7.7.6


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* Re: [PATCH v2 3/4] fat: mark superblock as dirty less often
  2012-04-13 14:19 ` [PATCH v2 3/4] fat: mark superblock as dirty less often Artem Bityutskiy
@ 2012-04-14  9:17   ` OGAWA Hirofumi
  2012-04-14 10:24     ` Artem Bityutskiy
  0 siblings, 1 reply; 18+ messages in thread
From: OGAWA Hirofumi @ 2012-04-14  9:17 UTC (permalink / raw)
  To: Artem Bityutskiy
  Cc: Andrew Morton, Linux Kernel Maling List, Linux FS Maling List,
	Artem Bityutskiy

Artem Bityutskiy <dedekind1@gmail.com> writes:

> From: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
>
> This patch is a preparation for further changes. It touches few functions
> in fatent.c and prevents them from marking the superblock as dirty
> unnecessarily often. Namely, instead of marking it as dirty in the internal
> tight loops - do it only once at the end of the functions. And instead of
> marking it as dirty while holding the FAT table lock, do it outside the lock.
>
> The reason for this patch is that marking the superblock as dirty will soon
> become a little bit heavier operation, so it is cleaner to do this only when it
> is necessary.

For it, please use local variable like,


{
       	int fsinfo_dirty = 0;

	while (1) {
        	change free_clusters
                fsinfo_dirty = 1;
        }

        if (fsinfo_dirty)
        	mark_fsinfo_dirty()
}

instead of dirty it always.
-- 
OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v2 4/4] fat: switch to fsinfo_inode
  2012-04-13 14:19 ` [PATCH v2 4/4] fat: switch to fsinfo_inode Artem Bityutskiy
@ 2012-04-14 10:19   ` OGAWA Hirofumi
  2012-04-14 10:29     ` Artem Bityutskiy
  0 siblings, 1 reply; 18+ messages in thread
From: OGAWA Hirofumi @ 2012-04-14 10:19 UTC (permalink / raw)
  To: Artem Bityutskiy
  Cc: Andrew Morton, Linux Kernel Maling List, Linux FS Maling List,
	Artem Bityutskiy

Artem Bityutskiy <dedekind1@gmail.com> writes:

> From: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
>
> Currently FAT file-system maps the VFS "superblock" abstraction to the FSINFO
> block. The FSINFO block contains non-essential data about the amount of free
> clusters and the next free cluster. FAT file-system can always find out this
> information by scanning the FAT table, but having it in the FSINFO block may
> speed things up sometimes. So FAT file-system relies on the VFS superblock
> write-out services to make sure the FSINFO block is written out to the media
> from time to time.
>
> The whole "superblock write-out" VFS infrastructure is served by the
> 'sync_supers()' kernel thread, which wakes up every 5 (by default) seconds and
> writes out all dirty superblock using the '->write_super()' call-back. But the
> problem with this thread is that it wastes power by waking up the system every
> 5 seconds no matter what. So we want to kill it completely and thus, we need to
> make file-systems to stop using the '->write_super' VFS service, and then
> remove it together with the kernel thread.
>
> This patch switches the FAT FSINFO block management from
> '->write_super()'/'->s_dirt' to 'fsinfo_inode'/'->write_inode'. Now, instead of
> setting the 's_dirt' flag, we just mark the special 'fsinfo_inode' inode as
> dirty and let VFS invoke the '->write_inode' call-back when needed, where we
> write-out the FSINFO block.
>
> This patch also makes sure we do not mark the 'fsinfo_inode' inode as dirty if
> we are not FAT32 (FAT16 and FAT12 do not have the FSINFO block) or if we are in
> R/O mode.
>
> As a bonus, we can also remove the '->sync_fs()' and '->write_super()' FAT
> call-back function because they become unneeded.

Hm, does this guarantee to flush FSINFO at umount?

FSINFO is last part of data dependency. I.e. inode change can dirty
FSINFO. So, FSINFO has to be flushed after normal inodes.

Roughly, the prefer order to flush would be,

1) user data
2) dirent
3) cluster chain
4) fsinfo
5) wait next period

current scheme (sync_supers thread, and writeback) also doesn't above
order though, I think the change to use inode is bad direction...

For above reason, I don't like to use inode for FSINFO is proper.

If we can do above order, it would be minimize chance of corruption on
disk structure. IOW, wrong order keeps corrupt on-disk structure.

So, I prefer to allow customize writeback order (this might not be your
target though). E.g. add ->writeback_sb_inodes handler to sb, and
helpers (iirc, reiserfs4 had it). The pesudo code of this is

fat_writeback()
{
	writeback_sb_inodes()
        fat_clusters_flush()   <= flush FSINFO, not only dirty buffer
}

super_operation = {
	.writeback = fat_writeback,
}

However, I feel this is too big to satisfy your target..., hmm...

Thanks.
-- 
OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v2 3/4] fat: mark superblock as dirty less often
  2012-04-14  9:17   ` OGAWA Hirofumi
@ 2012-04-14 10:24     ` Artem Bityutskiy
  2012-04-14 10:37       ` OGAWA Hirofumi
  0 siblings, 1 reply; 18+ messages in thread
From: Artem Bityutskiy @ 2012-04-14 10:24 UTC (permalink / raw)
  To: OGAWA Hirofumi
  Cc: Andrew Morton, Linux Kernel Maling List, Linux FS Maling List,
	Artem Bityutskiy

[-- Attachment #1: Type: text/plain, Size: 1297 bytes --]

Hi, thanks for feed-back,

On Sat, 2012-04-14 at 18:17 +0900, OGAWA Hirofumi wrote:
> Artem Bityutskiy <dedekind1@gmail.com> writes:
> 
> > From: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
> >
> > This patch is a preparation for further changes. It touches few functions
> > in fatent.c and prevents them from marking the superblock as dirty
> > unnecessarily often. Namely, instead of marking it as dirty in the internal
> > tight loops - do it only once at the end of the functions. And instead of
> > marking it as dirty while holding the FAT table lock, do it outside the lock.
> >
> > The reason for this patch is that marking the superblock as dirty will soon
> > become a little bit heavier operation, so it is cleaner to do this only when it
> > is necessary.
> 
> For it, please use local variable like,
> 
> 
> {
>        	int fsinfo_dirty = 0;
> 
> 	while (1) {
>         	change free_clusters
>                 fsinfo_dirty = 1;
>         }
> 
>         if (fsinfo_dirty)
>         	mark_fsinfo_dirty()
> }
> 
> instead of dirty it always.

But could you please explain why do we need an extra variable? What is
the problem with doing all our FAT table changes and then marking the
FSINFO as dirty?

-- 
Best Regards,
Artem Bityutskiy

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 490 bytes --]

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v2 4/4] fat: switch to fsinfo_inode
  2012-04-14 10:19   ` OGAWA Hirofumi
@ 2012-04-14 10:29     ` Artem Bityutskiy
  2012-04-14 10:36       ` OGAWA Hirofumi
  0 siblings, 1 reply; 18+ messages in thread
From: Artem Bityutskiy @ 2012-04-14 10:29 UTC (permalink / raw)
  To: OGAWA Hirofumi
  Cc: Andrew Morton, Linux Kernel Maling List, Linux FS Maling List,
	Artem Bityutskiy

[-- Attachment #1: Type: text/plain, Size: 2723 bytes --]

On Sat, 2012-04-14 at 19:19 +0900, OGAWA Hirofumi wrote:
> Artem Bityutskiy <dedekind1@gmail.com> writes:
> 
> > From: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
> >
> > Currently FAT file-system maps the VFS "superblock" abstraction to the FSINFO
> > block. The FSINFO block contains non-essential data about the amount of free
> > clusters and the next free cluster. FAT file-system can always find out this
> > information by scanning the FAT table, but having it in the FSINFO block may
> > speed things up sometimes. So FAT file-system relies on the VFS superblock
> > write-out services to make sure the FSINFO block is written out to the media
> > from time to time.
> >
> > The whole "superblock write-out" VFS infrastructure is served by the
> > 'sync_supers()' kernel thread, which wakes up every 5 (by default) seconds and
> > writes out all dirty superblock using the '->write_super()' call-back. But the
> > problem with this thread is that it wastes power by waking up the system every
> > 5 seconds no matter what. So we want to kill it completely and thus, we need to
> > make file-systems to stop using the '->write_super' VFS service, and then
> > remove it together with the kernel thread.
> >
> > This patch switches the FAT FSINFO block management from
> > '->write_super()'/'->s_dirt' to 'fsinfo_inode'/'->write_inode'. Now, instead of
> > setting the 's_dirt' flag, we just mark the special 'fsinfo_inode' inode as
> > dirty and let VFS invoke the '->write_inode' call-back when needed, where we
> > write-out the FSINFO block.
> >
> > This patch also makes sure we do not mark the 'fsinfo_inode' inode as dirty if
> > we are not FAT32 (FAT16 and FAT12 do not have the FSINFO block) or if we are in
> > R/O mode.
> >
> > As a bonus, we can also remove the '->sync_fs()' and '->write_super()' FAT
> > call-back function because they become unneeded.
> 
> Hm, does this guarantee to flush FSINFO at umount?

Of course, and I checked it. It is just a dirty inode. If you do not
worry that any other inode won't get written-beck, then you should not
worry about this one.

> FSINFO is last part of data dependency. I.e. inode change can dirty
> FSINFO. So, FSINFO has to be flushed after normal inodes.

Sorry, I do not see how this can be true. You have a just bunch of dirty
inodes, and it does not matter in which order you flush them. See
__fat_write_inode() - it does not change the FAT table and does not
affect the FSINFO block.

Besides, the _current_ code first writes out FSINFO, because VFS calls
->sync_fs() first, then it starts writing back, then VFS calls
->sync_fs() for the second time.

-- 
Best Regards,
Artem Bityutskiy

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 490 bytes --]

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v2 4/4] fat: switch to fsinfo_inode
  2012-04-14 10:29     ` Artem Bityutskiy
@ 2012-04-14 10:36       ` OGAWA Hirofumi
  2012-04-14 11:01         ` Artem Bityutskiy
  0 siblings, 1 reply; 18+ messages in thread
From: OGAWA Hirofumi @ 2012-04-14 10:36 UTC (permalink / raw)
  To: dedekind1
  Cc: Andrew Morton, Linux Kernel Maling List, Linux FS Maling List,
	Artem Bityutskiy

Artem Bityutskiy <dedekind1@gmail.com> writes:

>> Hm, does this guarantee to flush FSINFO at umount?
>
> Of course, and I checked it. It is just a dirty inode. If you do not
> worry that any other inode won't get written-beck, then you should not
> worry about this one.
>
>> FSINFO is last part of data dependency. I.e. inode change can dirty
>> FSINFO. So, FSINFO has to be flushed after normal inodes.
>
> Sorry, I do not see how this can be true. You have a just bunch of dirty
> inodes, and it does not matter in which order you flush them. See
> __fat_write_inode() - it does not change the FAT table and does not
> affect the FSINFO block.
>
> Besides, the _current_ code first writes out FSINFO, because VFS calls
> ->sync_fs() first, then it starts writing back, then VFS calls
> ->sync_fs() for the second time.

Common case is delayed allocation though, in the case of FATfs, it would
be only truncate by last iput().
-- 
OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v2 3/4] fat: mark superblock as dirty less often
  2012-04-14 10:24     ` Artem Bityutskiy
@ 2012-04-14 10:37       ` OGAWA Hirofumi
  2012-04-14 11:08         ` Artem Bityutskiy
  0 siblings, 1 reply; 18+ messages in thread
From: OGAWA Hirofumi @ 2012-04-14 10:37 UTC (permalink / raw)
  To: dedekind1
  Cc: Andrew Morton, Linux Kernel Maling List, Linux FS Maling List,
	Artem Bityutskiy

Artem Bityutskiy <dedekind1@gmail.com> writes:

> On Sat, 2012-04-14 at 18:17 +0900, OGAWA Hirofumi wrote:
>> Artem Bityutskiy <dedekind1@gmail.com> writes:
>> 
>> > From: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
>> >
>> > This patch is a preparation for further changes. It touches few functions
>> > in fatent.c and prevents them from marking the superblock as dirty
>> > unnecessarily often. Namely, instead of marking it as dirty in the internal
>> > tight loops - do it only once at the end of the functions. And instead of
>> > marking it as dirty while holding the FAT table lock, do it
>> > outside the lock.
>> >
>> > The reason for this patch is that marking the superblock as dirty will soon
>> > become a little bit heavier operation, so it is cleaner to do this
>> > only when it
>> > is necessary.
>> 
>> For it, please use local variable like,
>> 
>> 
>> {
>>        	int fsinfo_dirty = 0;
>> 
>> 	while (1) {
>>         	change free_clusters
>>                 fsinfo_dirty = 1;
>>         }
>> 
>>         if (fsinfo_dirty)
>>         	mark_fsinfo_dirty()
>> }
>> 
>> instead of dirty it always.
>
> But could you please explain why do we need an extra variable? What is
> the problem with doing all our FAT table changes and then marking the
> FSINFO as dirty?

Above example may not be proper. I meant please dirty FSINFO only if
necessary. Your patch seems to be dirty even if code didn't change
FSINFO.
-- 
OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v2 4/4] fat: switch to fsinfo_inode
  2012-04-14 10:36       ` OGAWA Hirofumi
@ 2012-04-14 11:01         ` Artem Bityutskiy
  2012-04-14 11:51           ` OGAWA Hirofumi
  0 siblings, 1 reply; 18+ messages in thread
From: Artem Bityutskiy @ 2012-04-14 11:01 UTC (permalink / raw)
  To: OGAWA Hirofumi
  Cc: Andrew Morton, Linux Kernel Maling List, Linux FS Maling List,
	Artem Bityutskiy

[-- Attachment #1: Type: text/plain, Size: 1558 bytes --]

On Sat, 2012-04-14 at 19:36 +0900, OGAWA Hirofumi wrote:
> Artem Bityutskiy <dedekind1@gmail.com> writes:
> 
> >> Hm, does this guarantee to flush FSINFO at umount?
> >
> > Of course, and I checked it. It is just a dirty inode. If you do not
> > worry that any other inode won't get written-beck, then you should not
> > worry about this one.
> >
> >> FSINFO is last part of data dependency. I.e. inode change can dirty
> >> FSINFO. So, FSINFO has to be flushed after normal inodes.
> >
> > Sorry, I do not see how this can be true. You have a just bunch of dirty
> > inodes, and it does not matter in which order you flush them. See
> > __fat_write_inode() - it does not change the FAT table and does not
> > affect the FSINFO block.
> >
> > Besides, the _current_ code first writes out FSINFO, because VFS calls
> > ->sync_fs() first, then it starts writing back, then VFS calls
> > ->sync_fs() for the second time.

This is actually not exactly correct, but anyway, the first
->sync_fs(sb, 0) may come to FAT FS while it is in the middle of writing
out the inodes.

BTW, fat_clusters_flush() does not wait on the FSINFO block, which I
think is a bug. I mean, it should call 'sync_dirty_buffer()'. I can
submit a separate patch later.

> Common case is delayed allocation though, in the case of FATfs, it would
> be only truncate by last iput().

Sorry, I do not understand what you mean. Do you still want me to take
care of the order or not? If yes, could you please explain why?

-- 
Best Regards,
Artem Bityutskiy

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 490 bytes --]

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v2 3/4] fat: mark superblock as dirty less often
  2012-04-14 10:37       ` OGAWA Hirofumi
@ 2012-04-14 11:08         ` Artem Bityutskiy
  0 siblings, 0 replies; 18+ messages in thread
From: Artem Bityutskiy @ 2012-04-14 11:08 UTC (permalink / raw)
  To: OGAWA Hirofumi
  Cc: Andrew Morton, Linux Kernel Maling List, Linux FS Maling List,
	Artem Bityutskiy

[-- Attachment #1: Type: text/plain, Size: 497 bytes --]

On Sat, 2012-04-14 at 19:37 +0900, OGAWA Hirofumi wrote:
> > But could you please explain why do we need an extra variable? What is
> > the problem with doing all our FAT table changes and then marking the
> > FSINFO as dirty?
> 
> Above example may not be proper. I meant please dirty FSINFO only if
> necessary. Your patch seems to be dirty even if code didn't change
> FSINFO.

Ah, yes, in 'fat_alloc_clusters()' indeed, thanks, I'll fix this.

-- 
Best Regards,
Artem Bityutskiy

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 490 bytes --]

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v2 4/4] fat: switch to fsinfo_inode
  2012-04-14 11:01         ` Artem Bityutskiy
@ 2012-04-14 11:51           ` OGAWA Hirofumi
  2012-04-14 12:36             ` Artem Bityutskiy
  0 siblings, 1 reply; 18+ messages in thread
From: OGAWA Hirofumi @ 2012-04-14 11:51 UTC (permalink / raw)
  To: dedekind1
  Cc: Andrew Morton, Linux Kernel Maling List, Linux FS Maling List,
	Artem Bityutskiy

Artem Bityutskiy <dedekind1@gmail.com> writes:

>> > Sorry, I do not see how this can be true. You have a just bunch of dirty
>> > inodes, and it does not matter in which order you flush them. See
>> > __fat_write_inode() - it does not change the FAT table and does not
>> > affect the FSINFO block.
>> >
>> > Besides, the _current_ code first writes out FSINFO, because VFS calls
>> > ->sync_fs() first, then it starts writing back, then VFS calls
>> > ->sync_fs() for the second time.
>
> This is actually not exactly correct, but anyway, the first
> ->sync_fs(sb, 0) may come to FAT FS while it is in the middle of writing
> out the inodes.
>
> BTW, fat_clusters_flush() does not wait on the FSINFO block, which I
> think is a bug. I mean, it should call 'sync_dirty_buffer()'. I can
> submit a separate patch later.

write_super() is update and dirty buffer. Actual flush is via blockdev
(by historical reason), right? So, I think we don't need to wait at
sync_fs() in FATfs case.

>> Common case is delayed allocation though, in the case of FATfs, it would
>> be only truncate by last iput().
>
> Sorry, I do not understand what you mean. Do you still want me to take
> care of the order or not? If yes, could you please explain why?

Yes, I still worry about order. About ->sync_fs(), you are looking the
following?

__sync_filesystem(sb, 0)
__sync_filesystem(sb, 1)

Those are doing

1) try flush dirty all data at first without blocking
	__sync_filesystem(sb, 0)
2) wait some data of (1), and complete inodes work
	part of __sync_filesystem(sb, 1)
3) write super block to buffer
	->sync_fs() in __sync_filesystem(sb, 1)
4) flush and wait all metadata on blockdev
	__sync_blockdev() in __sync_filesystem(sb, 1)

right?

In your patch, we lose (3). (And I can't remember why I did it though,
your patch lose write_super() in ->put_super() too.)
-- 
OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v2 4/4] fat: switch to fsinfo_inode
  2012-04-14 11:51           ` OGAWA Hirofumi
@ 2012-04-14 12:36             ` Artem Bityutskiy
  2012-04-14 13:12               ` OGAWA Hirofumi
  0 siblings, 1 reply; 18+ messages in thread
From: Artem Bityutskiy @ 2012-04-14 12:36 UTC (permalink / raw)
  To: OGAWA Hirofumi
  Cc: Andrew Morton, Linux Kernel Maling List, Linux FS Maling List,
	Artem Bityutskiy

[-- Attachment #1: Type: text/plain, Size: 483 bytes --]

On Sat, 2012-04-14 at 20:51 +0900, OGAWA Hirofumi wrote:
> Yes, I still worry about order. About ->sync_fs(), you are looking the
> following?

Hirofumi, you still did not explain why the order matters. If it
matters, it should be easy to explain.

But I will look at this and think about the ordering, thanks for
feed-back. But if you could explain why writing out FSINFO before inodes
is an issue, it'd be very helpful.

Thanks!

-- 
Best Regards,
Artem Bityutskiy

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 490 bytes --]

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v2 4/4] fat: switch to fsinfo_inode
  2012-04-14 12:36             ` Artem Bityutskiy
@ 2012-04-14 13:12               ` OGAWA Hirofumi
  2012-04-14 13:54                 ` Artem Bityutskiy
  2012-05-04 10:13                 ` Artem Bityutskiy
  0 siblings, 2 replies; 18+ messages in thread
From: OGAWA Hirofumi @ 2012-04-14 13:12 UTC (permalink / raw)
  To: dedekind1
  Cc: Andrew Morton, Linux Kernel Maling List, Linux FS Maling List,
	Artem Bityutskiy

Artem Bityutskiy <dedekind1@gmail.com> writes:

> On Sat, 2012-04-14 at 20:51 +0900, OGAWA Hirofumi wrote:
>> Yes, I still worry about order. About ->sync_fs(), you are looking the
>> following?
>
> Hirofumi, you still did not explain why the order matters. If it
> matters, it should be easy to explain.
>
> But I will look at this and think about the ordering, thanks for
> feed-back. But if you could explain why writing out FSINFO before inodes
> is an issue, it'd be very helpful.

Hm, you want to this one?

If fsinfo is inode, the order can be

1) flush inodes
    1a) flush fsinfo inode
    1b) flush normal inodes
2) last iput(normal inodes)
      truncate()
           dirty fsinfo inode
-- 
OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v2 4/4] fat: switch to fsinfo_inode
  2012-04-14 13:12               ` OGAWA Hirofumi
@ 2012-04-14 13:54                 ` Artem Bityutskiy
  2012-05-04 10:13                 ` Artem Bityutskiy
  1 sibling, 0 replies; 18+ messages in thread
From: Artem Bityutskiy @ 2012-04-14 13:54 UTC (permalink / raw)
  To: OGAWA Hirofumi
  Cc: Andrew Morton, Linux Kernel Maling List, Linux FS Maling List,
	Artem Bityutskiy

On Sat, 2012-04-14 at 22:12 +0900, OGAWA Hirofumi wrote:
> Hm, you want to this one?
> 
> If fsinfo is inode, the order can be
> 
> 1) flush inodes
>     1a) flush fsinfo inode
>     1b) flush normal inodes
> 2) last iput(normal inodes)
>       truncate()
>            dirty fsinfo inode

Clear, thanks, I'll look at this.

Artem.


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v2 4/4] fat: switch to fsinfo_inode
  2012-04-14 13:12               ` OGAWA Hirofumi
  2012-04-14 13:54                 ` Artem Bityutskiy
@ 2012-05-04 10:13                 ` Artem Bityutskiy
  1 sibling, 0 replies; 18+ messages in thread
From: Artem Bityutskiy @ 2012-05-04 10:13 UTC (permalink / raw)
  To: OGAWA Hirofumi
  Cc: Andrew Morton, Linux Kernel Maling List, Linux FS Maling List

[-- Attachment #1: Type: text/plain, Size: 603 bytes --]

Hi,

I was having a leave, but now came back to this stuff.

On Sat, 2012-04-14 at 22:12 +0900, OGAWA Hirofumi wrote:
> If fsinfo is inode, the order can be
> 
> 1) flush inodes
>     1a) flush fsinfo inode
>     1b) flush normal inodes
> 2) last iput(normal inodes)
>       truncate()
>            dirty fsinfo inode

3) a bit later fsinfo inode is flushed again.

Sorry, I do not still see where is the issue.

Also, how is it different form the current situation where
you have absolutely the same.

I really still do not see the issue. 

-- 
Best Regards,
Artem Bityutskiy

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 836 bytes --]

^ permalink raw reply	[flat|nested] 18+ messages in thread

end of thread, other threads:[~2012-05-04 10:09 UTC | newest]

Thread overview: 18+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-04-13 14:19 [PATCH v2 v2 0/4] do not use s_dirt in FAT FS Artem Bityutskiy
2012-04-13 14:19 ` [PATCH v2 1/4] fat: introduce special inode for managing the FSINFO block Artem Bityutskiy
2012-04-13 14:19 ` [PATCH v2 2/4] fat: introduce mark_fsinfo_dirty helper Artem Bityutskiy
2012-04-13 14:19 ` [PATCH v2 3/4] fat: mark superblock as dirty less often Artem Bityutskiy
2012-04-14  9:17   ` OGAWA Hirofumi
2012-04-14 10:24     ` Artem Bityutskiy
2012-04-14 10:37       ` OGAWA Hirofumi
2012-04-14 11:08         ` Artem Bityutskiy
2012-04-13 14:19 ` [PATCH v2 4/4] fat: switch to fsinfo_inode Artem Bityutskiy
2012-04-14 10:19   ` OGAWA Hirofumi
2012-04-14 10:29     ` Artem Bityutskiy
2012-04-14 10:36       ` OGAWA Hirofumi
2012-04-14 11:01         ` Artem Bityutskiy
2012-04-14 11:51           ` OGAWA Hirofumi
2012-04-14 12:36             ` Artem Bityutskiy
2012-04-14 13:12               ` OGAWA Hirofumi
2012-04-14 13:54                 ` Artem Bityutskiy
2012-05-04 10:13                 ` Artem Bityutskiy

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).