* [PATCH -mm 0/5] nilfs2 settle matters related to disk format
@ 2009-03-05 16:07 Ryusuke Konishi
  2009-03-05 16:07 ` [PATCH -mm 1/5] nilfs2: super block operations fix endian bug Ryusuke Konishi
                   ` (4 more replies)
  0 siblings, 5 replies; 6+ messages in thread
From: Ryusuke Konishi @ 2009-03-05 16:07 UTC
  To: Andrew Morton; +Cc: linux-fsdevel, linux-kernel, Ryusuke Konishi

This additional series stabilizes the current disk format.  It will

* fix a recently found endian bug,
* clean up an obsolete, POSIX-noncompliant file,
* apply a simplification that removes an on-disk flag,
* add an on-disk flag required by applications managing checkpoints, and
* introduce a secondary super block to improve reliability.

The first bug fix breaks compatibility once on big-endian machines,
but it is needed to eliminate the architecture dependency.  The remedy
is simply to mount the old partition (created on a big-endian machine)
with a version that has the first patch applied.  The latest release
of the out-of-tree module (e.g. nilfs-2.0.9, available from
http://www.nilfs.org/en/ ) can serve this purpose.  Little-endian
architectures, incidentally, are not affected by the bug.

The remaining patches keep compatibility, though a few of them make
minor changes to the disk format declarations (i.e. nilfs2_fs.h).

Xattr, POSIX ACLs, and atime are not yet supported, but preparations
to keep backward compatibility are included.  I'm reviewing user
feedback to find other problems that may affect compatibility.
Hopefully, this series will help to avoid future confusion.

The corresponding userland package (nilfs-utils) is available from
the git repo on the following site:

  http://www.nilfs.org/git/

Regards,
Ryusuke Konishi
---
 Documentation/filesystems/nilfs2.txt |    2 -
 fs/nilfs2/inode.c                    |   35 +-----
 fs/nilfs2/nilfs.h                    |    7 +-
 fs/nilfs2/recovery.c                 |   13 +--
 fs/nilfs2/segbuf.c                   |   24 +----
 fs/nilfs2/segbuf.h                   |    6 +-
 fs/nilfs2/segment.c                  |  239 ++++------------------------------
 fs/nilfs2/segment.h                  |   18 +--
 fs/nilfs2/sufile.c                   |    8 +-
 fs/nilfs2/super.c                    |  231 ++++++++++++++-------------------
 fs/nilfs2/the_nilfs.c                |  179 ++++++++++++++++++++++----
 fs/nilfs2/the_nilfs.h                |   23 +++-
 include/linux/nilfs2_fs.h            |    9 +-
 13 files changed, 319 insertions(+), 475 deletions(-)




* [PATCH -mm 1/5] nilfs2: super block operations fix endian bug
  2009-03-05 16:07 [PATCH -mm 0/5] nilfs2 settle matters related to disk format Ryusuke Konishi
@ 2009-03-05 16:07 ` Ryusuke Konishi
  2009-03-05 16:07 ` [PATCH -mm 2/5] nilfs2: clean up sketch file Ryusuke Konishi
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 6+ messages in thread
From: Ryusuke Konishi @ 2009-03-05 16:07 UTC
  To: Andrew Morton; +Cc: linux-fsdevel, linux-kernel, Ryusuke Konishi

This adds the missing endian conversion of the checksum field in the
super block.  It fixes a compatibility issue on big-endian machines
that would surface once recovery of the super block is supported.
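
To illustrate why the conversion matters, here is a minimal userspace
sketch, not the kernel code.  It assumes a hypothetical 16-byte super
block image with a 32-bit checksum field at offset 4; all function
names (crc32_le_sketch, store_le32, load_le32, sb_set_sum,
sb_check_sum) are illustrative.  The checksum is stored byte by byte
in little-endian order, so the same image verifies on big- and
little-endian hosts alike.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SB_BYTES   16	/* hypothetical size of the checksummed area */
#define SB_SUM_OFF  4	/* hypothetical offset of the checksum field */

/* Bitwise CRC-32 (reflected polynomial 0xedb88320), seeded by the caller. */
static uint32_t crc32_le_sketch(uint32_t crc, const unsigned char *p,
				size_t len)
{
	while (len--) {
		crc ^= *p++;
		for (int i = 0; i < 8; i++)
			crc = (crc >> 1) ^ ((crc & 1) ? 0xedb88320u : 0);
	}
	return crc;
}

/* Store a 32-bit value in little-endian byte order on any host. */
static void store_le32(unsigned char *dst, uint32_t v)
{
	dst[0] = v & 0xff;
	dst[1] = (v >> 8) & 0xff;
	dst[2] = (v >> 16) & 0xff;
	dst[3] = (v >> 24) & 0xff;
}

/* Load a 32-bit little-endian value on any host. */
static uint32_t load_le32(const unsigned char *src)
{
	return (uint32_t)src[0] | ((uint32_t)src[1] << 8) |
	       ((uint32_t)src[2] << 16) | ((uint32_t)src[3] << 24);
}

/* Zero the checksum field, compute the CRC, and store it as little endian. */
static void sb_set_sum(unsigned char *sb, uint32_t seed)
{
	assert(sb != NULL);
	store_le32(sb + SB_SUM_OFF, 0);
	store_le32(sb + SB_SUM_OFF, crc32_le_sketch(seed, sb, SB_BYTES));
}

/* Return 1 if the stored checksum matches a recomputed one, else 0. */
static int sb_check_sum(const unsigned char *sb, uint32_t seed)
{
	unsigned char tmp[SB_BYTES];
	uint32_t stored;

	assert(sb != NULL);
	stored = load_le32(sb + SB_SUM_OFF);
	memcpy(tmp, sb, SB_BYTES);
	store_le32(tmp + SB_SUM_OFF, 0);
	return crc32_le_sketch(seed, tmp, SB_BYTES) == stored;
}
```

Writing the raw host-order CRC into the field instead, as the code
did before this patch, would yield a different byte sequence on a
big-endian host, and a reader that converts from little endian would
then reject the block.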

Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
---
 fs/nilfs2/super.c |    6 +++---
 1 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/fs/nilfs2/super.c b/fs/nilfs2/super.c
index d0639a6..b7519c3 100644
--- a/fs/nilfs2/super.c
+++ b/fs/nilfs2/super.c
@@ -287,9 +287,9 @@ int nilfs_commit_super(struct nilfs_sb_info *sbi)
 	sbp->s_free_blocks_count = cpu_to_le64(nfreeblocks);
 	sbp->s_wtime = cpu_to_le64(get_seconds());
 	sbp->s_sum = 0;
-	sbp->s_sum = crc32_le(nilfs->ns_crc_seed, (unsigned char *)sbp,
-			      le16_to_cpu(sbp->s_bytes));
-
+	sbp->s_sum = cpu_to_le32(crc32_le(nilfs->ns_crc_seed,
+					  (unsigned char *)sbp,
+					  le16_to_cpu(sbp->s_bytes)));
 	sbi->s_super->s_dirt = 0;
 	return nilfs_sync_super(sbi);
 }
-- 
1.5.6.5



* [PATCH -mm 2/5] nilfs2: clean up sketch file
  2009-03-05 16:07 [PATCH -mm 0/5] nilfs2 settle matters related to disk format Ryusuke Konishi
  2009-03-05 16:07 ` [PATCH -mm 1/5] nilfs2: super block operations fix endian bug Ryusuke Konishi
@ 2009-03-05 16:07 ` Ryusuke Konishi
  2009-03-05 16:07 ` [PATCH -mm 3/5] nilfs2: mark minor flag for checkpoint created by internal operation Ryusuke Konishi
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 6+ messages in thread
From: Ryusuke Konishi @ 2009-03-05 16:07 UTC
  To: Andrew Morton; +Cc: linux-fsdevel, linux-kernel, Ryusuke Konishi

The sketch file is a file for marking checkpoints with user data.  It
was introduced experimentally in the original implementation and is
now obsolete.  The file was handled differently from regular files;
its size was truncated whenever a checkpoint was created.

This stops the special treatment so that the file is handled as a
regular file.  Most users are not affected because mkfs.nilfs2 no
longer creates this file.

Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
---
 Documentation/filesystems/nilfs2.txt |    2 -
 fs/nilfs2/inode.c                    |   35 +-----------------------
 fs/nilfs2/segment.c                  |   49 +---------------------------------
 fs/nilfs2/segment.h                  |    8 -----
 include/linux/nilfs2_fs.h            |    2 -
 5 files changed, 3 insertions(+), 93 deletions(-)

diff --git a/Documentation/filesystems/nilfs2.txt b/Documentation/filesystems/nilfs2.txt
index 3367fc4..55c4300 100644
--- a/Documentation/filesystems/nilfs2.txt
+++ b/Documentation/filesystems/nilfs2.txt
@@ -161,8 +161,6 @@ the following meta data files:
  4) Data address translation file  -- Maps virtual block numbers to usual
     (DAT)                             block numbers.  This file serves to
                                       make on-disk blocks relocatable.
- 5) Sketch file (sketch)           -- Keeps read-only data which can be
-                                      associated with checkpoints (optional)
 
 The following figure shows a typical organization of the logs:
 
diff --git a/fs/nilfs2/inode.c b/fs/nilfs2/inode.c
index b6536bb..a1922b1 100644
--- a/fs/nilfs2/inode.c
+++ b/fs/nilfs2/inode.c
@@ -418,30 +418,6 @@ int nilfs_read_inode_common(struct inode *inode,
 	return 0;
 }
 
-static int nilfs_read_sketch_inode(struct inode *inode)
-{
-	struct nilfs_sb_info *sbi = NILFS_SB(inode->i_sb);
-	int err = 0;
-
-	if (sbi->s_snapshot_cno) {
-		struct the_nilfs *nilfs = sbi->s_nilfs;
-		struct buffer_head *bh_cp;
-		struct nilfs_checkpoint *raw_cp;
-
-		err = nilfs_cpfile_get_checkpoint(
-			nilfs->ns_cpfile, sbi->s_snapshot_cno, 0, &raw_cp,
-			&bh_cp);
-		if (likely(!err)) {
-			if (!nilfs_checkpoint_sketch(raw_cp))
-				inode->i_size = 0;
-			nilfs_cpfile_put_checkpoint(
-				nilfs->ns_cpfile, sbi->s_snapshot_cno, bh_cp);
-		}
-		inode->i_flags |= S_NOCMTIME;
-	}
-	return err;
-}
-
 static int __nilfs_read_inode(struct super_block *sb, unsigned long ino,
 			      struct inode *inode)
 {
@@ -469,11 +445,6 @@ static int __nilfs_read_inode(struct super_block *sb, unsigned long ino,
 		inode->i_op = &nilfs_file_inode_operations;
 		inode->i_fop = &nilfs_file_operations;
 		inode->i_mapping->a_ops = &nilfs_aops;
-		if (unlikely(inode->i_ino == NILFS_SKETCH_INO)) {
-			err = nilfs_read_sketch_inode(inode);
-			if (unlikely(err))
-				goto failed_unmap;
-		}
 	} else if (S_ISDIR(inode->i_mode)) {
 		inode->i_op = &nilfs_dir_inode_operations;
 		inode->i_fop = &nilfs_dir_operations;
@@ -742,8 +713,7 @@ int nilfs_set_file_dirty(struct nilfs_sb_info *sbi, struct inode *inode,
 
 	atomic_add(nr_dirty, &sbi->s_nilfs->ns_ndirtyblks);
 
-	if (test_and_set_bit(NILFS_I_DIRTY, &ii->i_state) ||
-	    unlikely(inode->i_ino == NILFS_SKETCH_INO))
+	if (test_and_set_bit(NILFS_I_DIRTY, &ii->i_state))
 		return 0;
 
 	spin_lock(&sbi->s_inode_lock);
@@ -811,7 +781,6 @@ void nilfs_dirty_inode(struct inode *inode)
 		return;
 	}
 	nilfs_transaction_begin(inode->i_sb, &ti, 0);
-	if (likely(inode->i_ino != NILFS_SKETCH_INO))
-		nilfs_mark_inode_dirty(inode);
+	nilfs_mark_inode_dirty(inode);
 	nilfs_transaction_commit(inode->i_sb); /* never fails */
 }
diff --git a/fs/nilfs2/segment.c b/fs/nilfs2/segment.c
index 9a87410..981c34a 100644
--- a/fs/nilfs2/segment.c
+++ b/fs/nilfs2/segment.c
@@ -67,7 +67,6 @@ enum {
 	NILFS_ST_INIT = 0,
 	NILFS_ST_GC,		/* Collecting dirty blocks for GC */
 	NILFS_ST_FILE,
-	NILFS_ST_SKETCH,
 	NILFS_ST_IFILE,
 	NILFS_ST_CPFILE,
 	NILFS_ST_SUFILE,
@@ -887,8 +886,7 @@ static int nilfs_segctor_fill_in_checkpoint(struct nilfs_sc_info *sci)
 		cpu_to_le64(sci->sc_nblk_inc + sci->sc_nblk_this_inc);
 	raw_cp->cp_create = cpu_to_le64(sci->sc_seg_ctime);
 	raw_cp->cp_cno = cpu_to_le64(nilfs->ns_cno);
-	if (sci->sc_sketch_inode && i_size_read(sci->sc_sketch_inode) > 0)
-		nilfs_checkpoint_set_sketch(raw_cp);
+
 	nilfs_write_inode_common(sbi->s_ifile, &raw_cp->cp_ifile_inode, 1);
 	nilfs_cpfile_put_checkpoint(nilfs->ns_cpfile, nilfs->ns_cno, bh_cp);
 	return 0;
@@ -923,11 +921,6 @@ static void nilfs_segctor_fill_in_file_bmap(struct nilfs_sc_info *sci,
 		nilfs_fill_in_file_bmap(ifile, ii);
 		set_bit(NILFS_I_COLLECTED, &ii->i_state);
 	}
-	if (sci->sc_sketch_inode) {
-		ii = NILFS_I(sci->sc_sketch_inode);
-		if (test_bit(NILFS_I_DIRTY, &ii->i_state))
-			nilfs_fill_in_file_bmap(ifile, ii);
-	}
 }
 
 /*
@@ -1228,26 +1221,6 @@ static int nilfs_segctor_collect_blocks(struct nilfs_sc_info *sci, int mode)
 			sci->sc_stage.scnt = NILFS_ST_DONE;
 			return 0;
 		}
-		sci->sc_stage.scnt++;  /* Fall through */
-	case NILFS_ST_SKETCH:
-		if (mode == SC_LSEG_SR && sci->sc_sketch_inode) {
-			ii = NILFS_I(sci->sc_sketch_inode);
-			if (test_bit(NILFS_I_DIRTY, &ii->i_state)) {
-				sci->sc_sketch_inode->i_ctime.tv_sec
-					= sci->sc_seg_ctime;
-				sci->sc_sketch_inode->i_mtime.tv_sec
-					= sci->sc_seg_ctime;
-				err = nilfs_mark_inode_dirty(
-					sci->sc_sketch_inode);
-				if (unlikely(err))
-					goto break_or_fail;
-			}
-			err = nilfs_segctor_scan_file(sci,
-						      sci->sc_sketch_inode,
-						      &nilfs_sc_file_ops);
-			if (unlikely(err))
-				goto break_or_fail;
-		}
 		sci->sc_stage.scnt++;
 		sci->sc_stage.flags |= NILFS_CF_IFILE_STARTED;
 		/* Fall through */
@@ -2385,13 +2358,6 @@ static int nilfs_segctor_do_construct(struct nilfs_sc_info *sci, int mode)
 
 	} while (sci->sc_stage.scnt != NILFS_ST_DONE);
 
-	/* Clearing sketch data */
-	if (has_sr && sci->sc_sketch_inode) {
-		if (i_size_read(sci->sc_sketch_inode) == 0)
-			clear_bit(NILFS_I_DIRTY,
-				  &NILFS_I(sci->sc_sketch_inode)->i_state);
-		i_size_write(sci->sc_sketch_inode, 0);
-	}
  out:
 	nilfs_segctor_destroy_segment_buffers(sci);
 	nilfs_segctor_check_out_files(sci, sbi);
@@ -2971,11 +2937,6 @@ static int nilfs_segctor_init(struct nilfs_sc_info *sci,
 			      struct nilfs_recovery_info *ri)
 {
 	int err;
-	struct inode *inode = nilfs_iget(sci->sc_super, NILFS_SKETCH_INO);
-
-	sci->sc_sketch_inode = IS_ERR(inode) ? NULL : inode;
-	if (sci->sc_sketch_inode)
-		i_size_write(sci->sc_sketch_inode, 0);
 
 	sci->sc_seq_done = sci->sc_seq_request;
 	if (ri)
@@ -2987,10 +2948,6 @@ static int nilfs_segctor_init(struct nilfs_sc_info *sci,
 		if (ri)
 			list_splice_init(&sci->sc_active_segments,
 					 ri->ri_used_segments.prev);
-		if (sci->sc_sketch_inode) {
-			iput(sci->sc_sketch_inode);
-			sci->sc_sketch_inode = NULL;
-		}
 	}
 	return err;
 }
@@ -3090,10 +3047,6 @@ static void nilfs_segctor_destroy(struct nilfs_sc_info *sci)
 
 	WARN_ON(!list_empty(&sci->sc_segbufs));
 
-	if (sci->sc_sketch_inode) {
-		iput(sci->sc_sketch_inode);
-		sci->sc_sketch_inode = NULL;
-	}
 	down_write(&sbi->s_nilfs->ns_segctor_sem);
 
 	kfree(sci);
diff --git a/fs/nilfs2/segment.h b/fs/nilfs2/segment.h
index 2dd39da..fbd162d 100644
--- a/fs/nilfs2/segment.h
+++ b/fs/nilfs2/segment.h
@@ -108,7 +108,6 @@ struct nilfs_segsum_pointer {
  * @sc_nblk_this_inc: Number of blocks included in the current logical segment
  * @sc_seg_ctime: Creation time
  * @sc_flags: Internal flags
- * @sc_sketch_inode: Inode of the sketch file
  * @sc_state_lock: spinlock for sc_state and so on
  * @sc_state: Segctord state flags
  * @sc_flush_request: inode bitmap of metadata files to be flushed
@@ -158,13 +157,6 @@ struct nilfs_sc_info {
 
 	unsigned long		sc_flags;
 
-	/*
-	 * Pointer to an inode of the sketch.
-	 * This pointer is kept only while it contains data.
-	 * We protect it with a semaphore of the segment constructor.
-	 */
-	struct inode	       *sc_sketch_inode;
-
 	spinlock_t		sc_state_lock;
 	unsigned long		sc_state;
 	unsigned long		sc_flush_request;
diff --git a/include/linux/nilfs2_fs.h b/include/linux/nilfs2_fs.h
index aa93f0e..e9c84aa 100644
--- a/include/linux/nilfs2_fs.h
+++ b/include/linux/nilfs2_fs.h
@@ -494,7 +494,6 @@ nilfs_checkpoint_##name(const struct nilfs_checkpoint *cp)		\
 
 NILFS_CHECKPOINT_FNS(SNAPSHOT, snapshot)
 NILFS_CHECKPOINT_FNS(INVALID, invalid)
-NILFS_CHECKPOINT_FNS(SKETCH, sketch)
 
 /**
  * struct nilfs_cpinfo - checkpoint information
@@ -527,7 +526,6 @@ nilfs_cpinfo_##name(const struct nilfs_cpinfo *cpinfo)			\
 
 NILFS_CPINFO_FNS(SNAPSHOT, snapshot)
 NILFS_CPINFO_FNS(INVALID, invalid)
-NILFS_CPINFO_FNS(SKETCH, sketch)
 
 
 /**
-- 
1.5.6.5



* [PATCH -mm 3/5] nilfs2: mark minor flag for checkpoint created by internal operation
  2009-03-05 16:07 [PATCH -mm 0/5] nilfs2 settle matters related to disk format Ryusuke Konishi
  2009-03-05 16:07 ` [PATCH -mm 1/5] nilfs2: super block operations fix endian bug Ryusuke Konishi
  2009-03-05 16:07 ` [PATCH -mm 2/5] nilfs2: clean up sketch file Ryusuke Konishi
@ 2009-03-05 16:07 ` Ryusuke Konishi
  2009-03-05 16:07 ` [PATCH -mm 4/5] nilfs2: simplify handling of active state of segments Ryusuke Konishi
  2009-03-05 16:07 ` [PATCH -mm 5/5] nilfs2: introduce secondary super block Ryusuke Konishi
  4 siblings, 0 replies; 6+ messages in thread
From: Ryusuke Konishi @ 2009-03-05 16:07 UTC
  To: Andrew Morton; +Cc: linux-fsdevel, linux-kernel, Ryusuke Konishi

Nilfs creates checkpoints even for garbage collection and for metadata
updates such as a checkpoint mode change.  As a result, users often
see checkpoints created only by such internal operations.

This is inconvenient in some situations.  For example, an application
that monitors checkpoints and changes them to snapshots will fall
into an infinite loop because it cannot distinguish internally created
checkpoints from the others.

This patch solves this sort of problem by adding a flag to the
checkpoint for identification.
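
As a rough illustration of how a monitoring application might use the
flag, here is a userspace sketch.  It is not part of the patch: struct
cp_view, cp_test, cp_fill_minor and cp_next_candidate are hypothetical
names, and the bit numbers are assumed to follow the enum order in
nilfs2_fs.h.  The writer side mirrors the decision this patch makes
from NILFS_SC_HAVE_DELTA; the monitor side skips minor checkpoints, so
a checkpoint produced by its own mode change does not trigger another
round.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Flag bit numbers, assumed to mirror the enum order in nilfs2_fs.h. */
enum {
	CP_SNAPSHOT,
	CP_INVALID,
	CP_SKETCH,
	CP_MINOR,
};

/* Hypothetical, simplified in-memory view of a checkpoint entry. */
struct cp_view {
	uint64_t cno;	/* checkpoint number */
	uint32_t flags;	/* bit mask indexed by the enum above */
};

/* Return 1 if the given flag bit is set in the checkpoint, else 0. */
static int cp_test(const struct cp_view *cp, int bit)
{
	return (cp->flags >> bit) & 1;
}

/* Writer side: mark the checkpoint minor unless it carries a delta. */
static void cp_fill_minor(struct cp_view *cp, int have_delta)
{
	if (have_delta)
		cp->flags &= ~(UINT32_C(1) << CP_MINOR);
	else
		cp->flags |= UINT32_C(1) << CP_MINOR;
}

/*
 * Monitor side: return the index of the first plain checkpoint at or
 * after 'start' that is neither invalid, a snapshot, nor minor, or -1
 * if there is none.
 */
static long cp_next_candidate(const struct cp_view *cps, size_t n,
			      size_t start)
{
	assert(cps != NULL || n == 0);
	for (size_t i = start; i < n; i++) {
		if (cp_test(&cps[i], CP_INVALID) ||
		    cp_test(&cps[i], CP_SNAPSHOT) ||
		    cp_test(&cps[i], CP_MINOR))
			continue;
		return (long)i;
	}
	return -1;
}
```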

Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
---
 fs/nilfs2/segment.c       |    9 +++++++++
 fs/nilfs2/segment.h       |    3 +++
 include/linux/nilfs2_fs.h |    3 +++
 3 files changed, 15 insertions(+), 0 deletions(-)

diff --git a/fs/nilfs2/segment.c b/fs/nilfs2/segment.c
index 981c34a..2879704 100644
--- a/fs/nilfs2/segment.c
+++ b/fs/nilfs2/segment.c
@@ -462,6 +462,9 @@ static void nilfs_segctor_begin_finfo(struct nilfs_sc_info *sci,
 	sci->sc_binfo_ptr = sci->sc_finfo_ptr;
 	nilfs_segctor_map_segsum_entry(
 		sci, &sci->sc_binfo_ptr, sizeof(struct nilfs_finfo));
+
+	if (inode->i_sb && !test_bit(NILFS_SC_HAVE_DELTA, &sci->sc_flags))
+		set_bit(NILFS_SC_HAVE_DELTA, &sci->sc_flags);
 	/* skip finfo */
 }
 
@@ -887,6 +890,11 @@ static int nilfs_segctor_fill_in_checkpoint(struct nilfs_sc_info *sci)
 	raw_cp->cp_create = cpu_to_le64(sci->sc_seg_ctime);
 	raw_cp->cp_cno = cpu_to_le64(nilfs->ns_cno);
 
+	if (test_bit(NILFS_SC_HAVE_DELTA, &sci->sc_flags))
+		nilfs_checkpoint_clear_minor(raw_cp);
+	else
+		nilfs_checkpoint_set_minor(raw_cp);
+
 	nilfs_write_inode_common(sbi->s_ifile, &raw_cp->cp_ifile_inode, 1);
 	nilfs_cpfile_put_checkpoint(nilfs->ns_cpfile, nilfs->ns_cno, bh_cp);
 	return 0;
@@ -2091,6 +2099,7 @@ static void nilfs_segctor_complete_write(struct nilfs_sc_info *sci)
 		nilfs_set_last_segment(nilfs, segbuf->sb_pseg_start,
 				       segbuf->sb_sum.seg_seq, nilfs->ns_cno);
 
+		clear_bit(NILFS_SC_HAVE_DELTA, &sci->sc_flags);
 		clear_bit(NILFS_SC_DIRTY, &sci->sc_flags);
 		set_bit(NILFS_SC_SUPER_ROOT, &sci->sc_flags);
 	} else
diff --git a/fs/nilfs2/segment.h b/fs/nilfs2/segment.h
index fbd162d..bb7d417 100644
--- a/fs/nilfs2/segment.h
+++ b/fs/nilfs2/segment.h
@@ -185,6 +185,9 @@ enum {
 	NILFS_SC_SUPER_ROOT,	/* The latest segment has a super root */
 	NILFS_SC_PRIOR_FLUSH,	/* Requesting immediate flush without making a
 				   checkpoint */
+	NILFS_SC_HAVE_DELTA,	/* Next checkpoint will have update of files
+				   other than DAT, cpfile, sufile, or files
+				   moved by GC */
 };
 
 /* sc_state */
diff --git a/include/linux/nilfs2_fs.h b/include/linux/nilfs2_fs.h
index e9c84aa..cbce664 100644
--- a/include/linux/nilfs2_fs.h
+++ b/include/linux/nilfs2_fs.h
@@ -470,6 +470,7 @@ enum {
 	NILFS_CHECKPOINT_SNAPSHOT,
 	NILFS_CHECKPOINT_INVALID,
 	NILFS_CHECKPOINT_SKETCH,
+	NILFS_CHECKPOINT_MINOR,
 };
 
 #define NILFS_CHECKPOINT_FNS(flag, name)				\
@@ -494,6 +495,7 @@ nilfs_checkpoint_##name(const struct nilfs_checkpoint *cp)		\
 
 NILFS_CHECKPOINT_FNS(SNAPSHOT, snapshot)
 NILFS_CHECKPOINT_FNS(INVALID, invalid)
+NILFS_CHECKPOINT_FNS(MINOR, minor)
 
 /**
  * struct nilfs_cpinfo - checkpoint information
@@ -526,6 +528,7 @@ nilfs_cpinfo_##name(const struct nilfs_cpinfo *cpinfo)			\
 
 NILFS_CPINFO_FNS(SNAPSHOT, snapshot)
 NILFS_CPINFO_FNS(INVALID, invalid)
+NILFS_CPINFO_FNS(MINOR, minor)
 
 
 /**
-- 
1.5.6.5



* [PATCH -mm 4/5] nilfs2: simplify handling of active state of segments
  2009-03-05 16:07 [PATCH -mm 0/5] nilfs2 settle matters related to disk format Ryusuke Konishi
                   ` (2 preceding siblings ...)
  2009-03-05 16:07 ` [PATCH -mm 3/5] nilfs2: mark minor flag for checkpoint created by internal operation Ryusuke Konishi
@ 2009-03-05 16:07 ` Ryusuke Konishi
  2009-03-05 16:07 ` [PATCH -mm 5/5] nilfs2: introduce secondary super block Ryusuke Konishi
  4 siblings, 0 replies; 6+ messages in thread
From: Ryusuke Konishi @ 2009-03-05 16:07 UTC
  To: Andrew Morton; +Cc: linux-fsdevel, linux-kernel, Ryusuke Konishi

This will remove a number of lines from the segment constructor.
Previously, the state was controlled in a complicated way through a
list of segments in order to keep the metadata on segment usage
consistent.  Instead, this presents ``calculated'' active flags to
the userland cleaner program and stops maintaining the real flag on
disk.

With this fake flag alone, the cleaner cannot know exactly whether
each segment is reclaimable.  However, the recent extension of the
nilfs_sustat ioctl struct
(nilfs2-extend-nilfs_sustat-ioctl-struct.patch) can prevent the
cleaner from wrongly reclaiming an in-use segment.

So this simplification can now be applied.
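
The idea of a ``calculated'' flag can be sketched in userspace C as
follows.  This is an illustration under stated assumptions, not the
code in this patch: struct fs_state, SU_ACTIVE, segment_is_active and
reported_su_flags are hypothetical names, and the rule that a segment
is active when it is the one being written (ns_segnum) or the one
reserved next (ns_nextnum) is an assumption drawn from how those
fields are used in the segment constructor.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical subset of the in-memory filesystem state. */
struct fs_state {
	uint64_t ns_segnum;	/* segment currently being written */
	uint64_t ns_nextnum;	/* segment reserved to be written next */
};

#define SU_ACTIVE (UINT32_C(1) << 0)	/* hypothetical bit position */

/* Assumption: a segment is active iff the log writer is using it. */
static int segment_is_active(const struct fs_state *fs, uint64_t segnum)
{
	return segnum == fs->ns_segnum || segnum == fs->ns_nextnum;
}

/*
 * Build the flags reported to userland: ignore whatever active bit is
 * stored on disk and substitute the calculated one.
 */
static uint32_t reported_su_flags(const struct fs_state *fs,
				  uint64_t segnum, uint32_t ondisk_flags)
{
	uint32_t flags;

	assert(fs != NULL);
	flags = ondisk_flags & ~SU_ACTIVE;
	if (segment_is_active(fs, segnum))
		flags |= SU_ACTIVE;
	return flags;
}
```

Because the flag is derived from in-memory state at query time, there
is no on-disk bit to set, clear, or roll back when a construction
fails.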

Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
---
 fs/nilfs2/recovery.c  |   12 +---
 fs/nilfs2/segbuf.c    |   24 +-------
 fs/nilfs2/segbuf.h    |    6 +-
 fs/nilfs2/segment.c   |  173 +++---------------------------------------------
 fs/nilfs2/segment.h   |    5 +-
 fs/nilfs2/sufile.c    |    8 ++-
 fs/nilfs2/super.c     |    4 +-
 fs/nilfs2/the_nilfs.h |    5 ++
 8 files changed, 29 insertions(+), 208 deletions(-)

diff --git a/fs/nilfs2/recovery.c b/fs/nilfs2/recovery.c
index ef387b1..6ab4c8f 100644
--- a/fs/nilfs2/recovery.c
+++ b/fs/nilfs2/recovery.c
@@ -463,16 +463,6 @@ static int nilfs_prepare_segment_for_recovery(struct the_nilfs *nilfs,
 		nilfs_free_segment_entry(ent);
 	}
 
-	/*
-	 * The segment having the latest super root is active, and
-	 * should be deactivated on the next construction for recovery.
-	 */
-	err = -ENOMEM;
-	ent = nilfs_alloc_segment_entry(segnum[0]);
-	if (unlikely(!ent))
-		goto failed;
-	list_add_tail(&ent->list, &ri->ri_used_segments);
-
 	/* Allocate new segments for recovery */
 	err = nilfs_sufile_alloc(sufile, &segnum[0]);
 	if (unlikely(err))
@@ -757,7 +747,7 @@ int nilfs_recover_logical_segments(struct the_nilfs *nilfs,
 			goto failed;
 		}
 
-		err = nilfs_attach_segment_constructor(sbi, ri);
+		err = nilfs_attach_segment_constructor(sbi);
 		if (unlikely(err))
 			goto failed;
 
diff --git a/fs/nilfs2/segbuf.c b/fs/nilfs2/segbuf.c
index 3d3ea83..1e68821 100644
--- a/fs/nilfs2/segbuf.c
+++ b/fs/nilfs2/segbuf.c
@@ -64,27 +64,17 @@ struct nilfs_segment_buffer *nilfs_segbuf_new(struct super_block *sb)
 	INIT_LIST_HEAD(&segbuf->sb_list);
 	INIT_LIST_HEAD(&segbuf->sb_segsum_buffers);
 	INIT_LIST_HEAD(&segbuf->sb_payload_buffers);
-	segbuf->sb_segent = NULL;
 	return segbuf;
 }
 
 void nilfs_segbuf_free(struct nilfs_segment_buffer *segbuf)
 {
-	struct nilfs_segment_entry *ent = segbuf->sb_segent;
-
-	if (ent != NULL && list_empty(&ent->list)) {
-		/* free isolated segment list head */
-		nilfs_free_segment_entry(segbuf->sb_segent);
-		segbuf->sb_segent = NULL;
-	}
 	kmem_cache_free(nilfs_segbuf_cachep, segbuf);
 }
 
-int nilfs_segbuf_map(struct nilfs_segment_buffer *segbuf, __u64 segnum,
+void nilfs_segbuf_map(struct nilfs_segment_buffer *segbuf, __u64 segnum,
 		     unsigned long offset, struct the_nilfs *nilfs)
 {
-	struct nilfs_segment_entry *ent;
-
 	segbuf->sb_segnum = segnum;
 	nilfs_get_segment_range(nilfs, segnum, &segbuf->sb_fseg_start,
 				&segbuf->sb_fseg_end);
@@ -92,18 +82,6 @@ int nilfs_segbuf_map(struct nilfs_segment_buffer *segbuf, __u64 segnum,
 	segbuf->sb_pseg_start = segbuf->sb_fseg_start + offset;
 	segbuf->sb_rest_blocks =
 		segbuf->sb_fseg_end - segbuf->sb_pseg_start + 1;
-
-	/* Attach a segment list head */
-	ent = segbuf->sb_segent;
-	if (ent == NULL) {
-		segbuf->sb_segent = nilfs_alloc_segment_entry(segnum);
-		if (unlikely(!segbuf->sb_segent))
-			return -ENOMEM;
-	} else {
-		BUG_ON(ent->bh_su || !list_empty(&ent->list));
-		ent->segnum = segnum;
-	}
-	return 0;
 }
 
 void nilfs_segbuf_set_next_segnum(struct nilfs_segment_buffer *segbuf,
diff --git a/fs/nilfs2/segbuf.h b/fs/nilfs2/segbuf.h
index 25f2a5f..0c3076f 100644
--- a/fs/nilfs2/segbuf.h
+++ b/fs/nilfs2/segbuf.h
@@ -68,7 +68,6 @@ struct nilfs_segsum_info {
  * struct nilfs_segment_buffer - Segment buffer
  * @sb_super: back pointer to a superblock struct
  * @sb_list: List head to chain this structure
- * @sb_segent: Pointer for attaching a segment entry
  * @sb_sum: On-memory segment summary
  * @sb_segnum: Index number of the full segment
  * @sb_nextnum: Index number of the next full segment
@@ -83,7 +82,6 @@ struct nilfs_segsum_info {
 struct nilfs_segment_buffer {
 	struct super_block     *sb_super;
 	struct list_head	sb_list;
-	struct nilfs_segment_entry *sb_segent;
 
 	/* Segment information */
 	struct nilfs_segsum_info sb_sum;
@@ -125,8 +123,8 @@ int __init nilfs_init_segbuf_cache(void);
 void nilfs_destroy_segbuf_cache(void);
 struct nilfs_segment_buffer *nilfs_segbuf_new(struct super_block *);
 void nilfs_segbuf_free(struct nilfs_segment_buffer *);
-int nilfs_segbuf_map(struct nilfs_segment_buffer *, __u64, unsigned long,
-		     struct the_nilfs *);
+void nilfs_segbuf_map(struct nilfs_segment_buffer *, __u64, unsigned long,
+		      struct the_nilfs *);
 void nilfs_segbuf_set_next_segnum(struct nilfs_segment_buffer *, __u64,
 				  struct the_nilfs *);
 int nilfs_segbuf_reset(struct nilfs_segment_buffer *, unsigned, time_t);
diff --git a/fs/nilfs2/segment.c b/fs/nilfs2/segment.c
index 2879704..e43558d 100644
--- a/fs/nilfs2/segment.c
+++ b/fs/nilfs2/segment.c
@@ -1304,25 +1304,6 @@ static int nilfs_segctor_collect_blocks(struct nilfs_sc_info *sci, int mode)
 	return err;
 }
 
-static int nilfs_segctor_terminate_segment(struct nilfs_sc_info *sci,
-					   struct nilfs_segment_buffer *segbuf,
-					   struct inode *sufile)
-{
-	struct nilfs_segment_entry *ent = segbuf->sb_segent;
-	int err;
-
-	err = nilfs_open_segment_entry(ent, sufile);
-	if (unlikely(err))
-		return err;
-	nilfs_mdt_mark_buffer_dirty(ent->bh_su);
-	nilfs_mdt_mark_dirty(sufile);
-	nilfs_close_segment_entry(ent, sufile);
-
-	list_add_tail(&ent->list, &sci->sc_active_segments);
-	segbuf->sb_segent = NULL;
-	return 0;
-}
-
 static int nilfs_touch_segusage(struct inode *sufile, __u64 segnum)
 {
 	struct buffer_head *bh_su;
@@ -1342,7 +1323,6 @@ static int nilfs_segctor_begin_construction(struct nilfs_sc_info *sci,
 					    struct the_nilfs *nilfs)
 {
 	struct nilfs_segment_buffer *segbuf, *n;
-	struct inode *sufile = nilfs->ns_sufile;
 	__u64 nextnum;
 	int err;
 
@@ -1354,28 +1334,22 @@ static int nilfs_segctor_begin_construction(struct nilfs_sc_info *sci,
 	} else
 		segbuf = NILFS_FIRST_SEGBUF(&sci->sc_segbufs);
 
-	err = nilfs_segbuf_map(segbuf, nilfs->ns_segnum,
-			       nilfs->ns_pseg_offset, nilfs);
-	if (unlikely(err))
-		return err;
+	nilfs_segbuf_map(segbuf, nilfs->ns_segnum, nilfs->ns_pseg_offset,
+			 nilfs);
 
 	if (segbuf->sb_rest_blocks < NILFS_PSEG_MIN_BLOCKS) {
-		err = nilfs_segctor_terminate_segment(sci, segbuf, sufile);
-		if (unlikely(err))
-			return err;
-
 		nilfs_shift_to_next_segment(nilfs);
-		err = nilfs_segbuf_map(segbuf, nilfs->ns_segnum, 0, nilfs);
+		nilfs_segbuf_map(segbuf, nilfs->ns_segnum, 0, nilfs);
 	}
 	sci->sc_segbuf_nblocks = segbuf->sb_rest_blocks;
 
-	err = nilfs_touch_segusage(sufile, segbuf->sb_segnum);
+	err = nilfs_touch_segusage(nilfs->ns_sufile, segbuf->sb_segnum);
 	if (unlikely(err))
 		return err;
 
 	if (nilfs->ns_segnum == nilfs->ns_nextnum) {
 		/* Start from the head of a new full segment */
-		err = nilfs_sufile_alloc(sufile, &nextnum);
+		err = nilfs_sufile_alloc(nilfs->ns_sufile, &nextnum);
 		if (unlikely(err))
 			return err;
 	} else
@@ -1390,7 +1364,7 @@ static int nilfs_segctor_begin_construction(struct nilfs_sc_info *sci,
 		list_del_init(&segbuf->sb_list);
 		nilfs_segbuf_free(segbuf);
 	}
-	return err;
+	return 0;
 }
 
 static int nilfs_segctor_extend_segments(struct nilfs_sc_info *sci,
@@ -1421,10 +1395,7 @@ static int nilfs_segctor_extend_segments(struct nilfs_sc_info *sci,
 			goto failed;
 
 		/* map this buffer to region of segment on-disk */
-		err = nilfs_segbuf_map(segbuf, prev->sb_nextnum, 0, nilfs);
-		if (unlikely(err))
-			goto failed_segbuf;
-
+		nilfs_segbuf_map(segbuf, prev->sb_nextnum, 0, nilfs);
 		sci->sc_segbuf_nblocks += segbuf->sb_rest_blocks;
 
 		/* allocate the next next full segment */
@@ -2178,102 +2149,6 @@ static void nilfs_segctor_check_out_files(struct nilfs_sc_info *sci,
 }
 
 /*
- * Nasty routines to manipulate active flags on sufile.
- * These would be removed in a future release.
- */
-static void nilfs_segctor_reactivate_segments(struct nilfs_sc_info *sci,
-					      struct the_nilfs *nilfs)
-{
-	struct nilfs_segment_buffer *segbuf, *last;
-	struct nilfs_segment_entry *ent, *n;
-	struct inode *sufile = nilfs->ns_sufile;
-	struct list_head *head;
-
-	last = NILFS_LAST_SEGBUF(&sci->sc_segbufs);
-	nilfs_for_each_segbuf_before(segbuf, last, &sci->sc_segbufs) {
-		ent = segbuf->sb_segent;
-		if (!ent)
-			break; /* ignore unmapped segments (should check it?)*/
-		nilfs_segment_usage_set_active(ent->raw_su);
-		nilfs_close_segment_entry(ent, sufile);
-	}
-
-	head = &sci->sc_active_segments;
-	list_for_each_entry_safe(ent, n, head, list) {
-		nilfs_segment_usage_set_active(ent->raw_su);
-		nilfs_close_segment_entry(ent, sufile);
-	}
-}
-
-static int nilfs_segctor_deactivate_segments(struct nilfs_sc_info *sci,
-					     struct the_nilfs *nilfs)
-{
-	struct nilfs_segment_buffer *segbuf, *last;
-	struct nilfs_segment_entry *ent;
-	struct inode *sufile = nilfs->ns_sufile;
-	int err;
-
-	last = NILFS_LAST_SEGBUF(&sci->sc_segbufs);
-	nilfs_for_each_segbuf_before(segbuf, last, &sci->sc_segbufs) {
-		/*
-		 * Deactivate ongoing full segments.  The last segment is kept
-		 * active because it is a start point of recovery, and is not
-		 * relocatable until the super block points to a newer
-		 * checkpoint.
-		 */
-		ent = segbuf->sb_segent;
-		if (!ent)
-			break; /* ignore unmapped segments (should check it?)*/
-		err = nilfs_open_segment_entry(ent, sufile);
-		if (unlikely(err))
-			goto failed;
-		nilfs_segment_usage_clear_active(ent->raw_su);
-		BUG_ON(!buffer_dirty(ent->bh_su));
-	}
-
-	list_for_each_entry(ent, &sci->sc_active_segments, list) {
-		err = nilfs_open_segment_entry(ent, sufile);
-		if (unlikely(err))
-			goto failed;
-		nilfs_segment_usage_clear_active(ent->raw_su);
-		WARN_ON(!buffer_dirty(ent->bh_su));
-	}
-	return 0;
-
- failed:
-	nilfs_segctor_reactivate_segments(sci, nilfs);
-	return err;
-}
-
-static void nilfs_segctor_bead_completed_segments(struct nilfs_sc_info *sci)
-{
-	struct nilfs_segment_buffer *segbuf, *last;
-	struct nilfs_segment_entry *ent;
-
-	/* move each segbuf->sb_segent to the list of used active segments */
-	last = NILFS_LAST_SEGBUF(&sci->sc_segbufs);
-	nilfs_for_each_segbuf_before(segbuf, last, &sci->sc_segbufs) {
-		ent = segbuf->sb_segent;
-		if (!ent)
-			break; /* ignore unmapped segments (should check it?)*/
-		list_add_tail(&ent->list, &sci->sc_active_segments);
-		segbuf->sb_segent = NULL;
-	}
-}
-
-static void nilfs_segctor_commit_deactivate_segments(struct nilfs_sc_info *sci,
-						     struct the_nilfs *nilfs)
-{
-	struct nilfs_segment_entry *ent, *n;
-
-	list_for_each_entry_safe(ent, n, &sci->sc_active_segments, list) {
-		list_del(&ent->list);
-		nilfs_close_segment_entry(ent, nilfs->ns_sufile);
-		nilfs_free_segment_entry(ent);
-	}
-}
-
-/*
  * Main procedure of segment constructor
  */
 static int nilfs_segctor_do_construct(struct nilfs_sc_info *sci, int mode)
@@ -2322,11 +2197,6 @@ static int nilfs_segctor_do_construct(struct nilfs_sc_info *sci, int mode)
 		if (unlikely(err))
 			goto failed;
 
-		if (has_sr) {
-			err = nilfs_segctor_deactivate_segments(sci, nilfs);
-			if (unlikely(err))
-				goto failed;
-		}
 		if (sci->sc_stage.flags & NILFS_CF_IFILE_STARTED)
 			nilfs_segctor_fill_in_file_bmap(sci, sbi->s_ifile);
 
@@ -2353,12 +2223,10 @@ static int nilfs_segctor_do_construct(struct nilfs_sc_info *sci, int mode)
 		nilfs_segctor_complete_write(sci);
 
 		/* Commit segments */
-		nilfs_segctor_bead_completed_segments(sci);
 		if (has_sr) {
 			down_write(&nilfs->ns_sem);
 			nilfs_update_last_segment(sbi, 1);
 			up_write(&nilfs->ns_sem);
-			nilfs_segctor_commit_deactivate_segments(sci, nilfs);
 			nilfs_segctor_commit_free_segments(sci);
 			nilfs_segctor_clear_metadata_dirty(sci);
 		}
@@ -2379,8 +2247,6 @@ static int nilfs_segctor_do_construct(struct nilfs_sc_info *sci, int mode)
  failed_to_make_up:
 	if (sci->sc_stage.flags & NILFS_CF_IFILE_STARTED)
 		nilfs_redirty_inodes(&sci->sc_dirty_files);
-	if (has_sr)
-		nilfs_segctor_reactivate_segments(sci, nilfs);
 
  failed:
 	if (nilfs_doing_gc())
@@ -2942,23 +2808,11 @@ static void nilfs_segctor_kill_thread(struct nilfs_sc_info *sci)
 	}
 }
 
-static int nilfs_segctor_init(struct nilfs_sc_info *sci,
-			      struct nilfs_recovery_info *ri)
+static int nilfs_segctor_init(struct nilfs_sc_info *sci)
 {
-	int err;
-
 	sci->sc_seq_done = sci->sc_seq_request;
-	if (ri)
-		list_splice_init(&ri->ri_used_segments,
-				 sci->sc_active_segments.prev);
 
-	err = nilfs_segctor_start_thread(sci);
-	if (err) {
-		if (ri)
-			list_splice_init(&sci->sc_active_segments,
-					 ri->ri_used_segments.prev);
-	}
-	return err;
+	return nilfs_segctor_start_thread(sci);
 }
 
 /*
@@ -2982,7 +2836,6 @@ static struct nilfs_sc_info *nilfs_segctor_new(struct nilfs_sb_info *sbi)
 	INIT_LIST_HEAD(&sci->sc_dirty_files);
 	INIT_LIST_HEAD(&sci->sc_segbufs);
 	INIT_LIST_HEAD(&sci->sc_gc_inodes);
-	INIT_LIST_HEAD(&sci->sc_active_segments);
 	INIT_LIST_HEAD(&sci->sc_cleaning_segments);
 	INIT_LIST_HEAD(&sci->sc_copied_buffers);
 
@@ -3048,8 +2901,6 @@ static void nilfs_segctor_destroy(struct nilfs_sc_info *sci)
 			      "dirty file(s) after the final construction\n");
 		nilfs_dispose_list(sbi, &sci->sc_dirty_files, 1);
 	}
-	if (!list_empty(&sci->sc_active_segments))
-		nilfs_dispose_segment_list(&sci->sc_active_segments);
 
 	if (!list_empty(&sci->sc_cleaning_segments))
 		nilfs_dispose_segment_list(&sci->sc_cleaning_segments);
@@ -3064,7 +2915,6 @@ static void nilfs_segctor_destroy(struct nilfs_sc_info *sci)
 /**
  * nilfs_attach_segment_constructor - attach a segment constructor
  * @sbi: nilfs_sb_info
- * @ri: nilfs_recovery_info
  *
  * nilfs_attach_segment_constructor() allocates a struct nilfs_sc_info,
  * initilizes it, and starts the segment constructor.
@@ -3074,8 +2924,7 @@ static void nilfs_segctor_destroy(struct nilfs_sc_info *sci)
  *
  * %-ENOMEM - Insufficient memory available.
  */
-int nilfs_attach_segment_constructor(struct nilfs_sb_info *sbi,
-				     struct nilfs_recovery_info *ri)
+int nilfs_attach_segment_constructor(struct nilfs_sb_info *sbi)
 {
 	struct the_nilfs *nilfs = sbi->s_nilfs;
 	int err;
@@ -3087,7 +2936,7 @@ int nilfs_attach_segment_constructor(struct nilfs_sb_info *sbi,
 		return -ENOMEM;
 
 	nilfs_attach_writer(nilfs, sbi);
-	err = nilfs_segctor_init(NILFS_SC(sbi), ri);
+	err = nilfs_segctor_init(NILFS_SC(sbi));
 	if (err) {
 		nilfs_detach_writer(nilfs, sbi);
 		kfree(sbi->s_sc_info);
diff --git a/fs/nilfs2/segment.h b/fs/nilfs2/segment.h
index bb7d417..4a64eb8 100644
--- a/fs/nilfs2/segment.h
+++ b/fs/nilfs2/segment.h
@@ -90,7 +90,6 @@ struct nilfs_segsum_pointer {
  * @sc_nblk_inc: Block count of current generation
  * @sc_dirty_files: List of files to be written
  * @sc_gc_inodes: List of GC inodes having blocks to be written
- * @sc_active_segments: List of active segments that were already written out
  * @sc_cleaning_segments: List of segments to be freed through construction
  * @sc_copied_buffers: List of copied buffers (buffer heads) to freeze data
  * @sc_dsync_inode: inode whose data pages are written for a sync operation
@@ -132,7 +131,6 @@ struct nilfs_sc_info {
 
 	struct list_head	sc_dirty_files;
 	struct list_head	sc_gc_inodes;
-	struct list_head	sc_active_segments;
 	struct list_head	sc_cleaning_segments;
 	struct list_head	sc_copied_buffers;
 
@@ -232,8 +230,7 @@ extern int nilfs_segctor_add_segments_to_be_freed(struct nilfs_sc_info *,
 						  __u64 *, size_t);
 extern void nilfs_segctor_clear_segments_to_be_freed(struct nilfs_sc_info *);
 
-extern int nilfs_attach_segment_constructor(struct nilfs_sb_info *,
-					    struct nilfs_recovery_info *);
+extern int nilfs_attach_segment_constructor(struct nilfs_sb_info *);
 extern void nilfs_detach_segment_constructor(struct nilfs_sb_info *);
 
 /* recovery.c */
diff --git a/fs/nilfs2/sufile.c b/fs/nilfs2/sufile.c
index 4cf47e0..c774cf3 100644
--- a/fs/nilfs2/sufile.c
+++ b/fs/nilfs2/sufile.c
@@ -158,7 +158,6 @@ int nilfs_sufile_alloc(struct inode *sufile, __u64 *segnump)
 			if (!nilfs_segment_usage_clean(su))
 				continue;
 			/* found a clean segment */
-			nilfs_segment_usage_set_active(su);
 			nilfs_segment_usage_set_dirty(su);
 			kunmap_atomic(kaddr, KM_USER0);
 
@@ -591,6 +590,7 @@ ssize_t nilfs_sufile_get_suinfo(struct inode *sufile, __u64 segnum,
 	struct buffer_head *su_bh;
 	struct nilfs_segment_usage *su;
 	size_t susz = NILFS_MDT(sufile)->mi_entry_size;
+	struct the_nilfs *nilfs = NILFS_MDT(sufile)->mi_nilfs;
 	void *kaddr;
 	unsigned long nsegs, segusages_per_block;
 	ssize_t n;
@@ -623,7 +623,11 @@ ssize_t nilfs_sufile_get_suinfo(struct inode *sufile, __u64 segnum,
 		for (j = 0; j < n; j++, su = (void *)su + susz) {
 			si[i + j].sui_lastmod = le64_to_cpu(su->su_lastmod);
 			si[i + j].sui_nblocks = le32_to_cpu(su->su_nblocks);
-			si[i + j].sui_flags = le32_to_cpu(su->su_flags);
+			si[i + j].sui_flags = le32_to_cpu(su->su_flags) &
+				~(1UL << NILFS_SEGMENT_USAGE_ACTIVE);
+			if (nilfs_segment_is_active(nilfs, segnum + i + j))
+				si[i + j].sui_flags |=
+					(1UL << NILFS_SEGMENT_USAGE_ACTIVE);
 		}
 		kunmap_atomic(kaddr, KM_USER0);
 		brelse(su_bh);
diff --git a/fs/nilfs2/super.c b/fs/nilfs2/super.c
index b7519c3..ef31e9a 100644
--- a/fs/nilfs2/super.c
+++ b/fs/nilfs2/super.c
@@ -868,7 +868,7 @@ nilfs_fill_super(struct super_block *sb, void *data, int silent,
 	}
 
 	if (!(sb->s_flags & MS_RDONLY)) {
-		err = nilfs_attach_segment_constructor(sbi, NULL);
+		err = nilfs_attach_segment_constructor(sbi);
 		if (err)
 			goto failed_checkpoint;
 	}
@@ -1001,7 +1001,7 @@ static int nilfs_remount(struct super_block *sb, int *flags, char *data)
 		nilfs_clear_opt(sbi, SNAPSHOT);
 		sbi->s_snapshot_cno = 0;
 
-		err = nilfs_attach_segment_constructor(sbi, NULL);
+		err = nilfs_attach_segment_constructor(sbi);
 		if (err)
 			goto rw_remount_failed;
 
diff --git a/fs/nilfs2/the_nilfs.h b/fs/nilfs2/the_nilfs.h
index af566e7..d750e48 100644
--- a/fs/nilfs2/the_nilfs.h
+++ b/fs/nilfs2/the_nilfs.h
@@ -280,4 +280,9 @@ static inline __u64 nilfs_last_cno(struct the_nilfs *nilfs)
 	return cno;
 }
 
+static inline int nilfs_segment_is_active(struct the_nilfs *nilfs, __u64 n)
+{
+	return n == nilfs->ns_segnum || n == nilfs->ns_nextnum;
+}
+
 #endif /* _THE_NILFS_H */
-- 
1.5.6.5
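
For readers skimming the sufile.c hunk above, the following is a minimal userspace sketch of the idea: the "active" bit is no longer trusted from disk but is recomputed from the in-memory write position whenever usage info is reported.  The names struct fs_state, segment_is_active() and reported_flags() are hypothetical stand-ins for this sketch, and the bit position 0 for the active flag is an assumption; only the masking logic mirrors the patch.

```c
#include <stdint.h>

/* Bit position of the "active" flag; 0 is an assumption for this sketch. */
#define SEGMENT_USAGE_ACTIVE 0

/* Hypothetical stand-in for the two the_nilfs fields consulted by the patch. */
struct fs_state {
	uint64_t segnum;	/* segment currently being written */
	uint64_t nextnum;	/* segment reserved to be written next */
};

/* A segment is active if it is the current or the next write target. */
static int segment_is_active(const struct fs_state *fs, uint64_t n)
{
	return n == fs->segnum || n == fs->nextnum;
}

/* Ignore the stored active bit and recompute it from in-memory state. */
static uint32_t reported_flags(const struct fs_state *fs, uint64_t n,
			       uint32_t stored_flags)
{
	uint32_t flags = stored_flags & ~(UINT32_C(1) << SEGMENT_USAGE_ACTIVE);

	if (segment_is_active(fs, n))
		flags |= UINT32_C(1) << SEGMENT_USAGE_ACTIVE;
	return flags;
}
```

With segnum = 7 and nextnum = 8, a stale active bit stored for segment 3 is dropped from the reported flags, while segments 7 and 8 are reported active regardless of what is stored on disk.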



* [PATCH -mm 5/5] nilfs2: introduce secondary super block
  2009-03-05 16:07 [PATCH -mm 0/5] nilfs2 settle matters related to disk format Ryusuke Konishi
                   ` (3 preceding siblings ...)
  2009-03-05 16:07 ` [PATCH -mm 4/5] nilfs2: simplify handling of active state of segments Ryusuke Konishi
@ 2009-03-05 16:07 ` Ryusuke Konishi
  4 siblings, 0 replies; 6+ messages in thread
From: Ryusuke Konishi @ 2009-03-05 16:07 UTC (permalink / raw)
  To: Andrew Morton; +Cc: linux-fsdevel, linux-kernel, Ryusuke Konishi

Earlier versions had no spare super block.  This patch addresses that
weak point by adding a secondary super block in the unused region at
the tail of the partition.

This doesn't break disk format compatibility; older versions simply
ignore the secondary super block, and new versions recreate it if it
doesn't exist.  A partition created by an old mkfs may have no unused
region at the tail; in that case, the secondary super block is not
added.

This patch doesn't add further redundant copies of the super block;
that is left as future work.
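
As a rough illustration of where the secondary super block lands and when it is skipped, here is a userspace sketch (not code from this patch).  The helpers sb2_offset_bytes() and sb2_bad_offset() are hypothetical names; their arithmetic follows NILFS_SB2_OFFSET_BYTES and nilfs_sb2_bad_offset() in the diff below, and the sketch assumes a device of at least 4096 bytes.

```c
#include <stdint.h>

/*
 * Start of the last complete 4 KiB unit on a device of devsize bytes.
 * Assumes devsize >= 4096; smaller values would underflow.
 */
static uint64_t sb2_offset_bytes(uint64_t devsize)
{
	return ((devsize >> 12) - 1) << 12;
}

/*
 * Nonzero when the segment area extends past sb2off, i.e. there is no
 * unused tail region left for a secondary super block.  The block size
 * is 1024 << log_block_size bytes.
 */
static int sb2_bad_offset(uint64_t nsegments, uint32_t blocks_per_segment,
			  uint32_t log_block_size, uint64_t sb2off)
{
	uint64_t segment_area_end =
		(nsegments * blocks_per_segment) << (log_block_size + 10);

	return sb2off < segment_area_end;
}
```

For example, on a 1 GiB device the computed offset is the start of the final 4 KiB unit.  With 4 KiB blocks and 2048 blocks per segment, 128 segments would cover the whole device and leave no room, whereas 127 segments leave the tail free.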

Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
---
 fs/nilfs2/nilfs.h         |    7 +-
 fs/nilfs2/recovery.c      |    1 -
 fs/nilfs2/segment.c       |    8 +-
 fs/nilfs2/segment.h       |    2 -
 fs/nilfs2/super.c         |  229 +++++++++++++++++++--------------------------
 fs/nilfs2/the_nilfs.c     |  179 ++++++++++++++++++++++++++++++-----
 fs/nilfs2/the_nilfs.h     |   18 +++-
 include/linux/nilfs2_fs.h |    4 +
 8 files changed, 273 insertions(+), 175 deletions(-)

diff --git a/fs/nilfs2/nilfs.h b/fs/nilfs2/nilfs.h
index 84e747d..dc85a51 100644
--- a/fs/nilfs2/nilfs.h
+++ b/fs/nilfs2/nilfs.h
@@ -275,13 +275,10 @@ extern void nilfs_error(struct super_block *, const char *, const char *, ...)
 extern void nilfs_warning(struct super_block *, const char *, const char *, ...)
        __attribute__ ((format (printf, 3, 4)));
 extern struct nilfs_super_block *
-nilfs_load_super_block(struct super_block *, struct buffer_head **);
-extern struct nilfs_super_block *
-nilfs_reload_super_block(struct super_block *, struct buffer_head **, int);
+nilfs_read_super_block(struct super_block *, u64, int, struct buffer_head **);
 extern int nilfs_store_magic_and_option(struct super_block *,
 					struct nilfs_super_block *, char *);
-extern void nilfs_update_last_segment(struct nilfs_sb_info *, int);
-extern int nilfs_commit_super(struct nilfs_sb_info *);
+extern int nilfs_commit_super(struct nilfs_sb_info *, int);
 extern int nilfs_attach_checkpoint(struct nilfs_sb_info *, __u64);
 extern void nilfs_detach_checkpoint(struct nilfs_sb_info *);
 
diff --git a/fs/nilfs2/recovery.c b/fs/nilfs2/recovery.c
index 6ab4c8f..6ade096 100644
--- a/fs/nilfs2/recovery.c
+++ b/fs/nilfs2/recovery.c
@@ -870,7 +870,6 @@ int nilfs_search_super_root(struct the_nilfs *nilfs, struct nilfs_sb_info *sbi,
 		if (scan_newer)
 			ri->ri_need_recovery = NILFS_RECOVERY_SR_UPDATED;
 		else {
-			nilfs->ns_prot_seq = ssi.seg_seq;
 			if (nilfs->ns_mount_state & NILFS_VALID_FS)
 				goto super_root_found;
 			scan_newer = 1;
diff --git a/fs/nilfs2/segment.c b/fs/nilfs2/segment.c
index e43558d..fb70ec3 100644
--- a/fs/nilfs2/segment.c
+++ b/fs/nilfs2/segment.c
@@ -2068,7 +2068,8 @@ static void nilfs_segctor_complete_write(struct nilfs_sc_info *sci)
 
 	if (update_sr) {
 		nilfs_set_last_segment(nilfs, segbuf->sb_pseg_start,
-				       segbuf->sb_sum.seg_seq, nilfs->ns_cno);
+				       segbuf->sb_sum.seg_seq, nilfs->ns_cno++);
+		sbi->s_super->s_dirt = 1;
 
 		clear_bit(NILFS_SC_HAVE_DELTA, &sci->sc_flags);
 		clear_bit(NILFS_SC_DIRTY, &sci->sc_flags);
@@ -2224,9 +2225,6 @@ static int nilfs_segctor_do_construct(struct nilfs_sc_info *sci, int mode)
 
 		/* Commit segments */
 		if (has_sr) {
-			down_write(&nilfs->ns_sem);
-			nilfs_update_last_segment(sbi, 1);
-			up_write(&nilfs->ns_sem);
 			nilfs_segctor_commit_free_segments(sci);
 			nilfs_segctor_clear_metadata_dirty(sci);
 		}
@@ -2564,7 +2562,7 @@ static int nilfs_segctor_construct(struct nilfs_sc_info *sci,
 		if (test_bit(NILFS_SC_SUPER_ROOT, &sci->sc_flags) &&
 		    nilfs_discontinued(nilfs)) {
 			down_write(&nilfs->ns_sem);
-			req->sb_err = nilfs_commit_super(sbi);
+			req->sb_err = nilfs_commit_super(sbi, 0);
 			up_write(&nilfs->ns_sem);
 		}
 	}
diff --git a/fs/nilfs2/segment.h b/fs/nilfs2/segment.h
index 4a64eb8..a98fc1e 100644
--- a/fs/nilfs2/segment.h
+++ b/fs/nilfs2/segment.h
@@ -206,8 +206,6 @@ enum {
 					   logical segment with a super root */
 #define NILFS_SC_DEFAULT_SR_FREQ    30  /* Maximum frequency of super root
 					   creation */
-#define NILFS_SC_DEFAULT_SB_FREQ    30  /* Minimum interval of periodical
-					   update of superblock (reserved) */
 
 /*
  * The default threshold amount of data, in block counts.
diff --git a/fs/nilfs2/super.c b/fs/nilfs2/super.c
index ef31e9a..e2ced82 100644
--- a/fs/nilfs2/super.c
+++ b/fs/nilfs2/super.c
@@ -103,8 +103,9 @@ void nilfs_error(struct super_block *sb, const char *function,
 		down_write(&nilfs->ns_sem);
 		if (!(nilfs->ns_mount_state & NILFS_ERROR_FS)) {
 			nilfs->ns_mount_state |= NILFS_ERROR_FS;
-			nilfs->ns_sbp->s_state |= cpu_to_le16(NILFS_ERROR_FS);
-			nilfs_commit_super(sbi);
+			nilfs->ns_sbp[0]->s_state |=
+				cpu_to_le16(NILFS_ERROR_FS);
+			nilfs_commit_super(sbi, 1);
 		}
 		up_write(&nilfs->ns_sem);
 
@@ -208,90 +209,106 @@ static void nilfs_clear_inode(struct inode *inode)
 	nilfs_btnode_cache_clear(&ii->i_btnode_cache);
 }
 
-/**
- * nilfs_update_last_segment - change pointer to the latest segment
- * @sbi: nilfs_sb_info
- * @update_cno: flag whether to update checkpoint number.
- *
- * nilfs_update_last_segment() changes information in the super block
- * after a partial segment is written out successfully. The super
- * block is marked dirty. It will be written out at the next VFS sync
- * operations such as sync_supers() and generic_shutdown_super().
- */
-void nilfs_update_last_segment(struct nilfs_sb_info *sbi, int update_cno)
-{
-	struct the_nilfs *nilfs = sbi->s_nilfs;
-	struct nilfs_super_block *sbp = nilfs->ns_sbp;
-
-	/* nilfs->sem must be locked by the caller. */
-	spin_lock(&nilfs->ns_last_segment_lock);
-	if (update_cno)
-		nilfs->ns_last_cno = nilfs->ns_cno++;
-	sbp->s_last_seq = cpu_to_le64(nilfs->ns_last_seq);
-	sbp->s_last_pseg = cpu_to_le64(nilfs->ns_last_pseg);
-	sbp->s_last_cno = cpu_to_le64(nilfs->ns_last_cno);
-	spin_unlock(&nilfs->ns_last_segment_lock);
-
-	sbi->s_super->s_dirt = 1; /* must be set if delaying the call of
-				     nilfs_commit_super() */
-}
-
-static int nilfs_sync_super(struct nilfs_sb_info *sbi)
+static int nilfs_sync_super(struct nilfs_sb_info *sbi, int dupsb)
 {
 	struct the_nilfs *nilfs = sbi->s_nilfs;
 	int err;
 	int barrier_done = 0;
 
 	if (nilfs_test_opt(sbi, BARRIER)) {
-		set_buffer_ordered(nilfs->ns_sbh);
+		set_buffer_ordered(nilfs->ns_sbh[0]);
 		barrier_done = 1;
 	}
  retry:
-	set_buffer_dirty(nilfs->ns_sbh);
-	err = sync_dirty_buffer(nilfs->ns_sbh);
+	set_buffer_dirty(nilfs->ns_sbh[0]);
+	err = sync_dirty_buffer(nilfs->ns_sbh[0]);
 	if (err == -EOPNOTSUPP && barrier_done) {
 		nilfs_warning(sbi->s_super, __func__,
 			      "barrier-based sync failed. "
 			      "disabling barriers\n");
 		nilfs_clear_opt(sbi, BARRIER);
 		barrier_done = 0;
-		clear_buffer_ordered(nilfs->ns_sbh);
+		clear_buffer_ordered(nilfs->ns_sbh[0]);
 		goto retry;
 	}
-	if (unlikely(err))
+	if (unlikely(err)) {
 		printk(KERN_ERR
 		       "NILFS: unable to write superblock (err=%d)\n", err);
-	else {
+		if (err == -EIO && nilfs->ns_sbh[1]) {
+			nilfs_fall_back_super_block(nilfs);
+			goto retry;
+		}
+	} else {
+		struct nilfs_super_block *sbp = nilfs->ns_sbp[0];
+
+		/*
+		 * The latest segment becomes trailable from the position
+		 * written in superblock.
+		 */
 		clear_nilfs_discontinued(nilfs);
-		spin_lock(&nilfs->ns_last_segment_lock);
-		nilfs->ns_prot_seq = le64_to_cpu(nilfs->ns_sbp->s_last_seq);
-		spin_unlock(&nilfs->ns_last_segment_lock);
+
+		/* update GC protection for recent segments */
+		if (nilfs->ns_sbh[1]) {
+			sbp = NULL;
+			if (dupsb) {
+				set_buffer_dirty(nilfs->ns_sbh[1]);
+				if (!sync_dirty_buffer(nilfs->ns_sbh[1]))
+					sbp = nilfs->ns_sbp[1];
+			}
+		}
+		if (sbp) {
+			spin_lock(&nilfs->ns_last_segment_lock);
+			nilfs->ns_prot_seq = le64_to_cpu(sbp->s_last_seq);
+			spin_unlock(&nilfs->ns_last_segment_lock);
+		}
 	}
 
 	return err;
 }
 
-int nilfs_commit_super(struct nilfs_sb_info *sbi)
+int nilfs_commit_super(struct nilfs_sb_info *sbi, int dupsb)
 {
 	struct the_nilfs *nilfs = sbi->s_nilfs;
-	struct nilfs_super_block *sbp = nilfs->ns_sbp;
+	struct nilfs_super_block **sbp = nilfs->ns_sbp;
 	sector_t nfreeblocks;
+	time_t t;
 	int err;
 
 	/* nilfs->sem must be locked by the caller. */
+	if (sbp[0]->s_magic != NILFS_SUPER_MAGIC) {
+		if (sbp[1] && sbp[1]->s_magic == NILFS_SUPER_MAGIC)
+			nilfs_swap_super_block(nilfs);
+		else {
+			printk(KERN_CRIT "NILFS: superblock broke on dev %s\n",
+			       sbi->s_super->s_id);
+			return -EIO;
+		}
+	}
 	err = nilfs_count_free_blocks(nilfs, &nfreeblocks);
 	if (unlikely(err)) {
 		printk(KERN_ERR "NILFS: failed to count free blocks\n");
 		return err;
 	}
-	sbp->s_free_blocks_count = cpu_to_le64(nfreeblocks);
-	sbp->s_wtime = cpu_to_le64(get_seconds());
-	sbp->s_sum = 0;
-	sbp->s_sum = cpu_to_le32(crc32_le(nilfs->ns_crc_seed,
-					  (unsigned char *)sbp,
-					  le16_to_cpu(sbp->s_bytes)));
+	spin_lock(&nilfs->ns_last_segment_lock);
+	sbp[0]->s_last_seq = cpu_to_le64(nilfs->ns_last_seq);
+	sbp[0]->s_last_pseg = cpu_to_le64(nilfs->ns_last_pseg);
+	sbp[0]->s_last_cno = cpu_to_le64(nilfs->ns_last_cno);
+	spin_unlock(&nilfs->ns_last_segment_lock);
+
+	t = get_seconds();
+	nilfs->ns_sbwtime[0] = t;
+	sbp[0]->s_free_blocks_count = cpu_to_le64(nfreeblocks);
+	sbp[0]->s_wtime = cpu_to_le64(t);
+	sbp[0]->s_sum = 0;
+	sbp[0]->s_sum = cpu_to_le32(crc32_le(nilfs->ns_crc_seed,
+					     (unsigned char *)sbp[0],
+					     nilfs->ns_sbsize));
+	if (dupsb && sbp[1]) {
+		memcpy(sbp[1], sbp[0], nilfs->ns_sbsize);
+		nilfs->ns_sbwtime[1] = t;
+	}
 	sbi->s_super->s_dirt = 0;
-	return nilfs_sync_super(sbi);
+	return nilfs_sync_super(sbi, dupsb);
 }
 
 static void nilfs_put_super(struct super_block *sb)
@@ -303,8 +320,8 @@ static void nilfs_put_super(struct super_block *sb)
 
 	if (!(sb->s_flags & MS_RDONLY)) {
 		down_write(&nilfs->ns_sem);
-		nilfs->ns_sbp->s_state = cpu_to_le16(nilfs->ns_mount_state);
-		nilfs_commit_super(sbi);
+		nilfs->ns_sbp[0]->s_state = cpu_to_le16(nilfs->ns_mount_state);
+		nilfs_commit_super(sbi, 1);
 		up_write(&nilfs->ns_sem);
 	}
 
@@ -330,7 +347,7 @@ static void nilfs_put_super(struct super_block *sb)
  *   2.    down_write(&nilfs->ns_sem)
  *
  * Inside NILFS, locking ns_sem is enough to protect s_dirt and the buffer
- * of the super block (nilfs->ns_sbp).
+ * of the super block (nilfs->ns_sbp[]).
  *
  * In most cases, VFS functions call lock_super() before calling these
  * methods.  So we must be careful not to bring on deadlocks when using
@@ -346,8 +363,19 @@ static void nilfs_write_super(struct super_block *sb)
 	struct the_nilfs *nilfs = sbi->s_nilfs;
 
 	down_write(&nilfs->ns_sem);
-	if (!(sb->s_flags & MS_RDONLY))
-		nilfs_commit_super(sbi);
+	if (!(sb->s_flags & MS_RDONLY)) {
+		struct nilfs_super_block **sbp = nilfs->ns_sbp;
+		u64 t = get_seconds();
+		int dupsb;
+
+		if (!nilfs_discontinued(nilfs) && t >= nilfs->ns_sbwtime[0] &&
+		    t < nilfs->ns_sbwtime[0] + NILFS_SB_FREQ) {
+			up_write(&nilfs->ns_sem);
+			return;
+		}
+		dupsb = sbp[1] && t > nilfs->ns_sbwtime[1] + NILFS_ALTSB_FREQ;
+		nilfs_commit_super(sbi, dupsb);
+	}
 	sb->s_dirt = 0;
 	up_write(&nilfs->ns_sem);
 }
@@ -436,7 +464,7 @@ static int nilfs_mark_recovery_complete(struct nilfs_sb_info *sbi)
 	down_write(&nilfs->ns_sem);
 	if (!(nilfs->ns_mount_state & NILFS_VALID_FS)) {
 		nilfs->ns_mount_state |= NILFS_VALID_FS;
-		err = nilfs_commit_super(sbi);
+		err = nilfs_commit_super(sbi, 1);
 		if (likely(!err))
 			printk(KERN_INFO "NILFS: recovery complete.\n");
 	}
@@ -652,7 +680,7 @@ nilfs_set_default_options(struct nilfs_sb_info *sbi,
 static int nilfs_setup_super(struct nilfs_sb_info *sbi)
 {
 	struct the_nilfs *nilfs = sbi->s_nilfs;
-	struct nilfs_super_block *sbp = nilfs->ns_sbp;
+	struct nilfs_super_block *sbp = nilfs->ns_sbp[0];
 	int max_mnt_count = le16_to_cpu(sbp->s_max_mnt_count);
 	int mnt_count = le16_to_cpu(sbp->s_mnt_count);
 
@@ -674,88 +702,29 @@ static int nilfs_setup_super(struct nilfs_sb_info *sbi)
 	sbp->s_mnt_count = cpu_to_le16(mnt_count + 1);
 	sbp->s_state = cpu_to_le16(le16_to_cpu(sbp->s_state) & ~NILFS_VALID_FS);
 	sbp->s_mtime = cpu_to_le64(get_seconds());
-	return nilfs_commit_super(sbi);
+	return nilfs_commit_super(sbi, 1);
 }
 
-struct nilfs_super_block *
-nilfs_load_super_block(struct super_block *sb, struct buffer_head **pbh)
+struct nilfs_super_block *nilfs_read_super_block(struct super_block *sb,
+						 u64 pos, int blocksize,
+						 struct buffer_head **pbh)
 {
-	int blocksize;
-	unsigned long offset, sb_index;
-
-	/*
-	 * Adjusting block size
-	 * Blocksize will be enlarged when it is smaller than hardware
-	 * sector size.
-	 * Disk format of superblock does not change.
-	 */
-	blocksize = sb_min_blocksize(sb, BLOCK_SIZE);
-	if (!blocksize) {
-		printk(KERN_ERR
-		       "NILFS: unable to set blocksize of superblock\n");
-		return NULL;
-	}
-	sb_index = NILFS_SB_OFFSET_BYTES / blocksize;
-	offset = NILFS_SB_OFFSET_BYTES % blocksize;
+	unsigned long long sb_index = pos;
+	unsigned long offset;
 
+	offset = do_div(sb_index, blocksize);
 	*pbh = sb_bread(sb, sb_index);
-	if (!*pbh) {
-		printk(KERN_ERR "NILFS: unable to read superblock\n");
+	if (!*pbh)
 		return NULL;
-	}
 	return (struct nilfs_super_block *)((char *)(*pbh)->b_data + offset);
 }
 
-struct nilfs_super_block *
-nilfs_reload_super_block(struct super_block *sb, struct buffer_head **pbh,
-			 int blocksize)
-{
-	struct nilfs_super_block *sbp;
-	unsigned long offset, sb_index;
-	int hw_blocksize = bdev_hardsect_size(sb->s_bdev);
-
-	if (blocksize < hw_blocksize) {
-		printk(KERN_ERR
-		       "NILFS: blocksize %d too small for device "
-		       "(sector-size = %d).\n",
-		       blocksize, hw_blocksize);
-		goto failed_sbh;
-	}
-	brelse(*pbh);
-	sb_set_blocksize(sb, blocksize);
-
-	sb_index = NILFS_SB_OFFSET_BYTES / blocksize;
-	offset = NILFS_SB_OFFSET_BYTES % blocksize;
-
-	*pbh = sb_bread(sb, sb_index);
-	if (!*pbh) {
-		printk(KERN_ERR
-		       "NILFS: cannot read superblock on 2nd try.\n");
-		goto failed;
-	}
-
-	sbp = (struct nilfs_super_block *)((char *)(*pbh)->b_data + offset);
-	if (sbp->s_magic != cpu_to_le16(NILFS_SUPER_MAGIC)) {
-		printk(KERN_ERR
-		       "NILFS: !? Magic mismatch on 2nd try.\n");
-		goto failed_sbh;
-	}
-	return sbp;
-
- failed_sbh:
-	brelse(*pbh);
-
- failed:
-	return NULL;
-}
-
 int nilfs_store_magic_and_option(struct super_block *sb,
 				 struct nilfs_super_block *sbp,
 				 char *data)
 {
 	struct nilfs_sb_info *sbi = NILFS_SB(sb);
 
-	/* trying to fill super (1st stage) */
 	sb->s_magic = le16_to_cpu(sbp->s_magic);
 
 	/* FS independent flags */
@@ -763,11 +732,6 @@ int nilfs_store_magic_and_option(struct super_block *sb,
 	sb->s_flags |= MS_NOATIME;
 #endif
 
-	if (sb->s_magic != NILFS_SUPER_MAGIC) {
-		printk("NILFS: Can't find nilfs on dev %s.\n", sb->s_id);
-		return -EINVAL;
-	}
-
 	nilfs_set_default_options(sbi, sbp);
 
 	sbi->s_resuid = le16_to_cpu(sbp->s_def_resuid);
@@ -775,10 +739,7 @@ int nilfs_store_magic_and_option(struct super_block *sb,
 	sbi->s_interval = le32_to_cpu(sbp->s_c_interval);
 	sbi->s_watermark = le32_to_cpu(sbp->s_c_block_max);
 
-	if (!parse_options(data, sb))
-		return -EINVAL;
-
-	return 0;
+	return !parse_options(data, sb) ? -EINVAL : 0 ;
 }
 
 /**
@@ -967,12 +928,12 @@ static int nilfs_remount(struct super_block *sb, int *flags, char *data)
 		 * the RDONLY flag and then mark the partition as valid again.
 		 */
 		down_write(&nilfs->ns_sem);
-		sbp = nilfs->ns_sbp;
+		sbp = nilfs->ns_sbp[0];
 		if (!(sbp->s_state & le16_to_cpu(NILFS_VALID_FS)) &&
 		    (nilfs->ns_mount_state & NILFS_VALID_FS))
 			sbp->s_state = cpu_to_le16(nilfs->ns_mount_state);
 		sbp->s_mtime = cpu_to_le64(get_seconds());
-		nilfs_commit_super(sbi);
+		nilfs_commit_super(sbi, 1);
 		up_write(&nilfs->ns_sem);
 	} else {
 		/*
diff --git a/fs/nilfs2/the_nilfs.c b/fs/nilfs2/the_nilfs.c
index 661ab76..f233ba7 100644
--- a/fs/nilfs2/the_nilfs.c
+++ b/fs/nilfs2/the_nilfs.c
@@ -25,6 +25,7 @@
 #include <linux/slab.h>
 #include <linux/blkdev.h>
 #include <linux/backing-dev.h>
+#include <linux/crc32.h>
 #include "nilfs.h"
 #include "segment.h"
 #include "alloc.h"
@@ -105,7 +106,8 @@ void put_nilfs(struct the_nilfs *nilfs)
 	}
 	if (nilfs_init(nilfs)) {
 		nilfs_destroy_gccache(nilfs);
-		brelse(nilfs->ns_sbh);
+		brelse(nilfs->ns_sbh[0]);
+		brelse(nilfs->ns_sbh[1]);
 	}
 	kfree(nilfs);
 }
@@ -115,6 +117,7 @@ static int nilfs_load_super_root(struct the_nilfs *nilfs,
 {
 	struct buffer_head *bh_sr;
 	struct nilfs_super_root *raw_sr;
+	struct nilfs_super_block **sbp = nilfs->ns_sbp;
 	unsigned dat_entry_size, segment_usage_size, checkpoint_size;
 	unsigned inode_size;
 	int err;
@@ -124,9 +127,9 @@ static int nilfs_load_super_root(struct the_nilfs *nilfs,
 		return err;
 
 	down_read(&nilfs->ns_sem);
-	dat_entry_size = le16_to_cpu(nilfs->ns_sbp->s_dat_entry_size);
-	checkpoint_size = le16_to_cpu(nilfs->ns_sbp->s_checkpoint_size);
-	segment_usage_size = le16_to_cpu(nilfs->ns_sbp->s_segment_usage_size);
+	dat_entry_size = le16_to_cpu(sbp[0]->s_dat_entry_size);
+	checkpoint_size = le16_to_cpu(sbp[0]->s_checkpoint_size);
+	segment_usage_size = le16_to_cpu(sbp[0]->s_segment_usage_size);
 	up_read(&nilfs->ns_sem);
 
 	inode_size = nilfs->ns_inode_size;
@@ -270,11 +273,8 @@ int load_nilfs(struct the_nilfs *nilfs, struct nilfs_sb_info *sbi)
 			nilfs_mdt_destroy(nilfs->ns_dat);
 			goto failed;
 		}
-		if (ri.ri_need_recovery == NILFS_RECOVERY_SR_UPDATED) {
-			down_write(&nilfs->ns_sem);
-			nilfs_update_last_segment(sbi, 0);
-			up_write(&nilfs->ns_sem);
-		}
+		if (ri.ri_need_recovery == NILFS_RECOVERY_SR_UPDATED)
+			sbi->s_super->s_dirt = 1;
 	}
 
 	set_nilfs_loaded(nilfs);
@@ -296,9 +296,8 @@ static unsigned long long nilfs_max_size(unsigned int blkbits)
 	return res;
 }
 
-static int
-nilfs_store_disk_layout(struct the_nilfs *nilfs, struct super_block *sb,
-			struct nilfs_super_block *sbp)
+static int nilfs_store_disk_layout(struct the_nilfs *nilfs,
+				   struct nilfs_super_block *sbp)
 {
 	if (le32_to_cpu(sbp->s_rev_level) != NILFS_CURRENT_REV) {
 		printk(KERN_ERR "NILFS: revision mismatch "
@@ -309,6 +308,10 @@ nilfs_store_disk_layout(struct the_nilfs *nilfs, struct super_block *sb,
 		       NILFS_CURRENT_REV, NILFS_MINOR_REV);
 		return -EINVAL;
 	}
+	nilfs->ns_sbsize = le16_to_cpu(sbp->s_bytes);
+	if (nilfs->ns_sbsize > BLOCK_SIZE)
+		return -EINVAL;
+
 	nilfs->ns_inode_size = le16_to_cpu(sbp->s_inode_size);
 	nilfs->ns_first_ino = le32_to_cpu(sbp->s_first_ino);
 
@@ -330,6 +333,122 @@ nilfs_store_disk_layout(struct the_nilfs *nilfs, struct super_block *sb,
 	return 0;
 }
 
+static int nilfs_valid_sb(struct nilfs_super_block *sbp)
+{
+	static unsigned char sum[4];
+	const int sumoff = offsetof(struct nilfs_super_block, s_sum);
+	size_t bytes;
+	u32 crc;
+
+	if (!sbp || le16_to_cpu(sbp->s_magic) != NILFS_SUPER_MAGIC)
+		return 0;
+	bytes = le16_to_cpu(sbp->s_bytes);
+	if (bytes > BLOCK_SIZE)
+		return 0;
+	crc = crc32_le(le32_to_cpu(sbp->s_crc_seed), (unsigned char *)sbp,
+		       sumoff);
+	crc = crc32_le(crc, sum, 4);
+	crc = crc32_le(crc, (unsigned char *)sbp + sumoff + 4,
+		       bytes - sumoff - 4);
+	return crc == le32_to_cpu(sbp->s_sum);
+}
+
+static int nilfs_sb2_bad_offset(struct nilfs_super_block *sbp, u64 offset)
+{
+	return offset < ((le64_to_cpu(sbp->s_nsegments) *
+			  le32_to_cpu(sbp->s_blocks_per_segment)) <<
+			 (le32_to_cpu(sbp->s_log_block_size) + 10));
+}
+
+static void nilfs_release_super_block(struct the_nilfs *nilfs)
+{
+	int i;
+
+	for (i = 0; i < 2; i++) {
+		if (nilfs->ns_sbp[i]) {
+			brelse(nilfs->ns_sbh[i]);
+			nilfs->ns_sbh[i] = NULL;
+			nilfs->ns_sbp[i] = NULL;
+		}
+	}
+}
+
+void nilfs_fall_back_super_block(struct the_nilfs *nilfs)
+{
+	brelse(nilfs->ns_sbh[0]);
+	nilfs->ns_sbh[0] = nilfs->ns_sbh[1];
+	nilfs->ns_sbp[0] = nilfs->ns_sbp[1];
+	nilfs->ns_sbh[1] = NULL;
+	nilfs->ns_sbp[1] = NULL;
+}
+
+void nilfs_swap_super_block(struct the_nilfs *nilfs)
+{
+	struct buffer_head *tsbh = nilfs->ns_sbh[0];
+	struct nilfs_super_block *tsbp = nilfs->ns_sbp[0];
+
+	nilfs->ns_sbh[0] = nilfs->ns_sbh[1];
+	nilfs->ns_sbp[0] = nilfs->ns_sbp[1];
+	nilfs->ns_sbh[1] = tsbh;
+	nilfs->ns_sbp[1] = tsbp;
+}
+
+static int nilfs_load_super_block(struct the_nilfs *nilfs,
+				  struct super_block *sb, int blocksize,
+				  struct nilfs_super_block **sbpp)
+{
+	struct nilfs_super_block **sbp = nilfs->ns_sbp;
+	struct buffer_head **sbh = nilfs->ns_sbh;
+	u64 sb2off = NILFS_SB2_OFFSET_BYTES(nilfs->ns_bdev->bd_inode->i_size);
+	int valid[2], swp = 0;
+
+	sbp[0] = nilfs_read_super_block(sb, NILFS_SB_OFFSET_BYTES, blocksize,
+					&sbh[0]);
+	sbp[1] = nilfs_read_super_block(sb, sb2off, blocksize, &sbh[1]);
+
+	if (!sbp[0]) {
+		if (!sbp[1]) {
+			printk(KERN_ERR "NILFS: unable to read superblock\n");
+			return -EIO;
+		}
+		printk(KERN_WARNING
+		       "NILFS warning: unable to read primary superblock\n");
+	} else if (!sbp[1])
+		printk(KERN_WARNING
+		       "NILFS warning: unable to read secondary superblock\n");
+
+	valid[0] = nilfs_valid_sb(sbp[0]);
+	valid[1] = nilfs_valid_sb(sbp[1]);
+	swp = valid[1] &&
+		(!valid[0] ||
+		 le64_to_cpu(sbp[1]->s_wtime) > le64_to_cpu(sbp[0]->s_wtime));
+
+	if (valid[swp] && nilfs_sb2_bad_offset(sbp[swp], sb2off)) {
+		brelse(sbh[1]);
+		sbh[1] = NULL;
+		sbp[1] = NULL;
+		swp = 0;
+	}
+	if (!valid[swp]) {
+		nilfs_release_super_block(nilfs);
+		printk(KERN_ERR "NILFS: Can't find nilfs on dev %s.\n",
+		       sb->s_id);
+		return -EINVAL;
+	}
+
+	if (swp) {
+		printk(KERN_WARNING "NILFS warning: broken superblock. "
+		       "using spare superblock.\n");
+		nilfs_swap_super_block(nilfs);
+	}
+
+	nilfs->ns_sbwtime[0] = le64_to_cpu(sbp[0]->s_wtime);
+	nilfs->ns_sbwtime[1] = valid[!swp] ? le64_to_cpu(sbp[1]->s_wtime) : 0;
+	nilfs->ns_prot_seq = le64_to_cpu(sbp[valid[1] & !swp]->s_last_seq);
+	*sbpp = sbp[0];
+	return 0;
+}
+
 /**
  * init_nilfs - initialize a NILFS instance.
  * @nilfs: the_nilfs structure
@@ -352,7 +471,6 @@ nilfs_store_disk_layout(struct the_nilfs *nilfs, struct super_block *sb,
 int init_nilfs(struct the_nilfs *nilfs, struct nilfs_sb_info *sbi, char *data)
 {
 	struct super_block *sb = sbi->s_super;
-	struct buffer_head *sbh;
 	struct nilfs_super_block *sbp;
 	struct backing_dev_info *bdi;
 	int blocksize;
@@ -361,7 +479,7 @@ int init_nilfs(struct the_nilfs *nilfs, struct nilfs_sb_info *sbi, char *data)
 	down_write(&nilfs->ns_sem);
 	if (nilfs_init(nilfs)) {
 		/* Load values from existing the_nilfs */
-		sbp = nilfs->ns_sbp;
+		sbp = nilfs->ns_sbp[0];
 		err = nilfs_store_magic_and_option(sb, sbp, data);
 		if (err)
 			goto out;
@@ -377,36 +495,49 @@ int init_nilfs(struct the_nilfs *nilfs, struct nilfs_sb_info *sbi, char *data)
 		goto out;
 	}
 
-	sbp = nilfs_load_super_block(sb, &sbh);
-	if (!sbp) {
+	blocksize = sb_min_blocksize(sb, BLOCK_SIZE);
+	if (!blocksize) {
+		printk(KERN_ERR "NILFS: unable to set blocksize\n");
 		err = -EINVAL;
 		goto out;
 	}
+	err = nilfs_load_super_block(nilfs, sb, blocksize, &sbp);
+	if (err)
+		goto out;
+
 	err = nilfs_store_magic_and_option(sb, sbp, data);
 	if (err)
 		goto failed_sbh;
 
 	blocksize = BLOCK_SIZE << le32_to_cpu(sbp->s_log_block_size);
 	if (sb->s_blocksize != blocksize) {
-		sbp = nilfs_reload_super_block(sb, &sbh, blocksize);
-		if (!sbp) {
-			err = -EINVAL;
+		int hw_blocksize = bdev_hardsect_size(sb->s_bdev);
+
+		if (blocksize < hw_blocksize) {
+			printk(KERN_ERR
+			       "NILFS: blocksize %d too small for device "
+			       "(sector-size = %d).\n",
+			       blocksize, hw_blocksize);
+			goto failed_sbh;
+		}
+		nilfs_release_super_block(nilfs);
+		sb_set_blocksize(sb, blocksize);
+
+		err = nilfs_load_super_block(nilfs, sb, blocksize, &sbp);
+		if (err)
 			goto out;
 			/* not failed_sbh; sbh is released automatically
 			   when reloading fails. */
-		}
 	}
 	nilfs->ns_blocksize_bits = sb->s_blocksize_bits;
 
-	err = nilfs_store_disk_layout(nilfs, sb, sbp);
+	err = nilfs_store_disk_layout(nilfs, sbp);
 	if (err)
 		goto failed_sbh;
 
 	sb->s_maxbytes = nilfs_max_size(sb->s_blocksize_bits);
 
 	nilfs->ns_mount_state = le16_to_cpu(sbp->s_state);
-	nilfs->ns_sbh = sbh;
-	nilfs->ns_sbp = sbp;
 
 	bdi = nilfs->ns_bdev->bd_inode_backing_dev_info;
 	if (!bdi)
@@ -443,7 +574,7 @@ int init_nilfs(struct the_nilfs *nilfs, struct nilfs_sb_info *sbi, char *data)
 	return err;
 
  failed_sbh:
-	brelse(sbh);
+	nilfs_release_super_block(nilfs);
 	goto out;
 }
 
diff --git a/fs/nilfs2/the_nilfs.h b/fs/nilfs2/the_nilfs.h
index d750e48..30fe587 100644
--- a/fs/nilfs2/the_nilfs.h
+++ b/fs/nilfs2/the_nilfs.h
@@ -49,8 +49,10 @@ enum {
  * @ns_sem: semaphore for shared states
  * @ns_writer_mutex: mutex protecting ns_writer attach/detach
  * @ns_writer_refcount: number of referrers on ns_writer
- * @ns_sbh: buffer head of the on-disk super block
- * @ns_sbp: pointer to the super block data
+ * @ns_sbh: buffer heads of on-disk super blocks
+ * @ns_sbp: pointers to super block data
+ * @ns_sbwtime: previous write time of super blocks
+ * @ns_sbsize: size of valid data in super block
  * @ns_supers: list of nilfs super block structs
  * @ns_seg_seq: segment sequence counter
  * @ns_segnum: index number of the latest full segment.
@@ -101,8 +103,10 @@ struct the_nilfs {
 	 * - protecting s_dirt in the super_block struct
 	 *   (see nilfs_write_super) and the following fields.
 	 */
-	struct buffer_head     *ns_sbh;
-	struct nilfs_super_block *ns_sbp;
+	struct buffer_head     *ns_sbh[2];
+	struct nilfs_super_block *ns_sbp[2];
+	time_t			ns_sbwtime[2];
+	unsigned		ns_sbsize;
 	unsigned		ns_mount_state;
 	struct list_head	ns_supers;
 
@@ -182,6 +186,10 @@ THE_NILFS_FNS(INIT, init)
 THE_NILFS_FNS(LOADED, loaded)
 THE_NILFS_FNS(DISCONTINUED, discontinued)
 
+/* Minimum interval of periodical update of superblocks (in seconds) */
+#define NILFS_SB_FREQ		10
+#define NILFS_ALTSB_FREQ	60  /* spare superblock */
+
 void nilfs_set_last_segment(struct the_nilfs *, sector_t, u64, __u64);
 struct the_nilfs *alloc_nilfs(struct block_device *);
 void put_nilfs(struct the_nilfs *);
@@ -190,6 +198,8 @@ int load_nilfs(struct the_nilfs *, struct nilfs_sb_info *);
 int nilfs_count_free_blocks(struct the_nilfs *, sector_t *);
 int nilfs_checkpoint_is_mounted(struct the_nilfs *, __u64, int);
 int nilfs_near_disk_full(struct the_nilfs *);
+void nilfs_fall_back_super_block(struct the_nilfs *);
+void nilfs_swap_super_block(struct the_nilfs *);
 
 
 static inline void get_nilfs(struct the_nilfs *nilfs)
diff --git a/include/linux/nilfs2_fs.h b/include/linux/nilfs2_fs.h
index cbce664..1275b30 100644
--- a/include/linux/nilfs2_fs.h
+++ b/include/linux/nilfs2_fs.h
@@ -252,6 +252,10 @@ struct nilfs_super_block {
 #define NILFS_MIN_NRSVSEGS	8	/* Minimum number of reserved
 					   segments */
 
+/*
+ * bytes offset of secondary super block
+ */
+#define NILFS_SB2_OFFSET_BYTES(devsize)	((((devsize) >> 12) - 1) << 12)
 
 /*
  * Maximal count of links to a file
-- 
1.5.6.5

