All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 00/18] e2fsprogs patchbomb 7/14, part 2
@ 2014-07-26  0:33 Darrick J. Wong
  2014-07-26  0:33 ` [PATCH 01/18] e2fsck: reserve blocks for root/lost+found directory repair Darrick J. Wong
                   ` (17 more replies)
  0 siblings, 18 replies; 46+ messages in thread
From: Darrick J. Wong @ 2014-07-26  0:33 UTC (permalink / raw)
  To: tytso, darrick.wong; +Cc: linux-ext4

Hi all,

Continuing the patchset that I sent last Friday, here are patches to
fix e2fsck failures with filesystems containing the bigalloc and/or
metadata checksumming features.  Like last week, e2fuzz helped me to
find the failures.

The first patch is the 'reserve & repair root' patch from last week,
unchanged from last week but included so it doesn't get lost in the
noise.  The second patch fixes a merge error in -next.

Patches 3 & 5 fix some fairly common errors when allocating or
deallocating blocks within a cluster.  Perhaps the most controversial
bigalloc patch is #4, which uses the duplicate block handler to
enforce that a logical cluster only maps to a single physical cluster,
and that within a cluster, each logical block maps to the same block
within the physical cluster.

Patches 6-10 are largely unchanged from patches 10-14 in the 5/14
patchbomb, though they include suggestions that Lukáš had at that
time.  These patches largely alter e2fsck's behavior regarding
checksums -- allowing the user to specify strict mode (zap anything
with a bad checksum) or regular mode (fix checksum if it passes other
tests), and corrects some mis-handling by other parts of fsck.

Patches 11-18 also fix errors when dealing with FS objects that fail
checksum tests.  Library flag handling, error reporting, inode
re-checking (for regular mode), directory repair, and extent checksum
repair all required minor corrections.

I've tested these e2fsprogs changes against the -next branch as of
7/25.  As I stated in the part 1 introduction, I use several VMs, each
with 32M-1G ramdisks to test with; the test process is "misc/e2fuzz.sh
-B <fuzz> -s <size>", where fuzz is anything from 2 bytes to "0.1%"
of metadata bytes.  Since the bulk of changes to -next in the past 12
days are my own patches, I've only run the `make check' tests and run
e2fuzz a few thousand times to quick-check for errors.

Next week will be e2fsck fixes for extended attribute and inline data
problems.

Comments and questions are, as always, welcome.

--D
--
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 46+ messages in thread

* [PATCH 01/18] e2fsck: reserve blocks for root/lost+found directory repair
  2014-07-26  0:33 [PATCH 00/18] e2fsprogs patchbomb 7/14, part 2 Darrick J. Wong
@ 2014-07-26  0:33 ` Darrick J. Wong
  2014-07-26 19:47   ` Theodore Ts'o
  2014-07-26  0:33 ` [PATCH 02/18] e2fsck: fix merge error in "clear uninit flag on directory extents" Darrick J. Wong
                   ` (16 subsequent siblings)
  17 siblings, 1 reply; 46+ messages in thread
From: Darrick J. Wong @ 2014-07-26  0:33 UTC (permalink / raw)
  To: tytso, darrick.wong; +Cc: linux-ext4

If we think we're going to need to repair either the root directory or
the lost+found directory, reserve a block at the end of pass 1 to
reduce the likelihood of an e2fsck abort while reconstructing
root/lost+found during pass 3.

If / and/or /lost+found are corrupt and duplicate processing in pass
1b allocates all the free blocks in the FS, fsck aborts with an
unusable FS since pass 3 can't recreate / or /lost+found.  If either
of those directories are missing, an admin can't easily mount the FS
and access the directory tree to move files off the injured FS and
free up space; this in turn prevents subsequent runs of e2fsck from
being able to continue repairs of the FS.

(One could migrate files manually with debugfs without the help of
path names, but it seems easier if users can simply mount the FS and
use regular FS management tools.)

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
 e2fsck/e2fsck.h          |    3 +++
 e2fsck/pass1.c           |   39 +++++++++++++++++++++++++++++++++++++++
 e2fsck/pass3.c           |   23 +++++++++++++++++++++++
 tests/f_holedir/expect.1 |    2 +-
 tests/f_holedir/expect.2 |    2 +-
 5 files changed, 67 insertions(+), 2 deletions(-)


diff --git a/e2fsck/e2fsck.h b/e2fsck/e2fsck.h
index d6d0ba9..76d15c4 100644
--- a/e2fsck/e2fsck.h
+++ b/e2fsck/e2fsck.h
@@ -375,6 +375,9 @@ struct e2fsck_struct {
 	 */
 	void *priv_data;
 	ext2fs_block_bitmap block_metadata_map; /* Metadata blocks */
+
+	/* Reserve blocks for root and l+f re-creation */
+	blk64_t root_repair_block, lnf_repair_block;
 };
 
 /* Used by the region allocation code */
diff --git a/e2fsck/pass1.c b/e2fsck/pass1.c
index 10ffe39..c116e1c 100644
--- a/e2fsck/pass1.c
+++ b/e2fsck/pass1.c
@@ -601,6 +601,42 @@ static errcode_t recheck_bad_inode_checksum(ext2_filsys fs, ext2_ino_t ino,
 	return 0;
 }
 
+static void reserve_block_for_root_repair(e2fsck_t ctx)
+{
+	blk64_t		blk = 0;
+	errcode_t	err;
+	ext2_filsys	fs = ctx->fs;
+
+	ctx->root_repair_block = 0;
+	if (ext2fs_test_inode_bitmap2(ctx->inode_used_map, EXT2_ROOT_INO))
+		return;
+
+	err = ext2fs_new_block2(fs, 0, ctx->block_found_map, &blk);
+	if (err)
+		return;
+	ext2fs_mark_block_bitmap2(ctx->block_found_map, blk);
+	ctx->root_repair_block = blk;
+}
+
+static void reserve_block_for_lnf_repair(e2fsck_t ctx)
+{
+	blk64_t		blk = 0;
+	errcode_t	err;
+	ext2_filsys	fs = ctx->fs;
+	const char	*name = "lost+found";
+	ext2_ino_t	ino;
+
+	ctx->lnf_repair_block = 0;
+	if (!ext2fs_lookup(fs, EXT2_ROOT_INO, name, sizeof(name)-1, 0, &ino))
+		return;
+
+	err = ext2fs_new_block2(fs, 0, ctx->block_found_map, &blk);
+	if (err)
+		return;
+	ext2fs_mark_block_bitmap2(ctx->block_found_map, blk);
+	ctx->lnf_repair_block = blk;
+}
+
 void e2fsck_pass1(e2fsck_t ctx)
 {
 	int	i;
@@ -1357,6 +1393,9 @@ endit:
 	if (inode)
 		ext2fs_free_mem(&inode);
 
+	reserve_block_for_root_repair(ctx);
+	reserve_block_for_lnf_repair(ctx);
+
 	/*
 	 * The l+f inode may have been cleared, so zap it now and
 	 * later passes will recalculate it if necessary
diff --git a/e2fsck/pass3.c b/e2fsck/pass3.c
index 4fc390a..92e71e7 100644
--- a/e2fsck/pass3.c
+++ b/e2fsck/pass3.c
@@ -134,6 +134,17 @@ abort_exit:
 		inode_done_map = 0;
 	}
 
+	if (ctx->lnf_repair_block) {
+		ext2fs_unmark_block_bitmap2(ctx->block_found_map,
+					    ctx->lnf_repair_block);
+		ctx->lnf_repair_block = 0;
+	}
+	if (ctx->root_repair_block) {
+		ext2fs_unmark_block_bitmap2(ctx->block_found_map,
+					    ctx->root_repair_block);
+		ctx->root_repair_block = 0;
+	}
+
 	print_resource_track(ctx, _("Pass 3"), &rtrack, ctx->fs->io);
 }
 
@@ -176,6 +187,11 @@ static void check_root(e2fsck_t ctx)
 	/*
 	 * First, find a free block
 	 */
+	if (ctx->root_repair_block) {
+		blk = ctx->root_repair_block;
+		ctx->root_repair_block = 0;
+		goto skip_new_block;
+	}
 	pctx.errcode = ext2fs_new_block2(fs, 0, ctx->block_found_map, &blk);
 	if (pctx.errcode) {
 		pctx.str = "ext2fs_new_block";
@@ -184,6 +200,7 @@ static void check_root(e2fsck_t ctx)
 		return;
 	}
 	ext2fs_mark_block_bitmap2(ctx->block_found_map, blk);
+skip_new_block:
 	ext2fs_mark_block_bitmap2(fs->block_map, blk);
 	ext2fs_mark_bb_dirty(fs);
 
@@ -425,6 +442,11 @@ unlink:
 	/*
 	 * First, find a free block
 	 */
+	if (ctx->lnf_repair_block) {
+		blk = ctx->lnf_repair_block;
+		ctx->lnf_repair_block = 0;
+		goto skip_new_block;
+	}
 	retval = ext2fs_new_block2(fs, 0, ctx->block_found_map, &blk);
 	if (retval) {
 		pctx.errcode = retval;
@@ -432,6 +454,7 @@ unlink:
 		return 0;
 	}
 	ext2fs_mark_block_bitmap2(ctx->block_found_map, blk);
+skip_new_block:
 	ext2fs_block_alloc_stats2(fs, blk, +1);
 
 	/*
diff --git a/tests/f_holedir/expect.1 b/tests/f_holedir/expect.1
index ad74fa6..e9cf590 100644
--- a/tests/f_holedir/expect.1
+++ b/tests/f_holedir/expect.1
@@ -18,7 +18,7 @@ Directory inode 11 has an unallocated block #6.  Allocate? yes
 Pass 3: Checking directory connectivity
 Pass 4: Checking reference counts
 Pass 5: Checking group summary information
-Block bitmap differences:  -21
+Block bitmap differences:  -10
 Fix? yes
 
 Free blocks count wrong for group #0 (78, counted=79).
diff --git a/tests/f_holedir/expect.2 b/tests/f_holedir/expect.2
index 4c0b4f2..6ab6209 100644
--- a/tests/f_holedir/expect.2
+++ b/tests/f_holedir/expect.2
@@ -3,5 +3,5 @@ Pass 2: Checking directory structure
 Pass 3: Checking directory connectivity
 Pass 4: Checking reference counts
 Pass 5: Checking group summary information
-test_filesys: 11/32 files (0.0% non-contiguous), 21/100 blocks
+test_filesys: 11/32 files (9.1% non-contiguous), 21/100 blocks
 Exit status is 0


^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH 02/18] e2fsck: fix merge error in "clear uninit flag on directory extents"
  2014-07-26  0:33 [PATCH 00/18] e2fsprogs patchbomb 7/14, part 2 Darrick J. Wong
  2014-07-26  0:33 ` [PATCH 01/18] e2fsck: reserve blocks for root/lost+found directory repair Darrick J. Wong
@ 2014-07-26  0:33 ` Darrick J. Wong
  2014-07-26 20:04   ` Theodore Ts'o
  2014-07-26  0:33 ` [PATCH 03/18] e2fsck: perform implied cluster allocations when filling a directory hole Darrick J. Wong
                   ` (15 subsequent siblings)
  17 siblings, 1 reply; 46+ messages in thread
From: Darrick J. Wong @ 2014-07-26  0:33 UTC (permalink / raw)
  To: tytso, darrick.wong; +Cc: linux-ext4

In the original patch (against -next), the hunk to fix uninit dirs was
just prior to the hunk labelled "Corrupt but passes checks?".  The
hunks are ordered this way so that if e2fsck obtains permission to fix
a failed-csum extent (which in turn fixes the checksum), it will not
subsequently ask to (re)fix the checksum.

Due to a merge error the hunk moved to the wrong place, so put it
back.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
 e2fsck/pass1.c |   26 +++++++++++++-------------
 1 file changed, 13 insertions(+), 13 deletions(-)


diff --git a/e2fsck/pass1.c b/e2fsck/pass1.c
index c116e1c..a6552e5 100644
--- a/e2fsck/pass1.c
+++ b/e2fsck/pass1.c
@@ -2009,19 +2009,6 @@ static void scan_extent_node(e2fsck_t ctx, struct problem_context *pctx,
 			  (1 << (21 - ctx->fs->super->s_log_block_size))))
 			problem = PR_1_TOOBIG_DIR;
 
-		/* Corrupt but passes checks?  Ask to fix checksum. */
-		if (try_repairs && failed_csum) {
-			pctx->blk = extent.e_pblk;
-			pctx->blk2 = extent.e_lblk;
-			pctx->num = extent.e_len;
-			problem = 0;
-			if (fix_problem(ctx, PR_1_EXTENT_ONLY_CSUM_INVALID,
-					pctx)) {
-				pb->inode_modified = 1;
-				ext2fs_extent_replace(ehandle, 0, &extent);
-			}
-		}

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH 03/18] e2fsck: perform implied cluster allocations when filling a directory hole
  2014-07-26  0:33 [PATCH 00/18] e2fsprogs patchbomb 7/14, part 2 Darrick J. Wong
  2014-07-26  0:33 ` [PATCH 01/18] e2fsck: reserve blocks for root/lost+found directory repair Darrick J. Wong
  2014-07-26  0:33 ` [PATCH 02/18] e2fsck: fix merge error in "clear uninit flag on directory extents" Darrick J. Wong
@ 2014-07-26  0:33 ` Darrick J. Wong
  2014-07-26 20:08   ` Theodore Ts'o
  2014-07-26  0:34 ` [PATCH 04/18] e2fsck: fix rule-violating lblk->pblk mappings on bigalloc filesystems Darrick J. Wong
                   ` (14 subsequent siblings)
  17 siblings, 1 reply; 46+ messages in thread
From: Darrick J. Wong @ 2014-07-26  0:33 UTC (permalink / raw)
  To: tytso, darrick.wong; +Cc: linux-ext4

If we're filling a directory hole, we need to perform an implied
cluster allocation to satisfy the bigalloc rule of mapping only one
pblk to a logical cluster.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
 e2fsck/pass2.c |   19 ++++++++++++-------
 1 file changed, 12 insertions(+), 7 deletions(-)


diff --git a/e2fsck/pass2.c b/e2fsck/pass2.c
index 0ef9637..e2bed2f 100644
--- a/e2fsck/pass2.c
+++ b/e2fsck/pass2.c
@@ -1577,7 +1577,7 @@ static int allocate_dir_block(e2fsck_t ctx,
 			      struct problem_context *pctx)
 {
 	ext2_filsys fs = ctx->fs;
-	blk64_t			blk;
+	blk64_t			blk = 0;
 	char			*block;
 	struct ext2_inode	inode;
 
@@ -1593,11 +1593,17 @@ static int allocate_dir_block(e2fsck_t ctx,
 	/*
 	 * First, find a free block
 	 */
-	pctx->errcode = ext2fs_new_block2(fs, 0, ctx->block_found_map, &blk);
-	if (pctx->errcode) {
-		pctx->str = "ext2fs_new_block";
-		fix_problem(ctx, PR_2_ALLOC_DIRBOCK, pctx);
-		return 1;
+	e2fsck_read_inode(ctx, db->ino, &inode, "allocate_dir_block");
+	pctx->errcode = ext2fs_map_cluster_block(fs, db->ino, &inode,
+						 db->blockcnt, &blk);
+	if (pctx->errcode || blk == 0) {
+		pctx->errcode = ext2fs_new_block2(fs, 0,
+						  ctx->block_found_map, &blk);
+		if (pctx->errcode) {
+			pctx->str = "ext2fs_new_block";
+			fix_problem(ctx, PR_2_ALLOC_DIRBOCK, pctx);
+			return 1;
+		}
 	}
 	ext2fs_mark_block_bitmap2(ctx->block_found_map, blk);
 	ext2fs_mark_block_bitmap2(fs->block_map, blk);
@@ -1629,7 +1635,6 @@ static int allocate_dir_block(e2fsck_t ctx,
 	/*
 	 * Update the inode block count
 	 */
-	e2fsck_read_inode(ctx, db->ino, &inode, "allocate_dir_block");
 	ext2fs_iblk_add_blocks(fs, &inode, 1);
 	if (inode.i_size < (db->blockcnt+1) * fs->blocksize)
 		inode.i_size = (db->blockcnt+1) * fs->blocksize;


^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH 04/18] e2fsck: fix rule-violating lblk->pblk mappings on bigalloc filesystems
  2014-07-26  0:33 [PATCH 00/18] e2fsprogs patchbomb 7/14, part 2 Darrick J. Wong
                   ` (2 preceding siblings ...)
  2014-07-26  0:33 ` [PATCH 03/18] e2fsck: perform implied cluster allocations when filling a directory hole Darrick J. Wong
@ 2014-07-26  0:34 ` Darrick J. Wong
  2014-07-26  6:02   ` Andreas Dilger
  2014-07-26  0:34 ` [PATCH 05/18] e2fsck: during pass1b delete_file, only free a cluster once Darrick J. Wong
                   ` (13 subsequent siblings)
  17 siblings, 1 reply; 46+ messages in thread
From: Darrick J. Wong @ 2014-07-26  0:34 UTC (permalink / raw)
  To: tytso, darrick.wong; +Cc: linux-ext4

As far as I can tell, logical block mappings on a bigalloc filesystem are
supposed to follow a few constraints:

 * The logical cluster offset must match the physical cluster offset.
 * A logical cluster may not map to multiple physical clusters.

Since the multiply-claimed block recovery code can be used to fix these
problems, teach e2fsck to find these transgressions and fix them.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
 e2fsck/pass1.c              |   52 +++++++++++
 e2fsck/pass1b.c             |   42 ++++++++-
 e2fsck/problem.c            |    5 +
 e2fsck/problem.h            |    3 +
 tests/f_badcluster/expect   |  198 +++++++++++++++++++++++++++++++++++++++++++
 tests/f_badcluster/image.gz |  Bin
 tests/f_badcluster/name     |    2 
 tests/f_badcluster/script   |   25 +++++
 8 files changed, 320 insertions(+), 7 deletions(-)
 create mode 100644 tests/f_badcluster/expect
 create mode 100644 tests/f_badcluster/image.gz
 create mode 100644 tests/f_badcluster/name
 create mode 100644 tests/f_badcluster/script


diff --git a/e2fsck/pass1.c b/e2fsck/pass1.c
index a6552e5..646ef8a 100644
--- a/e2fsck/pass1.c
+++ b/e2fsck/pass1.c
@@ -1945,6 +1945,40 @@ void e2fsck_clear_inode(e2fsck_t ctx, ext2_ino_t ino,
 	e2fsck_write_inode(ctx, ino, inode, source);
 }
 
+/*
+ * Use the multiple-blocks reclamation code to fix alignment problems in
+ * a bigalloc filesystem.  We want a logical cluster to map to *only* one
+ * physical cluster, and we want the block offsets within that cluster to
+ * line up.
+ */
+static int has_unaligned_cluster_map(e2fsck_t ctx,
+				     blk64_t last_pblk, e2_blkcnt_t last_lblk,
+				     blk64_t pblk, blk64_t lblk)
+{
+	blk64_t cluster_mask;
+
+	if (!ctx->fs->cluster_ratio_bits)
+		return 0;
+	cluster_mask = EXT2FS_CLUSTER_MASK(ctx->fs);
+
+	/*
+	 * If the block in the logical cluster doesn't align with the block in
+	 * the physical cluster...
+	 */
+	if ((lblk & cluster_mask) != (pblk & cluster_mask))
+		return 1;
+
+	/*
+	 * If we cross a physical cluster boundary within a logical cluster...
+	 */
+	if (last_pblk && (lblk & cluster_mask) != 0 &&
+	    EXT2FS_B2C(ctx->fs, lblk) == EXT2FS_B2C(ctx->fs, last_lblk) &&
+	    EXT2FS_B2C(ctx->fs, pblk) != EXT2FS_B2C(ctx->fs, last_pblk))
+		return 1;
+
+	return 0;
+}
+
 static void scan_extent_node(e2fsck_t ctx, struct problem_context *pctx,
 			     struct process_block_struct *pb,
 			     blk64_t start_block, blk64_t end_block,
@@ -2249,7 +2283,16 @@ alloc_later:
 				mark_block_used(ctx, blk);
 				pb->num_blocks++;
 			}
-
+			if (has_unaligned_cluster_map(ctx, pb->previous_block,
+						      pb->last_block, blk,
+						      blockcnt)) {
+				pctx->blk = blockcnt;
+				pctx->blk2 = blk;
+				fix_problem(ctx, PR_1_MISALIGNED_CLUSTER, pctx);
+				mark_block_used(ctx, blk);
+				mark_block_used(ctx, blk);
+			}
+			pb->last_block = blockcnt;
 			pb->previous_block = blk;
 
 			if (is_dir) {
@@ -2815,6 +2858,13 @@ static int process_block(ext2_filsys fs,
 		     ((unsigned) blockcnt & EXT2FS_CLUSTER_MASK(ctx->fs)))) {
 		mark_block_used(ctx, blk);
 		p->num_blocks++;
+	} else if (has_unaligned_cluster_map(ctx, p->previous_block,
+					     p->last_block, blk, blockcnt)) {
+		pctx->blk = blockcnt;
+		pctx->blk2 = blk;
+		fix_problem(ctx, PR_1_MISALIGNED_CLUSTER, pctx);
+		mark_block_used(ctx, blk);
+		mark_block_used(ctx, blk);
 	}
 	if (blockcnt >= 0)
 		p->last_block = blockcnt;
diff --git a/e2fsck/pass1b.c b/e2fsck/pass1b.c
index 8d42d10..c0bfa07 100644
--- a/e2fsck/pass1b.c
+++ b/e2fsck/pass1b.c
@@ -261,7 +261,7 @@ struct process_block_struct {
 	e2fsck_t	ctx;
 	ext2_ino_t	ino;
 	int		dup_blocks;
-	blk64_t		cur_cluster;
+	blk64_t		cur_cluster, phys_cluster;
 	blk64_t		last_blk;
 	struct ext2_inode *inode;
 	struct problem_context *pctx;
@@ -317,6 +317,7 @@ static void pass1b(e2fsck_t ctx, char *block_buf)
 		pb.dup_blocks = 0;
 		pb.inode = &inode;
 		pb.cur_cluster = ~0;
+		pb.phys_cluster = ~0;
 		pb.last_blk = 0;
 		pb.pctx->blk = pb.pctx->blk2 = 0;
 
@@ -360,7 +361,7 @@ static int process_pass1b_block(ext2_filsys fs EXT2FS_ATTR((unused)),
 {
 	struct process_block_struct *p;
 	e2fsck_t ctx;
-	blk64_t	lc;
+	blk64_t	lc, pc;
 	problem_t op;
 
 	if (HOLE_BLKADDR(*block_nr))
@@ -368,6 +369,7 @@ static int process_pass1b_block(ext2_filsys fs EXT2FS_ATTR((unused)),
 	p = (struct process_block_struct *) priv_data;
 	ctx = p->ctx;
 	lc = EXT2FS_B2C(fs, blockcnt);
+	pc = EXT2FS_B2C(fs, *block_nr);
 
 	if (!ext2fs_test_block_bitmap2(ctx->block_dup_map, *block_nr))
 		goto finish;
@@ -389,11 +391,19 @@ static int process_pass1b_block(ext2_filsys fs EXT2FS_ATTR((unused)),
 	p->dup_blocks++;
 	ext2fs_mark_inode_bitmap2(inode_dup_map, p->ino);
 
-	if (blockcnt < 0 || lc != p->cur_cluster)
+	/*
+	 * Qualifications for submitting a block for duplicate processing:
+	 * It's an extent/indirect block (and has a negative logical offset);
+	 * we've crossed a logical cluster boundary; or the physical cluster
+	 * suddenly changed, which indicates that blocks in a logical cluster
+	 * are mapped to multiple physical clusters.
+	 */
+	if (blockcnt < 0 || lc != p->cur_cluster || pc != p->phys_cluster)
 		add_dupe(ctx, p->ino, EXT2FS_B2C(fs, *block_nr), p->inode);
 
 finish:
 	p->cur_cluster = lc;
+	p->phys_cluster = pc;
 	return 0;
 }
 
@@ -563,7 +573,11 @@ static void pass1d(e2fsck_t ctx, char *block_buf)
 			pctx.dir = t->dir;
 			fix_problem(ctx, PR_1D_DUP_FILE_LIST, &pctx);
 		}
-		if (file_ok) {
+		/*
+		 * Even if the file shares blocks with itself, we still need to
+		 * clone the blocks.
+		 */
+		if (file_ok && (meta_data ? shared_len+1 : shared_len) != 0) {
 			fix_problem(ctx, PR_1D_DUP_BLOCKS_DEALT, &pctx);
 			continue;
 		}
@@ -706,9 +720,10 @@ struct clone_struct {
 	errcode_t	errcode;
 	blk64_t		dup_cluster;
 	blk64_t		alloc_block;
-	ext2_ino_t	dir;
+	ext2_ino_t	dir, ino;
 	char	*buf;
 	e2fsck_t ctx;
+	struct ext2_inode	*inode;
 };
 
 static int clone_file_block(ext2_filsys fs,
@@ -756,13 +771,26 @@ static int clone_file_block(ext2_filsys fs,
 			decrement_badcount(ctx, *block_nr, p);
 
 		cs->dup_cluster = c;
-
+		/*
+		 * Let's try an implied cluster allocation.  If we get the same
+		 * cluster back, then we need to find a new block; otherwise,
+		 * we're merely fixing the problem of one logical cluster being
+		 * mapped to multiple physical clusters.
+		 */
+		new_block = 0;
+		retval = ext2fs_map_cluster_block(fs, cs->ino, cs->inode,
+						  blockcnt, &new_block);
+		if (retval == 0 && new_block != 0 &&
+		    EXT2FS_B2C(ctx->fs, new_block) !=
+		    EXT2FS_B2C(ctx->fs, *block_nr))
+			goto cluster_alloc_ok;
 		retval = ext2fs_new_block2(fs, 0, ctx->block_found_map,
 					   &new_block);
 		if (retval) {
 			cs->errcode = retval;
 			return BLOCK_ABORT;
 		}
+cluster_alloc_ok:
 		cs->alloc_block = new_block;
 
 	got_block:
@@ -817,6 +845,8 @@ static errcode_t clone_file(e2fsck_t ctx, ext2_ino_t ino,
 	cs.dup_cluster = ~0;
 	cs.alloc_block = 0;
 	cs.ctx = ctx;
+	cs.ino = ino;
+	cs.inode = &dp->inode;
 	retval = ext2fs_get_mem(fs->blocksize, &cs.buf);
 	if (retval)
 		return retval;
diff --git a/e2fsck/problem.c b/e2fsck/problem.c
index 60c02af..4da8ba8 100644
--- a/e2fsck/problem.c
+++ b/e2fsck/problem.c
@@ -1048,6 +1048,11 @@ static struct e2fsck_problem problem_table[] = {
 	  N_("@d @i %i has @x marked uninitialized at @b %c.  "),
 	  PROMPT_FIX, PR_PREEN_OK },
 
+	/* Inode logical block (physical block ) is misaligned. */
+	{ PR_1_MISALIGNED_CLUSTER,
+	  N_("@i %i logical @b %b (physical @b %c) violates cluster allocation rules.\nWill fix in pass 1B.\n"),
+	  PROMPT_NONE, 0 },
+
 	/* Pass 1b errors */
 
 	/* Pass 1B: Rescan for duplicate/bad blocks */
diff --git a/e2fsck/problem.h b/e2fsck/problem.h
index 6cd3d50..80ef4a2 100644
--- a/e2fsck/problem.h
+++ b/e2fsck/problem.h
@@ -609,6 +609,9 @@ struct problem_context {
 /* uninit directory block */
 #define PR_1_UNINIT_DBLOCK		0x010073
 
+/* Inode logical block is misaligned */
+#define PR_1_MISALIGNED_CLUSTER		0x010074
+
 /*
  * Pass 1b errors
  */
diff --git a/tests/f_badcluster/expect b/tests/f_badcluster/expect
new file mode 100644
index 0000000..eb3bcf0
--- /dev/null
+++ b/tests/f_badcluster/expect
@@ -0,0 +1,198 @@
+Pass 1: Checking inodes, blocks, and sizes
+Inode 12 logical block 2 (physical block 1154) violates cluster allocation rules.
+Will fix in pass 1B.
+Inode 12, i_blocks is 32, should be 64.  Fix? yes
+
+Inode 16 logical block 5 (physical block 1173) violates cluster allocation rules.
+Will fix in pass 1B.
+Inode 16, i_size is 3072, should be 6144.  Fix? yes
+
+Inode 16, i_blocks is 32, should be 64.  Fix? yes
+
+Inode 17 logical block 0 (physical block 1186) violates cluster allocation rules.
+Will fix in pass 1B.
+Inode 17 logical block 2 (physical block 1184) violates cluster allocation rules.
+Will fix in pass 1B.
+Inode 17, i_blocks is 32, should be 64.  Fix? yes
+
+Inode 18 logical block 3 (physical block 1201) violates cluster allocation rules.
+Will fix in pass 1B.
+Inode 18, i_blocks is 32, should be 64.  Fix? yes
+
+
+Running additional passes to resolve blocks claimed by more than one inode...
+Pass 1B: Rescanning for multiply-claimed blocks
+Multiply-claimed block(s) in inode 12: 1154
+Multiply-claimed block(s) in inode 13: 1152--1154
+Multiply-claimed block(s) in inode 14: 1648--1650
+Multiply-claimed block(s) in inode 15: 1650
+Multiply-claimed block(s) in inode 16: 1173
+Multiply-claimed block(s) in inode 17: 1186 1185 1184
+Multiply-claimed block(s) in inode 18: 1201
+Pass 1C: Scanning directories for inodes with multiply-claimed blocks
+Pass 1D: Reconciling multiply-claimed blocks
+(There are 7 inodes containing multiply-claimed blocks.)
+
+File /a (inode #12, mod time Tue Jun 17 08:00:50 2014) 
+  has 1 multiply-claimed block(s), shared with 1 file(s):
+	/b (inode #13, mod time Tue Jun 17 08:00:50 2014)
+Clone multiply-claimed blocks? yes
+
+File /b (inode #13, mod time Tue Jun 17 08:00:50 2014) 
+  has 1 multiply-claimed block(s), shared with 1 file(s):
+	/a (inode #12, mod time Tue Jun 17 08:00:50 2014)
+Multiply-claimed blocks already reassigned or cloned.
+
+File /c (inode #14, mod time Tue Jun 17 08:00:50 2014) 
+  has 1 multiply-claimed block(s), shared with 1 file(s):
+	/d (inode #15, mod time Tue Jun 17 08:00:50 2014)
+Clone multiply-claimed blocks? yes
+
+File /d (inode #15, mod time Tue Jun 17 08:00:50 2014) 
+  has 1 multiply-claimed block(s), shared with 1 file(s):
+	/c (inode #14, mod time Tue Jun 17 08:00:50 2014)
+Multiply-claimed blocks already reassigned or cloned.
+
+File /e (inode #16, mod time Tue Jun 17 08:00:50 2014) 
+  has 1 multiply-claimed block(s), shared with 0 file(s):
+Clone multiply-claimed blocks? yes
+
+File /f (inode #17, mod time Tue Jun 17 08:00:50 2014) 
+  has 1 multiply-claimed block(s), shared with 0 file(s):
+Clone multiply-claimed blocks? yes
+
+File /g (inode #18, mod time Tue Jun 17 08:00:50 2014) 
+  has 1 multiply-claimed block(s), shared with 0 file(s):
+Clone multiply-claimed blocks? yes
+
+Pass 2: Checking directory structure
+Pass 3: Checking directory connectivity
+Pass 4: Checking reference counts
+Pass 5: Checking group summary information
+Free blocks count wrong for group #0 (50, counted=47).
+Fix? yes
+
+Free blocks count wrong (800, counted=752).
+Fix? yes
+
+
+test_fs: ***** FILE SYSTEM WAS MODIFIED *****
+test_fs: 18/128 files (22.2% non-contiguous), 1296/2048 blocks
+Pass 1: Checking inodes, blocks, and sizes
+Inode 12, i_blocks is 64, should be 32.  Fix? yes
+
+Inode 16, i_blocks is 64, should be 32.  Fix? yes
+
+Inode 17, i_blocks is 64, should be 32.  Fix? yes
+
+Inode 18, i_blocks is 64, should be 32.  Fix? yes
+
+Pass 2: Checking directory structure
+Pass 3: Checking directory connectivity
+Pass 4: Checking reference counts
+Pass 5: Checking group summary information
+Block bitmap differences:  -(1168--1200)
+Fix? yes
+
+Free blocks count wrong for group #0 (47, counted=50).
+Fix? yes
+
+Free blocks count wrong (752, counted=800).
+Fix? yes
+
+
+test_fs: ***** FILE SYSTEM WAS MODIFIED *****
+test_fs: 18/128 files (5.6% non-contiguous), 1248/2048 blocks
+Pass 1: Checking inodes, blocks, and sizes
+Pass 2: Checking directory structure
+Pass 3: Checking directory connectivity
+Pass 4: Checking reference counts
+Pass 5: Checking group summary information
+test_fs: 18/128 files (5.6% non-contiguous), 1248/2048 blocks
+debugfs:  stat /a
+Inode: 12   Type: regular    Mode:  0644   Flags: 0x80000
+Generation: 1117152157    Version: 0x00000001
+User:     0   Group:     0   Size: 3072
+File ACL: 0    Directory ACL: 0
+Links: 1   Blockcount: 32
+Fragment:  Address: 0    Number: 0    Size: 0
+ctime: 0x539ff5b2 -- Tue Jun 17 08:00:50 2014
+atime: 0x539ff5b2 -- Tue Jun 17 08:00:50 2014
+mtime: 0x539ff5b2 -- Tue Jun 17 08:00:50 2014
+EXTENTS:
+(0-1):1136-1137, (2):1138
+debugfs:  stat /b
+Inode: 13   Type: regular    Mode:  0644   Flags: 0x80000
+Generation: 1117152158    Version: 0x00000001
+User:     0   Group:     0   Size: 3072
+File ACL: 0    Directory ACL: 0
+Links: 1   Blockcount: 32
+Fragment:  Address: 0    Number: 0    Size: 0
+ctime: 0x539ff5b2 -- Tue Jun 17 08:00:50 2014
+atime: 0x539ff5b2 -- Tue Jun 17 08:00:50 2014
+mtime: 0x539ff5b2 -- Tue Jun 17 08:00:50 2014
+EXTENTS:
+(0-2):1152-1154
+debugfs:  stat /c
+Inode: 14   Type: regular    Mode:  0644   Flags: 0x80000
+Generation: 1117152159    Version: 0x00000001
+User:     0   Group:     0   Size: 3072
+File ACL: 0    Directory ACL: 0
+Links: 1   Blockcount: 32
+Fragment:  Address: 0    Number: 0    Size: 0
+ctime: 0x539ff5b2 -- Tue Jun 17 08:00:50 2014
+atime: 0x539ff5b2 -- Tue Jun 17 08:00:50 2014
+mtime: 0x539ff5b2 -- Tue Jun 17 08:00:50 2014
+EXTENTS:
+(0-1):1216-1217, (2):1218
+debugfs:  stat /d
+Inode: 15   Type: regular    Mode:  0644   Flags: 0x0
+Generation: 1117152160    Version: 0x00000001
+User:     0   Group:     0   Size: 3072
+File ACL: 0    Directory ACL: 0
+Links: 1   Blockcount: 32
+Fragment:  Address: 0    Number: 0    Size: 0
+ctime: 0x539ff5b2 -- Tue Jun 17 08:00:50 2014
+atime: 0x539ff5b2 -- Tue Jun 17 08:00:50 2014
+mtime: 0x539ff5b2 -- Tue Jun 17 08:00:50 2014
+BLOCKS:
+(TIND):1650
+TOTAL: 1
+
+debugfs:  stat /e
+Inode: 16   Type: regular    Mode:  0644   Flags: 0x80000
+Generation: 1117152161    Version: 0x00000001
+User:     0   Group:     0   Size: 6144
+File ACL: 0    Directory ACL: 0
+Links: 1   Blockcount: 32
+Fragment:  Address: 0    Number: 0    Size: 0
+ctime: 0x539ff5b2 -- Tue Jun 17 08:00:50 2014
+atime: 0x539ff5b2 -- Tue Jun 17 08:00:50 2014
+mtime: 0x539ff5b2 -- Tue Jun 17 08:00:50 2014
+EXTENTS:
+(0-2):1664-1666, (5):1669
+debugfs:  stat /f
+Inode: 17   Type: regular    Mode:  0644   Flags: 0x80000
+Generation: 1117152162    Version: 0x00000001
+User:     0   Group:     0   Size: 3072
+File ACL: 0    Directory ACL: 0
+Links: 1   Blockcount: 32
+Fragment:  Address: 0    Number: 0    Size: 0
+ctime: 0x539ff5b2 -- Tue Jun 17 08:00:50 2014
+atime: 0x539ff5b2 -- Tue Jun 17 08:00:50 2014
+mtime: 0x539ff5b2 -- Tue Jun 17 08:00:50 2014
+EXTENTS:
+(0):1232, (1):1233, (2):1234
+debugfs:  stat /g
+Inode: 18   Type: regular    Mode:  0644   Flags: 0x80000
+Generation: 1117152163    Version: 0x00000001
+User:     0   Group:     0   Size: 3072
+File ACL: 0    Directory ACL: 0
+Links: 1   Blockcount: 32
+Fragment:  Address: 0    Number: 0    Size: 0
+ctime: 0x539ff5b2 -- Tue Jun 17 08:00:50 2014
+atime: 0x539ff5b2 -- Tue Jun 17 08:00:50 2014
+mtime: 0x539ff5b2 -- Tue Jun 17 08:00:50 2014
+EXTENTS:
+(0-2):1680-1682, (3):1683
+debugfs:  
\ No newline at end of file
diff --git a/tests/f_badcluster/image.gz b/tests/f_badcluster/image.gz
new file mode 100644
index 0000000000000000000000000000000000000000..e02ee1866868361e6e82bfbe29982365b2e3aa55
GIT binary patch
literal 3131
zcmeH`eNfVO9LF)YGArHAl6flRwbV_eOvC0A4VPY_dFBj}kaU;3`An&ypmc4h9VhIf
zG;+!qGBs1@gsG+2gU*y0Y50JcUV;y)D1xAjhu^x}b(h}$+@E~@_&j|7`F=jH&*u%L
z*?=Z=?ARUF%2(nv2aF3y#X5Z+g%u<)LiS(y@}tx>cPz)&G0Qe&A*OK@Raq{8^q_F-
zbB{#>iF|Dz=b)|E490O%W`;sOzn*Dtd$W+q>f!h2aWn3xA)rK!kjYY12Cz2VS^ia?
zJKzn{oZ*j8$vKNhzKW^O#IziXdU8FzD*|<eD@NYnUd~^Sa_N2j2lXf(hj_l!WF(Xz
z^@k=+XFN;5_JiKlfolqV3uj7S$m~OhGOU?ARvy2Z3xiBLmvwf9s?~oQZlIxv;1!zu
zbzM+@IZwGj2G3`h`BuLoys*(veF#j!uuCmX7XnXQP{(9>l8|3~Qb0L3W_@O8Bou2S
z#LwPyeSHU4SGh|_3O@<jVw-ulMAQSCYtw-BmUoczrKImN`3b_5sj-MBHFstycPr9Q
zyDc_LT!%)FD#_s5hLIl5UXnDPD@{u}7ViyT0SH2%u*0u$j#2PSivDR6be36cLR7JI
z>P9Io$LN#}Hy;+P)k<}%cC(5=1%>EC`mJ`K9DhLsMXNnTY7b5%xHS-GMy*X>nRI$s
zn4{EJv>{B*zOB8Z6U4SCeQg`RoRS%(Z&ez2MRsCB5pII*p9!eO5eo6@SX?5Fv~;uk
zMYgdE|JNdavOxx(M+a9JGE#v?A50YtD7>vh#2-d%9rOK)Okcl?dlmE3Udha-PBwlx
zw<*_LcZb-D6dfZP2dfe{<%37d#Ki?z2z!rK7Yx~>r3c$JM1<s%-6TO8VzA^PvIA2p
zr@^=X@pVnNrWr5+)70e+;E{U_fA>jX*E;3D-245!%MWU2YQwru2>?*S{bGVA^+1FB
z8-;3y!|ILos18QJB|+U;YN8OP0+aw@s+)wy9IvNaJ!a&tRX-rM!{|#hNL**Yt+QAN
zJh9_Kysv$_Bh~mb8`5JR&ddJMoa1}#T#i6&vzy$;rqF-rzqsGWd$8<0N-IB0;jPp~
z22J;hV9a6Hf&ScQfGR<EF*<B2wZ4daZ;nx^i0)Iqacq`d{&rb6l&4^4_yVusv+Swi
z1ak*`wtFPmFfS9Bu<1ZooEm+61;F<$9g=?5M=jQYQ<c*bH6}vQ?^dJ9a3$sY-ekEM
zucY8tA{+GF>N<>h`Nd1)Ci8h@bC`iSpmOlA7MwV}^GPJS^lC&^e~*O6GRb2rQ1h+d
zB*q7XIxlAAlVK8>hUhH`Qg4!ZahPpe$wQ2fuN{FV4;STob=J3cl1ux;sphvTO-et5
za=9(yX3Y%vwwp~zE$z|267_p4(cyK{(@S=d<tY<GLiz(@c9y?d_xCfA^QuQ;p;6U-
zgkHj3QM@#;1&o%EeQo}R%l?CeiQ{1XPRZF1iUXvrAI<|wei)IyMDNH*+K}a%dA%B_
zeZmRAs#0yds@p(lDKZT^e7?L|o5?bXMz^W(-3eHtg@A>Cg@A>Cg@A>?|4(4iWQ{Gq
K#Rh>6g8l&!3vo*T

literal 0
HcmV?d00001

diff --git a/tests/f_badcluster/name b/tests/f_badcluster/name
new file mode 100644
index 0000000..266f81c
--- /dev/null
+++ b/tests/f_badcluster/name
@@ -0,0 +1,2 @@
+test alignment problems with bigalloc clusters
+
diff --git a/tests/f_badcluster/script b/tests/f_badcluster/script
new file mode 100644
index 0000000..ba6b248
--- /dev/null
+++ b/tests/f_badcluster/script
@@ -0,0 +1,25 @@
+if test -x $DEBUGFS_EXE; then
+	IMAGE=$test_dir/../f_badcluster/image.gz
+	OUT=$test_name.log
+	EXP=$test_dir/expect
+	gzip -d < $IMAGE > $TMPFILE
+	../misc/tune2fs -L test_fs $TMPFILE
+	../e2fsck/e2fsck -fy $TMPFILE > $OUT
+	../e2fsck/e2fsck -fy $TMPFILE >> $OUT
+	../e2fsck/e2fsck -fy $TMPFILE >> $OUT
+	for i in a b c d e f g; do echo "stat /$i"; done | $DEBUGFS_EXE $TMPFILE >> $OUT
+
+	cmp -s $OUT $EXP
+	status=$?
+
+	if [ "$status" = 0 ]; then
+		echo "$test_name: $test_description: ok"
+		touch $test_name.ok
+	else
+		echo "$test_name: $test_description: failed"
+		diff $DIFF_OPTS $EXP $OUT > $test_name.failed
+		rm -f $test_name.tmp
+	fi
+else
+	echo "$test_name: skipped"
+fi


^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH 05/18] e2fsck: during pass1b delete_file, only free a cluster once
  2014-07-26  0:33 [PATCH 00/18] e2fsprogs patchbomb 7/14, part 2 Darrick J. Wong
                   ` (3 preceding siblings ...)
  2014-07-26  0:34 ` [PATCH 04/18] e2fsck: fix rule-violating lblk->pblk mappings on bigalloc filesystems Darrick J. Wong
@ 2014-07-26  0:34 ` Darrick J. Wong
  2014-07-26 20:30   ` Theodore Ts'o
  2014-07-26  0:34 ` [PATCH 06/18] dumpe2fs: add switch to disable checksum verification Darrick J. Wong
                   ` (12 subsequent siblings)
  17 siblings, 1 reply; 46+ messages in thread
From: Darrick J. Wong @ 2014-07-26  0:34 UTC (permalink / raw)
  To: tytso, darrick.wong; +Cc: linux-ext4

If we're forced to delete a crosslinked file, only call
ext2fs_block_alloc_stats2() on cluster boundaries, since the block
bitmaps are all cluster bitmaps at this point.  It's safe to do this
only once per cluster since we know all the blocks are going away.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
 e2fsck/pass1b.c |    3 ++-
 e2fsck/pass2.c  |    3 ++-
 2 files changed, 4 insertions(+), 2 deletions(-)


diff --git a/e2fsck/pass1b.c b/e2fsck/pass1b.c
index c0bfa07..2d1b448 100644
--- a/e2fsck/pass1b.c
+++ b/e2fsck/pass1b.c
@@ -644,7 +644,8 @@ static int delete_file_block(ext2_filsys fs,
 			    _("internal error: can't find dup_blk for %llu\n"),
 				*block_nr);
 	} else {
-		ext2fs_block_alloc_stats2(fs, *block_nr, -1);
+		if ((*block_nr % EXT2FS_CLUSTER_RATIO(ctx->fs)) == 0)
+			ext2fs_block_alloc_stats2(fs, *block_nr, -1);
 		pb->dup_blocks++;
 	}
 	pb->cur_cluster = lc;
diff --git a/e2fsck/pass2.c b/e2fsck/pass2.c
index e2bed2f..57d5a16 100644
--- a/e2fsck/pass2.c
+++ b/e2fsck/pass2.c
@@ -1336,7 +1336,8 @@ static int deallocate_inode_block(ext2_filsys fs,
 	if ((*block_nr < fs->super->s_first_data_block) ||
 	    (*block_nr >= ext2fs_blocks_count(fs->super)))
 		return 0;
-	ext2fs_block_alloc_stats2(fs, *block_nr, -1);
+	if ((*block_nr % EXT2FS_CLUSTER_RATIO(fs)) == 0)
+		ext2fs_block_alloc_stats2(fs, *block_nr, -1);
 	p->num++;
 	return 0;
 }


^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH 06/18] dumpe2fs: add switch to disable checksum verification
  2014-07-26  0:33 [PATCH 00/18] e2fsprogs patchbomb 7/14, part 2 Darrick J. Wong
                   ` (4 preceding siblings ...)
  2014-07-26  0:34 ` [PATCH 05/18] e2fsck: during pass1b delete_file, only free a cluster once Darrick J. Wong
@ 2014-07-26  0:34 ` Darrick J. Wong
  2014-07-26 20:58   ` Theodore Ts'o
  2014-07-26  0:34 ` [PATCH 07/18] e2fsck: verify checksums after checking everything else Darrick J. Wong
                   ` (11 subsequent siblings)
  17 siblings, 1 reply; 46+ messages in thread
From: Darrick J. Wong @ 2014-07-26  0:34 UTC (permalink / raw)
  To: tytso, darrick.wong; +Cc: linux-ext4

Add a -n switch to turn off checksum verification.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
 misc/dumpe2fs.8.in |    3 +++
 misc/dumpe2fs.c    |   10 +++++++---
 2 files changed, 10 insertions(+), 3 deletions(-)


diff --git a/misc/dumpe2fs.8.in b/misc/dumpe2fs.8.in
index befaf94..51614db 100644
--- a/misc/dumpe2fs.8.in
+++ b/misc/dumpe2fs.8.in
@@ -61,6 +61,9 @@ using
 .I device
 as the pathname to the image file.
 .TP
+.B \-n
+Don't verify checksums when dumping the filesystem.
+.TP
 .B \-x
 print the detailed group information block numbers in hexadecimal format
 .TP
diff --git a/misc/dumpe2fs.c b/misc/dumpe2fs.c
index 0b84ece..3ce7304 100644
--- a/misc/dumpe2fs.c
+++ b/misc/dumpe2fs.c
@@ -52,7 +52,7 @@ static int blocks64 = 0;
 
 static void usage(void)
 {
-	fprintf (stderr, _("Usage: %s [-bfhixV] [-o superblock=<num>] "
+	fprintf(stderr, _("Usage: %s [-bfhinxV] [-o superblock=<num>] "
 		 "[-o blocksize=<num>] device\n"), program_name);
 	exit (1);
 }
@@ -582,7 +582,9 @@ int main (int argc, char ** argv)
 	if (argc && *argv)
 		program_name = *argv;
 
-	while ((c = getopt (argc, argv, "bfhixVo:")) != EOF) {
+	flags = EXT2_FLAG_JOURNAL_DEV_OK | EXT2_FLAG_SOFTSUPP_FEATURES |
+		EXT2_FLAG_64BITS;
+	while ((c = getopt(argc, argv, "bfhixVo:n")) != EOF) {
 		switch (c) {
 		case 'b':
 			print_badblocks++;
@@ -608,6 +610,9 @@ int main (int argc, char ** argv)
 		case 'x':
 			hex_format++;
 			break;
+		case 'n':
+			flags |= EXT2_FLAG_IGNORE_CSUM_ERRORS;
+			break;
 		default:
 			usage();
 		}
@@ -615,7 +620,6 @@ int main (int argc, char ** argv)
 	if (optind > argc - 1)
 		usage();
 	device_name = argv[optind++];
-	flags = EXT2_FLAG_JOURNAL_DEV_OK | EXT2_FLAG_SOFTSUPP_FEATURES | EXT2_FLAG_64BITS;
 	if (force)
 		flags |= EXT2_FLAG_FORCE;
 	if (image_dump)


^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH 07/18] e2fsck: verify checksums after checking everything else
  2014-07-26  0:33 [PATCH 00/18] e2fsprogs patchbomb 7/14, part 2 Darrick J. Wong
                   ` (5 preceding siblings ...)
  2014-07-26  0:34 ` [PATCH 06/18] dumpe2fs: add switch to disable checksum verification Darrick J. Wong
@ 2014-07-26  0:34 ` Darrick J. Wong
  2014-07-26 20:53   ` Theodore Ts'o
  2014-07-26  0:34 ` [PATCH 08/18] e2fsck: fix the various checksum error messages Darrick J. Wong
                   ` (10 subsequent siblings)
  17 siblings, 1 reply; 46+ messages in thread
From: Darrick J. Wong @ 2014-07-26  0:34 UTC (permalink / raw)
  To: tytso, darrick.wong; +Cc: linux-ext4

There's a particular problem with e2fsck's user interface where
checksum errors are concerned:  Fixing the first complaint about
a checksum problem results in the inode being cleared even if e2fsck
could otherwise have recovered it.  While this mode is useful for
cleaning the remaining broken crud off the filesystem, we could at
least default to checking everything /else/ and only complaining about
the incorrect checksum if fsck finds nothing else wrong.

So, plumb in a config option.  We default to "verify and checksum"
unless the user tell us otherwise.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
 e2fsck/e2fsck.8.in      |   12 ++++++++++++
 e2fsck/e2fsck.conf.5.in |   20 ++++++++++++++++++++
 e2fsck/e2fsck.h         |    1 +
 e2fsck/problem.c        |   25 +++++++++++++++++++++----
 e2fsck/problemP.h       |    1 +
 e2fsck/unix.c           |   11 +++++++++++
 6 files changed, 66 insertions(+), 4 deletions(-)


diff --git a/e2fsck/e2fsck.8.in b/e2fsck/e2fsck.8.in
index f5ed758..43ee063 100644
--- a/e2fsck/e2fsck.8.in
+++ b/e2fsck/e2fsck.8.in
@@ -207,6 +207,18 @@ option may prevent you from further manual data recovery.
 .BI nodiscard
 Do not attempt to discard free blocks and unused inode blocks. This option is
 exactly the opposite of discard option. This is set as default.
+.TP
+.BI strict_csums
+Verify each metadata object's checksum before checking anything other fields
+in the metadata object.  If the verification fails, offer to clear the item,
+also before checking any of the other fields.  This option causes e2fsck to
+favor throwing away broken objects over trying to salvage them.
+.TP
+.BI no_strict_csums
+Perform all regular checks of a metadata object and only verify the checksum if
+no problems were found.  This option causes e2fsck to try to salvage slightly
+damaged metadata objects, at the cost of spending processing time on recovering
+data.  This is set as the default.
 .RE
 .TP
 .B \-f
diff --git a/e2fsck/e2fsck.conf.5.in b/e2fsck/e2fsck.conf.5.in
index 9ebfbbf..a8219a8 100644
--- a/e2fsck/e2fsck.conf.5.in
+++ b/e2fsck/e2fsck.conf.5.in
@@ -222,6 +222,26 @@ If this boolean relation is true, e2fsck will run as if the option
 .B -v
 is always specified.  This will cause e2fsck to print some additional
 information at the end of each full file system check.
+.TP
+.I strict_csums
+If this boolean relation is true, e2fsck will run as if
+.B -E strict_csums
+is set.  This causes e2fsck to verify each metadata object's checksum before
+checking anything other fields in the metadata object.  If the verification
+fails, offer to clear the item, also before checking any of the other fields.
+This option causes e2fsck to favor throwing away broken objects over trying to
+salvage them.
+.IP
+If the boolean relation is false, e2fsck will run as if
+.B -E no_strict_csums
+is set.  In this case, e2fsck will perform all regular checks of a metadata
+object and only verify the checksum if no problems were found.  This option
+causes e2fsck to try to salvage slightly damaged metadata objects, at the cost
+of spending processing time on recovering data.
+.IP
+The default is for e2fsck to behave as if
+.B -E no_strict_csums
+is set.
 .SH THE [problems] STANZA
 Each tag in the
 .I [problems] 
diff --git a/e2fsck/e2fsck.h b/e2fsck/e2fsck.h
index 76d15c4..f8b88f0 100644
--- a/e2fsck/e2fsck.h
+++ b/e2fsck/e2fsck.h
@@ -167,6 +167,7 @@ struct resource_track {
 #define E2F_OPT_FRAGCHECK	0x0800
 #define E2F_OPT_JOURNAL_ONLY	0x1000 /* only replay the journal */
 #define E2F_OPT_DISCARD		0x2000
+#define E2F_OPT_CSUM_FIRST	0x4000
 
 /*
  * E2fsck flags
diff --git a/e2fsck/problem.c b/e2fsck/problem.c
index 4da8ba8..a56511a 100644
--- a/e2fsck/problem.c
+++ b/e2fsck/problem.c
@@ -970,7 +970,7 @@ static struct e2fsck_problem problem_table[] = {
 	/* inode checksum does not match inode */
 	{ PR_1_INODE_CSUM_INVALID,
 	  N_("@i %i checksum does not match @i.  "),
-	  PROMPT_CLEAR, PR_PREEN_OK },
+	  PROMPT_CLEAR, PR_PREEN_OK | PR_INITIAL_CSUM },
 
 	/* inode passes checks, but checksum does not match inode */
 	{ PR_1_INODE_ONLY_CSUM_INVALID,
@@ -981,7 +981,7 @@ static struct e2fsck_problem problem_table[] = {
 	{ PR_1_EXTENT_CSUM_INVALID,
 	  N_("@i %i extent block checksum does not match extent\n\t(logical @b "
 	     "%c, @n physical @b %b, len %N)\n"),
-	  PROMPT_CLEAR, 0 },
+	  PROMPT_CLEAR, PR_INITIAL_CSUM },
 
 	/*
 	 * Inode extent block passes checks, but checksum does not match
@@ -996,7 +996,7 @@ static struct e2fsck_problem problem_table[] = {
 	{ PR_1_EA_BLOCK_CSUM_INVALID,
 	  N_("Extended attribute @a @b %b checksum for @i %i does not "
 	     "match.  "),
-	  PROMPT_CLEAR, 0 },
+	  PROMPT_CLEAR, PR_INITIAL_CSUM },
 
 	/*
 	 * Extended attribute block passes checks, but checksum for inode does
@@ -1493,7 +1493,7 @@ static struct e2fsck_problem problem_table[] = {
 	/* leaf node fails checksum */
 	{ PR_2_LEAF_NODE_CSUM_INVALID,
 	  N_("@d @i %i, %B, offset %N: @d fails checksum\n"),
-	  PROMPT_SALVAGE, PR_PREEN_OK },
+	  PROMPT_SALVAGE, PR_PREEN_OK | PR_INITIAL_CSUM },
 
 	/* leaf node has no checksum */
 	{ PR_2_LEAF_NODE_MISSING_CSUM,
@@ -2053,6 +2053,23 @@ int fix_problem(e2fsck_t ctx, problem_t code, struct problem_context *pctx)
 	}
 	if (ctx->logf && message)
 		print_e2fsck_message(ctx->logf, ctx, message, pctx, 1, 0);
+	/*
+	 * If there is a problem with the initial csum verification and the
+	 * user told e2fsck to verify csums /after/ checking everything else,
+	 * then don't "fix" anything, just warn the user that the csum failed
+	 * and that sanity checks are about to be run.
+	 */
+	if ((ptr->flags & PR_INITIAL_CSUM) &&
+	    !(ctx->options & E2F_OPT_CSUM_FIRST)) {
+		if (*message) {
+			print_e2fsck_message(stdout, ctx,
+				"Running sanity checks.\n", pctx, 1, 0);
+			if (ctx->logf)
+				print_e2fsck_message(ctx->logf, ctx,
+					"Running sanity checks.\n", pctx, 1, 0);
+		}
+		return 0;
+	}
 	if (!(ptr->flags & PR_PREEN_OK) && (ptr->prompt != PROMPT_NONE))
 		preenhalt(ctx);
 
diff --git a/e2fsck/problemP.h b/e2fsck/problemP.h
index 7944cd6..a983598 100644
--- a/e2fsck/problemP.h
+++ b/e2fsck/problemP.h
@@ -44,3 +44,4 @@ struct latch_descr {
 #define PR_CONFIG	0x080000 /* This problem has been customized
 				    from the config file */
 #define PR_FORCE_NO	0x100000 /* Force the answer to be no */
+#define PR_INITIAL_CSUM	0x200000 /* User can ignore initial csum check */
diff --git a/e2fsck/unix.c b/e2fsck/unix.c
index d883c9e..17043ae 100644
--- a/e2fsck/unix.c
+++ b/e2fsck/unix.c
@@ -698,6 +698,10 @@ static void parse_extended_opts(e2fsck_t ctx, const char *opts)
 			else
 				ctx->log_fn = string_copy(ctx, arg, 0);
 			continue;
+		} else if (strcmp(token, "strict_csums") == 0) {
+			ctx->options |= E2F_OPT_CSUM_FIRST;
+		} else if (strcmp(token, "no_strict_csums") == 0) {
+			ctx->options &= ~E2F_OPT_CSUM_FIRST;
 		} else {
 			fprintf(stderr, _("Unknown extended option: %s\n"),
 				token);
@@ -716,6 +720,8 @@ static void parse_extended_opts(e2fsck_t ctx, const char *opts)
 		fputs(("\tjournal_only\n"), stderr);
 		fputs(("\tdiscard\n"), stderr);
 		fputs(("\tnodiscard\n"), stderr);
+		fputs(("\tstrict_csums\n"), stderr);
+		fputs(("\tno_strict_csums\n"), stderr);
 		fputc('\n', stderr);
 		exit(1);
 	}
@@ -951,6 +957,11 @@ static errcode_t PRS(int argc, char *argv[], e2fsck_t *ret_ctx)
 	profile_set_syntax_err_cb(syntax_err_report);
 	profile_init(config_fn, &ctx->profile);
 
+	profile_get_boolean(ctx->profile, "options", "strict_csums", NULL,
+			    0, &c);
+	if (c)
+		ctx->options |= E2F_OPT_CSUM_FIRST;
+
 	profile_get_boolean(ctx->profile, "options", "report_time", 0, 0,
 			    &c);
 	if (c)


^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH 08/18] e2fsck: fix the various checksum error messages
  2014-07-26  0:33 [PATCH 00/18] e2fsprogs patchbomb 7/14, part 2 Darrick J. Wong
                   ` (6 preceding siblings ...)
  2014-07-26  0:34 ` [PATCH 07/18] e2fsck: verify checksums after checking everything else Darrick J. Wong
@ 2014-07-26  0:34 ` Darrick J. Wong
  2014-07-26 21:09   ` Theodore Ts'o
  2014-07-26  0:34 ` [PATCH 09/18] e2fsck: insert a missing dirent tail for checksums if possible Darrick J. Wong
                   ` (9 subsequent siblings)
  17 siblings, 1 reply; 46+ messages in thread
From: Darrick J. Wong @ 2014-07-26  0:34 UTC (permalink / raw)
  To: tytso, darrick.wong; +Cc: linux-ext4

Make the "EA block passes checks but fails checksum" message less
strange, and make the other checksum error messages actually print a
period at the end of the sentence.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
 e2fsck/problem.c |   26 ++++++++++++--------------
 1 file changed, 12 insertions(+), 14 deletions(-)


diff --git a/e2fsck/problem.c b/e2fsck/problem.c
index a56511a..8440052 100644
--- a/e2fsck/problem.c
+++ b/e2fsck/problem.c
@@ -992,19 +992,17 @@ static struct e2fsck_problem problem_table[] = {
 	     "extent\n\t(logical @b %c, @n physical @b %b, len %N)\n"),
 	  PROMPT_FIX, 0 },
 
-	/* Extended attribute block checksum for inode does not match. */
+	/* Inode extended attribute block checksum does not match block. */
 	{ PR_1_EA_BLOCK_CSUM_INVALID,
-	  N_("Extended attribute @a @b %b checksum for @i %i does not "
-	     "match.  "),
+	  N_("@i %i @a @b %b checksum does not match block.  "),
 	  PROMPT_CLEAR, PR_INITIAL_CSUM },
 
 	/*
-	 * Extended attribute block passes checks, but checksum for inode does
-	 * not match.
+	 * Inode extended attribute block passes checks, but checksum does not
+	 * match block.
 	 */
 	{ PR_1_EA_BLOCK_ONLY_CSUM_INVALID,
-	  N_("Extended attribute @a @b %b passes checks, but checksum for "
-	     "@i %i does not match.  "),
+	  N_("@i %i @a @b %b passes checks, but checksum does not match block.  "),
 	  PROMPT_FIX, 0 },
 
 	/*
@@ -1482,27 +1480,27 @@ static struct e2fsck_problem problem_table[] = {
 
 	/* htree root node fails checksum */
 	{ PR_2_HTREE_ROOT_CSUM_INVALID,
-	  N_("@p @h %d: root node fails checksum\n"),
+	  N_("@p @h %d: root node fails checksum.\n"),
 	  PROMPT_CLEAR_HTREE, PR_PREEN_OK },
 
 	/* htree internal node fails checksum */
 	{ PR_2_HTREE_NODE_CSUM_INVALID,
-	  N_("@p @h %d: internal node fails checksum\n"),
+	  N_("@p @h %d: internal node fails checksum.\n"),
 	  PROMPT_CLEAR_HTREE, PR_PREEN_OK },
 
 	/* leaf node fails checksum */
 	{ PR_2_LEAF_NODE_CSUM_INVALID,
-	  N_("@d @i %i, %B, offset %N: @d fails checksum\n"),
+	  N_("@d @i %i, %B, offset %N: @d fails checksum.\n"),
 	  PROMPT_SALVAGE, PR_PREEN_OK | PR_INITIAL_CSUM },
 
 	/* leaf node has no checksum */
 	{ PR_2_LEAF_NODE_MISSING_CSUM,
-	  N_("@d @i %i, %B, offset %N: @d has no checksum\n"),
+	  N_("@d @i %i, %B, offset %N: @d has no checksum.\n"),
 	  PROMPT_FIX, PR_PREEN_OK },
 
 	/* leaf node passes checks but fails checksum */
 	{ PR_2_LEAF_NODE_ONLY_CSUM_INVALID,
-	  N_("@d @i %i, %B, offset %N: @d passes checks but fails checksum\n"),
+	  N_("@d @i %i, %B, offset %N: @d passes checks but fails checksum.\n"),
 	  PROMPT_FIX, PR_PREEN_OK },
 
 	/* Pass 3 errors */
@@ -1828,12 +1826,12 @@ static struct e2fsck_problem problem_table[] = {
 
 	/* Group N inode bitmap does not match checksum */
 	{ PR_5_INODE_BITMAP_CSUM_INVALID,
-	  N_("@g %g @i bitmap does not match checksum\n"),
+	  N_("@g %g @i bitmap does not match checksum.\n"),
 	  PROMPT_FIX, PR_LATCH_IBITMAP | PR_PREEN_OK },
 
 	/* Group N block bitmap does not match checksum */
 	{ PR_5_BLOCK_BITMAP_CSUM_INVALID,
-	  N_("@g %g @b bitmap does not match checksum\n"),
+	  N_("@g %g @b bitmap does not match checksum.\n"),
 	  PROMPT_FIX, PR_LATCH_BBITMAP | PR_PREEN_OK },
 
 	/* Post-Pass 5 errors */


^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH 09/18] e2fsck: insert a missing dirent tail for checksums if possible
  2014-07-26  0:33 [PATCH 00/18] e2fsprogs patchbomb 7/14, part 2 Darrick J. Wong
                   ` (7 preceding siblings ...)
  2014-07-26  0:34 ` [PATCH 08/18] e2fsck: fix the various checksum error messages Darrick J. Wong
@ 2014-07-26  0:34 ` Darrick J. Wong
  2014-07-26 21:13   ` Theodore Ts'o
  2014-07-26  0:34 ` [PATCH 10/18] e2fsck: write dir blocks after new inode when reconstructing root/lost+found Darrick J. Wong
                   ` (8 subsequent siblings)
  17 siblings, 1 reply; 46+ messages in thread
From: Darrick J. Wong @ 2014-07-26  0:34 UTC (permalink / raw)
  To: tytso, darrick.wong; +Cc: linux-ext4

If e2fsck is writing a block of directory entries to disk, it should
adjust the dirents to add the dirent tail if one is missing.  It's not
a big deal if there's no space to do this since rehash (pass 3A) will
reconstruct directories for us.  However, we may as well avoid
unnecessary work.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
 e2fsck/pass2.c |   41 ++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 40 insertions(+), 1 deletion(-)


diff --git a/e2fsck/pass2.c b/e2fsck/pass2.c
index 57d5a16..5e31343 100644
--- a/e2fsck/pass2.c
+++ b/e2fsck/pass2.c
@@ -739,6 +739,41 @@ static int is_last_entry(ext2_filsys fs, int inline_data_size,
 		return (offset < fs->blocksize - csum_size);
 }
 
+static errcode_t insert_dirent_tail(ext2_filsys fs, void *dirbuf)
+{
+	struct ext2_dir_entry *d;
+	void *top;
+	struct ext2_dir_entry_tail *t;
+	unsigned int rec_len;
+
+	d = dirbuf;
+	top = EXT2_DIRENT_TAIL(dirbuf, fs->blocksize);
+
+	rec_len = d->rec_len;
+	while (rec_len && !(rec_len & 0x3)) {
+		d = (struct ext2_dir_entry *)(((char *)d) + rec_len);
+		if (((void *)d) + d->rec_len >= top)
+			break;
+		rec_len = d->rec_len;
+	}
+
+	if (d != top) {
+		size_t min_size = EXT2_DIR_REC_LEN(
+				ext2fs_dirent_name_len(dirbuf));
+		if (min_size > d->rec_len - sizeof(struct ext2_dir_entry_tail))
+			return EXT2_ET_DIR_NO_SPACE_FOR_CSUM;
+		d->rec_len -= sizeof(struct ext2_dir_entry_tail);
+	}
+
+	t = (struct ext2_dir_entry_tail *)top;
+	if (t->det_reserved_zero1 ||
+	    t->det_rec_len != sizeof(struct ext2_dir_entry_tail) ||
+	    t->det_reserved_name_len != EXT2_DIR_NAME_LEN_CSUM)
+		ext2fs_initialize_dirent_tail(fs, t);
+
+	return 0;
+}
+
 static int check_dir_block(ext2_filsys fs,
 			   struct ext2_db_entry2 *db,
 			   void *priv_data)
@@ -1275,8 +1310,12 @@ skip_checksum:
 		if (EXT2_HAS_RO_COMPAT_FEATURE(fs->super,
 				EXT4_FEATURE_RO_COMPAT_METADATA_CSUM) &&
 		    is_leaf &&
-		    !ext2fs_dirent_has_tail(fs, (struct ext2_dir_entry *)buf))
+		    !inline_data_size &&
+		    !ext2fs_dirent_has_tail(fs, (struct ext2_dir_entry *)buf)) {
+			if (insert_dirent_tail(fs, buf) == 0)
+				goto write_and_fix;
 			e2fsck_rehash_dir_later(ctx, ino);
+		}
 
 write_and_fix:
 		if (e2fsck_dir_will_be_rehashed(ctx, ino))


^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH 10/18] e2fsck: write dir blocks after new inode when reconstructing root/lost+found
  2014-07-26  0:33 [PATCH 00/18] e2fsprogs patchbomb 7/14, part 2 Darrick J. Wong
                   ` (8 preceding siblings ...)
  2014-07-26  0:34 ` [PATCH 09/18] e2fsck: insert a missing dirent tail for checksums if possible Darrick J. Wong
@ 2014-07-26  0:34 ` Darrick J. Wong
  2014-07-26 21:18   ` Theodore Ts'o
  2014-07-26  0:34 ` [PATCH 11/18] libext2/fsck: correctly preserve fs flags when modifying ignore-csum-error flag Darrick J. Wong
                   ` (7 subsequent siblings)
  17 siblings, 1 reply; 46+ messages in thread
From: Darrick J. Wong @ 2014-07-26  0:34 UTC (permalink / raw)
  To: tytso, darrick.wong; +Cc: linux-ext4

If we trash the root directory block, e2fsck will find inode 11 (the
old lost+found) and try to attach it to l+f.  The lost+found checker
also fails to find l+f and tries to add one to the root dir.  The root
dir is not found but is recreated with incorrect checksums, so linking
in the l+f dir fails and the l+f '..' entry isn't set.  Since both
dirs now fail checksum verification, they're both referred to rehash
to have that fixed, but because l+f doesn't have a '..' entry, rehash
crashes because l+f has < 2 entries.

On a checksumming filesystem, the routines in e2fsck that recreate
/lost+found and / must write the new directory block *after* the inode
has been written to disk because the checksum depends on i_generation.
Add a regression test while we're at it.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
 e2fsck/pass3.c                        |   85 +++++----
 tests/f_rebuild_csum_rootdir/expect.1 |  311 +++++++++++++++++++++++++++++++++
 tests/f_rebuild_csum_rootdir/expect.2 |    7 +
 tests/f_rebuild_csum_rootdir/image.gz |  Bin
 tests/f_rebuild_csum_rootdir/name     |    1 
 5 files changed, 364 insertions(+), 40 deletions(-)
 create mode 100644 tests/f_rebuild_csum_rootdir/expect.1
 create mode 100644 tests/f_rebuild_csum_rootdir/expect.2
 create mode 100644 tests/f_rebuild_csum_rootdir/image.gz
 create mode 100644 tests/f_rebuild_csum_rootdir/name


diff --git a/e2fsck/pass3.c b/e2fsck/pass3.c
index 92e71e7..1ec4015 100644
--- a/e2fsck/pass3.c
+++ b/e2fsck/pass3.c
@@ -205,28 +205,6 @@ skip_new_block:
 	ext2fs_mark_bb_dirty(fs);
 
 	/*
-	 * Now let's create the actual data block for the inode
-	 */
-	pctx.errcode = ext2fs_new_dir_block(fs, EXT2_ROOT_INO, EXT2_ROOT_INO,
-					    &block);
-	if (pctx.errcode) {
-		pctx.str = "ext2fs_new_dir_block";
-		fix_problem(ctx, PR_3_CREATE_ROOT_ERROR, &pctx);
-		ctx->flags |= E2F_FLAG_ABORT;
-		return;
-	}
-
-	pctx.errcode = ext2fs_write_dir_block4(fs, blk, block, 0,
-					       EXT2_ROOT_INO);
-	if (pctx.errcode) {
-		pctx.str = "ext2fs_write_dir_block4";
-		fix_problem(ctx, PR_3_CREATE_ROOT_ERROR, &pctx);
-		ctx->flags |= E2F_FLAG_ABORT;
-		return;
-	}
-	ext2fs_free_mem(&block);
-
-	/*
 	 * Set up the inode structure
 	 */
 	memset(&inode, 0, sizeof(inode));
@@ -249,6 +227,30 @@ skip_new_block:
 	}
 
 	/*
+	 * Now let's create the actual data block for the inode.
+	 * Due to metadata_csum, we must write the dir blocks AFTER
+	 * the inode has been written to disk!
+	 */
+	pctx.errcode = ext2fs_new_dir_block(fs, EXT2_ROOT_INO, EXT2_ROOT_INO,
+					    &block);
+	if (pctx.errcode) {
+		pctx.str = "ext2fs_new_dir_block";
+		fix_problem(ctx, PR_3_CREATE_ROOT_ERROR, &pctx);
+		ctx->flags |= E2F_FLAG_ABORT;
+		return;
+	}
+
+	pctx.errcode = ext2fs_write_dir_block4(fs, blk, block, 0,
+					       EXT2_ROOT_INO);
+	ext2fs_free_mem(&block);
+	if (pctx.errcode) {
+		pctx.str = "ext2fs_write_dir_block4";
+		fix_problem(ctx, PR_3_CREATE_ROOT_ERROR, &pctx);
+		ctx->flags |= E2F_FLAG_ABORT;
+		return;
+	}
+
+	/*
 	 * Miscellaneous bookkeeping...
 	 */
 	e2fsck_add_dir_info(ctx, EXT2_ROOT_INO, EXT2_ROOT_INO);
@@ -472,24 +474,6 @@ skip_new_block:
 	ext2fs_inode_alloc_stats2(fs, ino, +1, 1);
 
 	/*
-	 * Now let's create the actual data block for the inode
-	 */
-	retval = ext2fs_new_dir_block(fs, ino, EXT2_ROOT_INO, &block);
-	if (retval) {
-		pctx.errcode = retval;
-		fix_problem(ctx, PR_3_ERR_LPF_NEW_DIR_BLOCK, &pctx);
-		return 0;
-	}
-
-	retval = ext2fs_write_dir_block4(fs, blk, block, 0, ino);
-	ext2fs_free_mem(&block);
-	if (retval) {
-		pctx.errcode = retval;
-		fix_problem(ctx, PR_3_ERR_LPF_WRITE_BLOCK, &pctx);
-		return 0;
-	}
-
-	/*
 	 * Set up the inode structure
 	 */
 	memset(&inode, 0, sizeof(inode));
@@ -509,6 +493,27 @@ skip_new_block:
 		fix_problem(ctx, PR_3_CREATE_LPF_ERROR, &pctx);
 		return 0;
 	}
+
+	/*
+	 * Now let's create the actual data block for the inode.
+	 * Due to metadata_csum, the directory block MUST be written
+	 * after the inode is written to disk!
+	 */
+	retval = ext2fs_new_dir_block(fs, ino, EXT2_ROOT_INO, &block);
+	if (retval) {
+		pctx.errcode = retval;
+		fix_problem(ctx, PR_3_ERR_LPF_NEW_DIR_BLOCK, &pctx);
+		return 0;
+	}
+
+	retval = ext2fs_write_dir_block4(fs, blk, block, 0, ino);
+	ext2fs_free_mem(&block);
+	if (retval) {
+		pctx.errcode = retval;
+		fix_problem(ctx, PR_3_ERR_LPF_WRITE_BLOCK, &pctx);
+		return 0;
+	}
+
 	/*
 	 * Finally, create the directory link
 	 */
diff --git a/tests/f_rebuild_csum_rootdir/expect.1 b/tests/f_rebuild_csum_rootdir/expect.1
new file mode 100644
index 0000000..4df58f9
--- /dev/null
+++ b/tests/f_rebuild_csum_rootdir/expect.1
@@ -0,0 +1,311 @@
+Pass 1: Checking inodes, blocks, and sizes
+Pass 2: Checking directory structure
+Directory inode 2, block #0, offset 0: directory has no checksum.
+Fix? yes
+
+Directory inode 2, block #0, offset 0: directory corrupted
+Salvage? yes
+
+Missing '.' in directory inode 2.
+Fix? yes
+
+Setting filetype for entry '.' in ??? (2) to 2.
+Missing '..' in directory inode 2.
+Fix? yes
+
+Setting filetype for entry '..' in ??? (2) to 2.
+Pass 3: Checking directory connectivity
+'..' in / (2) is <The NULL inode> (0), should be / (2).
+Fix? yes
+
+Unconnected directory inode 11 (/???)
+Connect to /lost+found? yes
+
+/lost+found not found.  Create? yes
+
+Pass 3A: Optimizing directories
+Pass 4: Checking reference counts
+Inode 11 ref count is 3, should be 2.  Fix? yes
+
+Unattached inode 12
+Connect to /lost+found? yes
+
+Inode 12 ref count is 2, should be 1.  Fix? yes
+
+Unattached inode 13
+Connect to /lost+found? yes
+
+Inode 13 ref count is 2, should be 1.  Fix? yes
+
+Unattached inode 14
+Connect to /lost+found? yes
+
+Inode 14 ref count is 2, should be 1.  Fix? yes
+
+Unattached inode 15
+Connect to /lost+found? yes
+
+Inode 15 ref count is 2, should be 1.  Fix? yes
+
+Unattached inode 16
+Connect to /lost+found? yes
+
+Inode 16 ref count is 2, should be 1.  Fix? yes
+
+Unattached inode 17
+Connect to /lost+found? yes
+
+Inode 17 ref count is 2, should be 1.  Fix? yes
+
+Unattached inode 18
+Connect to /lost+found? yes
+
+Inode 18 ref count is 2, should be 1.  Fix? yes
+
+Unattached inode 19
+Connect to /lost+found? yes
+
+Inode 19 ref count is 2, should be 1.  Fix? yes
+
+Unattached inode 20
+Connect to /lost+found? yes
+
+Inode 20 ref count is 2, should be 1.  Fix? yes
+
+Unattached inode 21
+Connect to /lost+found? yes
+
+Inode 21 ref count is 2, should be 1.  Fix? yes
+
+Unattached inode 22
+Connect to /lost+found? yes
+
+Inode 22 ref count is 2, should be 1.  Fix? yes
+
+Unattached inode 23
+Connect to /lost+found? yes
+
+Inode 23 ref count is 2, should be 1.  Fix? yes
+
+Unattached inode 24
+Connect to /lost+found? yes
+
+Inode 24 ref count is 2, should be 1.  Fix? yes
+
+Unattached inode 25
+Connect to /lost+found? yes
+
+Inode 25 ref count is 2, should be 1.  Fix? yes
+
+Unattached inode 26
+Connect to /lost+found? yes
+
+Inode 26 ref count is 2, should be 1.  Fix? yes
+
+Unattached inode 27
+Connect to /lost+found? yes
+
+Inode 27 ref count is 2, should be 1.  Fix? yes
+
+Unattached inode 28
+Connect to /lost+found? yes
+
+Inode 28 ref count is 2, should be 1.  Fix? yes
+
+Unattached inode 29
+Connect to /lost+found? yes
+
+Inode 29 ref count is 2, should be 1.  Fix? yes
+
+Unattached inode 30
+Connect to /lost+found? yes
+
+Inode 30 ref count is 2, should be 1.  Fix? yes
+
+Unattached inode 31
+Connect to /lost+found? yes
+
+Inode 31 ref count is 2, should be 1.  Fix? yes
+
+Unattached inode 32
+Connect to /lost+found? yes
+
+Inode 32 ref count is 2, should be 1.  Fix? yes
+
+Unattached inode 33
+Connect to /lost+found? yes
+
+Inode 33 ref count is 2, should be 1.  Fix? yes
+
+Unattached inode 34
+Connect to /lost+found? yes
+
+Inode 34 ref count is 2, should be 1.  Fix? yes
+
+Unattached inode 35
+Connect to /lost+found? yes
+
+Inode 35 ref count is 2, should be 1.  Fix? yes
+
+Unattached inode 36
+Connect to /lost+found? yes
+
+Inode 36 ref count is 2, should be 1.  Fix? yes
+
+Unattached inode 37
+Connect to /lost+found? yes
+
+Inode 37 ref count is 2, should be 1.  Fix? yes
+
+Unattached inode 38
+Connect to /lost+found? yes
+
+Inode 38 ref count is 2, should be 1.  Fix? yes
+
+Unattached inode 39
+Connect to /lost+found? yes
+
+Inode 39 ref count is 2, should be 1.  Fix? yes
+
+Unattached inode 40
+Connect to /lost+found? yes
+
+Inode 40 ref count is 2, should be 1.  Fix? yes
+
+Unattached inode 41
+Connect to /lost+found? yes
+
+Inode 41 ref count is 2, should be 1.  Fix? yes
+
+Unattached inode 42
+Connect to /lost+found? yes
+
+Inode 42 ref count is 2, should be 1.  Fix? yes
+
+Unattached inode 43
+Connect to /lost+found? yes
+
+Inode 43 ref count is 2, should be 1.  Fix? yes
+
+Unattached inode 44
+Connect to /lost+found? yes
+
+Inode 44 ref count is 2, should be 1.  Fix? yes
+
+Unattached inode 45
+Connect to /lost+found? yes
+
+Inode 45 ref count is 2, should be 1.  Fix? yes
+
+Unattached inode 46
+Connect to /lost+found? yes
+
+Inode 46 ref count is 2, should be 1.  Fix? yes
+
+Unattached inode 47
+Connect to /lost+found? yes
+
+Inode 47 ref count is 2, should be 1.  Fix? yes
+
+Unattached inode 48
+Connect to /lost+found? yes
+
+Inode 48 ref count is 2, should be 1.  Fix? yes
+
+Unattached inode 49
+Connect to /lost+found? yes
+
+Inode 49 ref count is 2, should be 1.  Fix? yes
+
+Unattached inode 50
+Connect to /lost+found? yes
+
+Inode 50 ref count is 2, should be 1.  Fix? yes
+
+Unattached inode 51
+Connect to /lost+found? yes
+
+Inode 51 ref count is 2, should be 1.  Fix? yes
+
+Unattached inode 52
+Connect to /lost+found? yes
+
+Inode 52 ref count is 2, should be 1.  Fix? yes
+
+Unattached inode 53
+Connect to /lost+found? yes
+
+Inode 53 ref count is 2, should be 1.  Fix? yes
+
+Unattached inode 54
+Connect to /lost+found? yes
+
+Inode 54 ref count is 2, should be 1.  Fix? yes
+
+Unattached inode 55
+Connect to /lost+found? yes
+
+Inode 55 ref count is 2, should be 1.  Fix? yes
+
+Unattached inode 56
+Connect to /lost+found? yes
+
+Inode 56 ref count is 2, should be 1.  Fix? yes
+
+Unattached inode 57
+Connect to /lost+found? yes
+
+Inode 57 ref count is 2, should be 1.  Fix? yes
+
+Unattached inode 58
+Connect to /lost+found? yes
+
+Inode 58 ref count is 2, should be 1.  Fix? yes
+
+Unattached inode 59
+Connect to /lost+found? yes
+
+Inode 59 ref count is 2, should be 1.  Fix? yes
+
+Unattached inode 60
+Connect to /lost+found? yes
+
+Inode 60 ref count is 2, should be 1.  Fix? yes
+
+Unattached inode 61
+Connect to /lost+found? yes
+
+Inode 61 ref count is 2, should be 1.  Fix? yes
+
+Unattached inode 62
+Connect to /lost+found? yes
+
+Inode 62 ref count is 2, should be 1.  Fix? yes
+
+Unattached inode 63
+Connect to /lost+found? yes
+
+Inode 63 ref count is 2, should be 1.  Fix? yes
+
+Unattached inode 64
+Connect to /lost+found? yes
+
+Inode 64 ref count is 2, should be 1.  Fix? yes
+
+Unattached zero-length inode 65.  Clear? yes
+
+Unattached inode 66
+Connect to /lost+found? yes
+
+Inode 66 ref count is 2, should be 1.  Fix? yes
+
+Unattached inode 67
+Connect to /lost+found? yes
+
+Inode 67 ref count is 2, should be 1.  Fix? yes
+
+Pass 5: Checking group summary information
+
+test_filesys: ***** FILE SYSTEM WAS MODIFIED *****
+test_filesys: 67/512 files (1.5% non-contiguous), 1127/2048 blocks
+Exit status is 1
diff --git a/tests/f_rebuild_csum_rootdir/expect.2 b/tests/f_rebuild_csum_rootdir/expect.2
new file mode 100644
index 0000000..033f1bf
--- /dev/null
+++ b/tests/f_rebuild_csum_rootdir/expect.2
@@ -0,0 +1,7 @@
+Pass 1: Checking inodes, blocks, and sizes
+Pass 2: Checking directory structure
+Pass 3: Checking directory connectivity
+Pass 4: Checking reference counts
+Pass 5: Checking group summary information
+test_filesys: 67/512 files (1.5% non-contiguous), 1127/2048 blocks
+Exit status is 0
diff --git a/tests/f_rebuild_csum_rootdir/image.gz b/tests/f_rebuild_csum_rootdir/image.gz
new file mode 100644
index 0000000000000000000000000000000000000000..a32fd4431a44560b20033d43836000ef22ce977f
GIT binary patch
literal 12476
zcmeI2c~leUyT@t$QMaFB>q3zwwrWuk5Ktr{q?HOkRKy~R2oeP}Dq@f&VRc#;a6z#s
zrpl69L{tz2#3&I4TtEa8ma;FSr9dEopd<uln0fBld(XM|o_o$czd!p&@=tP}yz|cc
zeBS5%exEsK7#C;g9HESNemYIjJ@bmu!CU3;GrDISaQ)YkC7*op=Y<*7pKbba-qmkE
z{p=r`FV7KVy-rRy3bf?&zWaXpYvZqH{p0*kI-Bftzdp6>+ba_$YR?N_+<)}zf`To5
z=hb$P-kkHF`HVWm`KluLK6~u-bcIqdR9ibDd9NH992~rn;rhk?^7~s8DqO8i9iH5G
zbTwI3I@!Bs+3X;vX#e`4%+zo0IIWB#)puBe<k4)1jR@a~ZElv_JmzO_wa#H8NfkT$
zv*p#67WPB&=KV$mH4)-VMq?elNFE}uYs-01N_4+AeAe?m#pQuxWM!6(s7&6qu=g8u
zDwWuhHCTP6qz5~DIUa2>J>5bO@g0*d-aQQe^5?moSuer{AFys?L8)aG_kI{eJy^)P
zyT8XU=k4shrP+9NwN81tFNv)(=H1H!mgoJP2cHO%ch#7zGPbpr$L9Nrqg?{@baZrj
z7=DQ5$~mJa>EDrpV>9Wm*tcinA`+HmmArRVmgyjicYCcr`AoB!{-=C;#g%1~!(FiP
zSW-xA_nO{mbkOmx6f7yFD<_KjIAI2S3@Tv&g0i>y-OR|18&;2&WoE5M#^Z<9AJmLF
z^ehjUPY>zoJSv5)#RD3j(mLTd>GrTDx=P+^pOMRdcD%G#oX3w9sV^saD<@o<ZdEM!
zL`UaGK;p1zPX_Hbvgpf4dzv-BA(7v~eLP`HM6rnc@ARU;tTnHvq2=RUB9X~~oXk`_
z{!&CbimW=RxoDF-RAmyYDsFHKIO?S24qdHieo;~V$so)@%!5aNnvr#zUzNc3?=wvf
z^!(8Fa}e;J=dMXxAJCk9iAe>du06UR8_0+X*{@-rkd%)I-$zb$K3ucyba4~mO$m<H
zpnuFgSu-d)xuf=Xo97?6Z~hQD&J9*kw}A@G?&(83_ED|ehg00U<Tpm~NmAqg+R4+;
zzhdSS$+Ml{H}in;DwS&5G=f6gOD04Ih7*Srm~@KWYEPh6z<om5teDj3ox{-%dDa>b
ze%Mw!iENGr)XL3hj4Ma7*0;bG@&TA{OLhpm{9_oyr0_RXsI)Ux!BCs2U)#yu*t|TJ
z(eYYp`J7jHRlkB|nF-l1oPgQaM$UKi%td@9)R+yfzt0F~44&&|=UKL3zZAi#M9Pi&
z63@PLxf?xRLlFt68?vv1<;a1v2ISU7XQ^FDl?uIl&~x1q(Tmynd98PJD2R%VmEXRV
zSDe}Cz}xQP9<x<#Gp$KDXQld4R<^ivz_Ze;f6fCk7tg%@;hsa-fal}8ev)4spY^wl
z94Hm+JQ^I7d#%&Vcwb77yMa;0KP}erdA9uGi11)<2lAGOS6I#9V6!Kdv}R-bBTr_L
z$KdgiygZ+?sPZs(rD&Z<bZnnPsq4L*tWpnG=|i_+ThBvY{d|7-qxeqsyAHL*3VyM`
z$9}_xCVBT`9y6O49xme51{-+WcXu~8>|Dv;^*ARzTa<4`<}7P3!;9*+>>Ca<%n_t#
z^xWc+Rg$##=a+KgkF>?a4DcV<S!U~I2BF;4bic8#a3))vs5|E4V=vxtD4}NJU&Gw(
z!$*fl9E<q7Y!Xjz35#JSo=ztg*JSVJR#_Bx-n{(s`kSezDWgBl%1k!jI_P;=5xlL6
z+d9~ncUv=U(Q}MfXM%NDiJwMn8!qhJd()>QxkqpB3{*LswzIu7+S^C4s@pjIQryz)
z0&6txc+g}(@#BoV@R$5Y%VRu!Y%S}ChnU4DcrHFZE-{6i;Sc@kRvbzYy}>LXW*}3A
zHysR&jt#VawpJ{!m5f}UOwmoL3=Q*=-?&uV$sSxS;7EhEAx-UF4l#b}@ud6=&f$BF
zEnUs&e&1n{SdMMnhM2u(ef=5NCyiYT`Pd^pQE65%FD%+%{{2?A&=1*J=ssN7(J$Rz
z73;D!YDzKEqT<XCLYjKIZzy~WTT5TL7P5z}?Kjj6H+Oay?5C*WN{K{ry)wKk)zcoy
zvd%AwThWy_9;z`g-Z#5(t7~#~bJ^!v8RPvUBjGnwd=|%S7`PJ_`7}B98F#P63ek13
zeMk21l?{et+yIUjGd&@@#X?heLa?j}A2Ca@>1=99HQFQ2-xa-g!`%CRl824@HNg&-
z2Gg^}T`dz?)5i63_{}wwjb~=YU0r?Sht&RFpPH~#j?g<@VZ6`0FHhCN@J;O>yXkHz
zbsMN0jqN{QmvGz;X$zc1i=N@TF<?CTW6CUX=9lHqozcH6BeKxZmD(LWR@JQ6@4Lsa
zasFvP!|ny~9v^VI8)C6bbTJR2-7vX0wRQWMVSPx+;8y&45?voSEvC4bj~YhF(fHC2
zcblh6pM>|f@YBWqU4IIYF!sH*4h9~Cs-gb9gx#C|pB_M!86DhlPZPT2PJ_@~YP(2h
z|Jx|{xG%pu@}q_p<#O>d1%!lEkiv1vQ+To2fMEGfG><NBKR7=?2`65E^9ncpQ-RuL
z0)^Kj@`?gNJZl7s+$Gdz1JWJS$05W=L3vaRvaC3;+<+7dHX%-43QD-J3%FPsdC>;f
z*sWEuLZVe9W^3MpvLzHZP=!5Nt<vkT;K3ht7t~5&5i}-JS)K8e6MQa4M5i?1$|TSm
z4anHJAqw=WM!49sJ@5$~OKAYb?Rt2Z?XO|^j8ZHn2M!vL`bPS=vTY>~ymghpkhUXH
z&TwL@6Vx{SW(l($DYxo&WV180TIiAT;}^P-iAAsQ?0yA1xdC~J<;Yp$mm`e5Apl31
zM1HM|JvLSkqZtK6(&~vRWqBHjNo2@iUp@AD9H|V@6(H056zHl<5cSow<hgAmr*)xR
zm{EiraR&GMLQuVV(?+6(eu^>1p+6O{-5J{YZo#-rh!h^m@h21hRIsl%AOj=KYOJq-
z;9EQ)0}e^392;EZJr9@0t>wW^-Bv`et{ri6h6@)rfvf4O5!hOxMqh-ETS{fzz4P%T
z^cx-+oW7t&tT|9@K+fmSAVrBu-Ozh#4BpXM)X4N0qgJb!C-y3+fLYsNY3Mjx*Z|Mp
z)FUUH6x0A6a_>{Xac7Wg&QSZ11!Aa3CzOmyTT*=lvctx)D-G=EjjH?}dmeP1N&*);
zXRAzTaoYrsr@tBzdW=qD4NSr-P^FiDlIpGL_19nykM@8Sli<GD3qHnps>%6o#IIry
zIVPb}ZE$|6jS3+ZZKUO*XRw~0ZMj^HvJ<FkqEexZVU*udj!x0uTpmM^;NV&?%qR*y
z4o}tv5H8ZO14PwxB7h6b+_&TTy`kgS^#(MUBSX~VZDgl@EZIAeMv6EpESxJtN0Qq}
z;mQe0Fh6u0vCKq^I#5|g8|l6%mQ=n>Bl&?UW!QchYx1%TgPkF`@LCZ_5-BsO9w`+v
z#K`v*N_3K5vAFOt!ulMC@b55<F#aoF#IEaZC(JH`-Y`1rLn_F@scbVBx^Q-nQZPvu
ziByK(DVeh8p!Cfa!~#~$B$RaFsE39K#3w+*W`~di*V0JVYZ<Gc1bV#|`)e%CCUAC`
zgo?4og;SelDE_F8447>J?)2t!ouRu~$;z3s=D|&R1I(EO@=F8y!s|Yn<F65Vgg`>5
zgdz%iAk0aIh96Z>O|u*+X>~hYn#kUWNu;*4fVi4C0M(z7uTW(~gfskMd4@{Lz1Ypl
zdP_0F!Y*GSY;ZKzM~10nZRFKi7SP|PK$c`8IrO1>alJ}0T}S836)oFuJX`Cnl$U)t
zyqN503pOF@+$v4zoqCp^0SSW07yQs9SKFyK$LP@8Y2_4*g9uO(hJgzR0iA$-3`y<D
zVGt@eC;)meKzz@Tkko0eVv)XV;<crT)$bn%VZV^_Qa~M*Q8%4Y+XJapdBE^-r|125
zK9TwVr>Sd%c++YZwqf_$JGLt7j-!FEgHf8%6$%?|^{_r%I+k)Dc`wPG`aU^vi-~>c
zFt+`dh4EvDa`zTmRUyZ&I?quMT{e1i6_lC8ppCRKNkR-4#UFXdi)Ph;NARRqR|S{3
zFS=&%9_LIScI_26&Zt91x&m3fy|e~ymrUhj`zNs-gqxkw^|?zM4~WAfE#h|AEgOhn
zhaQlzA04cve#tnwe?JE6ee}ULH~oinlDGl#Xb!CJhdHgtXU*|^|9rvoDddff)c`CU
zB%yNciCXjlr1b0oGqs2kt#KW%WqvA&i+p{7I$746Ru4G=!pH-JcbwA`D&&b3Ay~UW
z&PXkSXNi<>t{z>00UGy-9R<`0CLxu|*x@H+$nB*(<Z0O+BtSyVw#L^_2FQfA8+cG+
zPRjvhOGX?4`gG|ZQdxR0z`Yz6i=qEi4O$vD&c4=wyz_Db+9@=QLe~OnCX=v~$WYW#
zg++%bs61N6<7mZtBSY}AdPwN>*9d6YIZuzOmSLp}aKjRR4dlM<g)%0g&rvB)9F`$(
zRk9=`b&(viq$ybAT;E0b^k984^8U~(Ttf>xt+9>4PH+p(b)!BN1l*<%Uabs``d&Q=
zFA9jx^_DHjx3t9^RI%(=s*s6$ZKRae8+S=N;_VE&Rc9%C9&G@-=}W8V6fzo+uEu)S
z*dtjDoJvA=(kl6L0~FUdK}o8=hEcY;fV#&d1c55Xd(99owJ0DG7R6F5a}J~!kU=L+
za7K>}4}R7af-&6=cN&nErAD~@Z}||H`}TwK*EG`Wpb8!v;hgk^Vo;R$lTBgcmUS|4
zRjY+wyDLYad(^)Z`Cm5C$!a~eD;KV$ku}!1va7U&5^v;zL|25=*0v+`DYl)x37$5H
zz+qPisi{vRTLV@3Pgbae;T{SKUEmA$t2yu?k-D?)7EA=uN?~<(517)qrZ%unB-A4{
z6Ky2R!VVZ2MNpDRMeTHg^G$TFv=%acE&@D}(%&6VT{NT()++=@R&juDgR@anl@iZv
zBlSkQ5O^&b%y%bP=|vEqNHzA}LexD4#K5B1DHc8Yk^w0?{sk^_Siys9r=)OM(vE(e
ziA1VNhGo*lZbVb>Pgc<_#mXR(8zJLHFXKU!ZZ;Iqz9EreYDqn)ivl$0o>2;;rTKbp
zE1WdOO_hdX)O2MF%ZvjL3`hrQ0(DkXHNwh$JN5C=q|+J~4gZtO?=cZ0#3?zeYCwTH
zWWt>oCx{Td9D!{W!)&^Z0Y7JiQ;ak?6cEDIPbkD+LfKp68s-|6FxEw-frV+5DQ!$P
zneb-J3C0GhMi@Vgl3XFJ#$GMG*!)}IuCE$~1@R%|yO2yrO23O5dzwa?+2F9Nn~Lf?
z*G9gW)dMWLpCJs$R!;#kpwJh?`BR;L6(HXka=g6Q5Ok><iK|cx>1u-|necLmV4#ij
z#E5r=8v4^nO`wW3d9fZ*A1NTJ7Wq=bn?=x*NQrZlNZwy&pn8w^{xpas7eRuz5^4V*
zD!SGgR_bD~gRXXV%Q$Md39@zuxWoTr3>wYe^agZd_AOW{_t&`7zM1rt>GGO1AlYT+
zDsFVRf^wy+ySAnszIBG(%^J}2F)3pFJA$koqa~mvpe3Lspe3Lspe3Lspe3Lspe3Ls
zpe3Lspe3Lspe3Lspe3Lspe3Lspe3Lspe3Lspe67(B|wQB?3U1P9P7+eL4^IM3;(9e
W)Gq69OyI|Sozg9bf2Chr*ZB{=uW}{;

literal 0
HcmV?d00001

diff --git a/tests/f_rebuild_csum_rootdir/name b/tests/f_rebuild_csum_rootdir/name
new file mode 100644
index 0000000..b246f48
--- /dev/null
+++ b/tests/f_rebuild_csum_rootdir/name
@@ -0,0 +1 @@
+force fsck to rebuild a corrupted rootdir w/ metadata_csum


^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH 11/18] libext2/fsck: correctly preserve fs flags when modifying ignore-csum-error flag
  2014-07-26  0:33 [PATCH 00/18] e2fsprogs patchbomb 7/14, part 2 Darrick J. Wong
                   ` (9 preceding siblings ...)
  2014-07-26  0:34 ` [PATCH 10/18] e2fsck: write dir blocks after new inode when reconstructing root/lost+found Darrick J. Wong
@ 2014-07-26  0:34 ` Darrick J. Wong
  2014-07-27 23:27   ` Theodore Ts'o
  2014-07-26  0:34 ` [PATCH 12/18] e2fsck: toggle checksum verification error reporting appropriately Darrick J. Wong
                   ` (6 subsequent siblings)
  17 siblings, 1 reply; 46+ messages in thread
From: Darrick J. Wong @ 2014-07-26  0:34 UTC (permalink / raw)
  To: tytso, darrick.wong; +Cc: linux-ext4

When we need to modify the "ignore checksum error" behavior flag to
get us past a library call, it's possible that the library call can
result in other flag bits being changed.  Therefore, it is not correct
to restore unconditionally the previous flags value, since this will
have unintended side effects on the other fs->flags; nor is it correct
to assume that we can unconditionally set (or clear) the "ignore csum
error" flag bit.  Therefore, we must merge the previous value of the
"ignore csum error" flag with the value of flags after the call.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
 e2fsck/pass2.c     |   14 ++++++++++----
 e2fsck/pass3.c     |   11 ++++++++---
 e2fsck/rehash.c    |    4 +++-
 e2fsck/util.c      |    5 ++++-
 lib/ext2fs/inode.c |    3 ++-
 5 files changed, 27 insertions(+), 10 deletions(-)


diff --git a/e2fsck/pass2.c b/e2fsck/pass2.c
index 5e31343..9394a29 100644
--- a/e2fsck/pass2.c
+++ b/e2fsck/pass2.c
@@ -1306,6 +1306,7 @@ skip_checksum:
 		}
 	}
 	if (dir_modified) {
+		int	flags, will_rehash;
 		/* leaf block with no tail?  Rehash dirs later. */
 		if (EXT2_HAS_RO_COMPAT_FEATURE(fs->super,
 				EXT4_FEATURE_RO_COMPAT_METADATA_CSUM) &&
@@ -1318,8 +1319,11 @@ skip_checksum:
 		}
 
 write_and_fix:
-		if (e2fsck_dir_will_be_rehashed(ctx, ino))
+		will_rehash = e2fsck_dir_will_be_rehashed(ctx, ino);
+		if (will_rehash) {
+			flags = ctx->fs->flags;
 			ctx->fs->flags |= EXT2_FLAG_IGNORE_CSUM_ERRORS;
+		}
 		if (inline_data_size) {
 			cd->pctx.errcode =
 				ext2fs_inline_data_set(fs, ino, 0, buf,
@@ -1327,8 +1331,11 @@ write_and_fix:
 		} else
 			cd->pctx.errcode = ext2fs_write_dir_block4(fs, block_nr,
 								   buf, 0, ino);
-		if (e2fsck_dir_will_be_rehashed(ctx, ino))
-			ctx->fs->flags &= ~EXT2_FLAG_IGNORE_CSUM_ERRORS;
+		if (will_rehash)
+			ctx->fs->flags = (flags &
+					  EXT2_FLAG_IGNORE_CSUM_ERRORS) |
+					 (ctx->fs->flags &
+					  ~EXT2_FLAG_IGNORE_CSUM_ERRORS);
 		if (cd->pctx.errcode) {
 			if (!fix_problem(ctx, PR_2_WRITE_DIRBLOCK,
 					 &cd->pctx))
@@ -1604,7 +1611,6 @@ int e2fsck_process_bad_inode(e2fsck_t ctx, ext2_ino_t dir,
 	return 0;
 }
 
-
 /*
  * allocate_dir_block --- this function allocates a new directory
  * 	block for a particular inode; this is done if a directory has
diff --git a/e2fsck/pass3.c b/e2fsck/pass3.c
index 1ec4015..9eb859e 100644
--- a/e2fsck/pass3.c
+++ b/e2fsck/pass3.c
@@ -687,6 +687,7 @@ static void fix_dotdot(e2fsck_t ctx, ext2_ino_t ino, ext2_ino_t parent)
 	errcode_t	retval;
 	struct fix_dotdot_struct fp;
 	struct problem_context pctx;
+	int		flags, will_rehash;
 
 	fp.fs = fs;
 	fp.parent = parent;
@@ -699,12 +700,16 @@ static void fix_dotdot(e2fsck_t ctx, ext2_ino_t ino, ext2_ino_t parent)
 
 	clear_problem_context(&pctx);
 	pctx.ino = ino;
-	if (e2fsck_dir_will_be_rehashed(ctx, ino))
+	will_rehash = e2fsck_dir_will_be_rehashed(ctx, ino);
+	if (will_rehash) {
+		flags = ctx->fs->flags;
 		ctx->fs->flags |= EXT2_FLAG_IGNORE_CSUM_ERRORS;
+	}
 	retval = ext2fs_dir_iterate(fs, ino, DIRENT_FLAG_INCLUDE_EMPTY,
 				    0, fix_dotdot_proc, &fp);
-	if (e2fsck_dir_will_be_rehashed(ctx, ino))
-		ctx->fs->flags &= ~EXT2_FLAG_IGNORE_CSUM_ERRORS;
+	if (will_rehash)
+		ctx->fs->flags = (flags & EXT2_FLAG_IGNORE_CSUM_ERRORS) |
+			(ctx->fs->flags & ~EXT2_FLAG_IGNORE_CSUM_ERRORS);
 	if (retval || !fp.done) {
 		pctx.errcode = retval;
 		fix_problem(ctx, retval ? PR_3_FIX_PARENT_ERR :
diff --git a/e2fsck/rehash.c b/e2fsck/rehash.c
index 5913ae7..6d18348 100644
--- a/e2fsck/rehash.c
+++ b/e2fsck/rehash.c
@@ -126,10 +126,12 @@ static int fill_dir_block(ext2_filsys fs,
 		dirent = (struct ext2_dir_entry *) dir;
 		(void) ext2fs_set_rec_len(fs, fs->blocksize, dirent);
 	} else {
+		int flags = fs->flags;
 		fs->flags |= EXT2_FLAG_IGNORE_CSUM_ERRORS;
 		fd->err = ext2fs_read_dir_block4(fs, *block_nr, dir, 0,
 						 fd->dir);
-		fs->flags &= ~EXT2_FLAG_IGNORE_CSUM_ERRORS;
+		fs->flags = (flags & EXT2_FLAG_IGNORE_CSUM_ERRORS) |
+			    (fs->flags & ~EXT2_FLAG_IGNORE_CSUM_ERRORS);
 		if (fd->err)
 			return BLOCK_ABORT;
 	}
diff --git a/e2fsck/util.c b/e2fsck/util.c
index e36e902..8237328 100644
--- a/e2fsck/util.c
+++ b/e2fsck/util.c
@@ -267,6 +267,7 @@ void e2fsck_read_bitmaps(e2fsck_t ctx)
 	errcode_t	retval;
 	const char	*old_op;
 	unsigned int	save_type;
+	int		flags;
 
 	if (ctx->invalid_bitmaps) {
 		com_err(ctx->program_name, 0,
@@ -278,9 +279,11 @@ void e2fsck_read_bitmaps(e2fsck_t ctx)
 	old_op = ehandler_operation(_("reading inode and block bitmaps"));
 	e2fsck_set_bitmap_type(fs, EXT2FS_BMAP64_RBTREE, "fs_bitmaps",
 			       &save_type);
+	flags = ctx->fs->flags;
 	ctx->fs->flags |= EXT2_FLAG_IGNORE_CSUM_ERRORS;
 	retval = ext2fs_read_bitmaps(fs);
-	ctx->fs->flags &= ~EXT2_FLAG_IGNORE_CSUM_ERRORS;
+	ctx->fs->flags = (flags & EXT2_FLAG_IGNORE_CSUM_ERRORS) |
+			 (ctx->fs->flags & ~EXT2_FLAG_IGNORE_CSUM_ERRORS);
 	fs->default_bitmap_type = save_type;
 	ehandler_operation(old_op);
 	if (retval) {
diff --git a/lib/ext2fs/inode.c b/lib/ext2fs/inode.c
index 0cea9f0..ca97ab8 100644
--- a/lib/ext2fs/inode.c
+++ b/lib/ext2fs/inode.c
@@ -717,7 +717,8 @@ errcode_t ext2fs_write_inode_full(ext2_filsys fs, ext2_ino_t ino,
 		retval = ext2fs_read_inode_full(fs, ino,
 						(struct ext2_inode *)w_inode,
 						length);
-		fs->flags = old_flags;
+		fs->flags = (old_flags & EXT2_FLAG_IGNORE_CSUM_ERRORS) |
+			    (fs->flags & ~EXT2_FLAG_IGNORE_CSUM_ERRORS);
 		if (retval)
 			goto errout;
 	}


^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH 12/18] e2fsck: toggle checksum verification error reporting appropriately
  2014-07-26  0:33 [PATCH 00/18] e2fsprogs patchbomb 7/14, part 2 Darrick J. Wong
                   ` (10 preceding siblings ...)
  2014-07-26  0:34 ` [PATCH 11/18] libext2/fsck: correctly preserve fs flags when modifying ignore-csum-error flag Darrick J. Wong
@ 2014-07-26  0:34 ` Darrick J. Wong
  2014-07-27 23:37   ` Theodore Ts'o
  2014-07-26  0:34 ` [PATCH 13/18] libext2fs: Don't cache inodes that fail checksum verification Darrick J. Wong
                   ` (5 subsequent siblings)
  17 siblings, 1 reply; 46+ messages in thread
From: Darrick J. Wong @ 2014-07-26  0:34 UTC (permalink / raw)
  To: tytso, darrick.wong; +Cc: linux-ext4

There are a few mistakes in the checksum verification error reporting
logic.  First, when we're doing a re-check of an inode that failed
earlier, we must never ignore checksum errors.  Second, if we're
performing sanity checks after an initial checksum verification
failure, then we /should/ disable checksum error reporting for
block_iterate because that function will re-read the inode from disk.
This fixes the numerous "inode checksum failure" problems that cause
e2fsck to abort.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
 e2fsck/pass1.c |   36 ++++++++++++++++++++++++++++++++++--
 1 file changed, 34 insertions(+), 2 deletions(-)


diff --git a/e2fsck/pass1.c b/e2fsck/pass1.c
index 646ef8a..d09b4eb 100644
--- a/e2fsck/pass1.c
+++ b/e2fsck/pass1.c
@@ -573,13 +573,18 @@ static errcode_t recheck_bad_inode_checksum(ext2_filsys fs, ext2_ino_t ino,
 {
 	errcode_t retval;
 	struct ext2_inode_large inode;
+	int flags;
 
 	/*
 	 * Reread inode.  If we don't see checksum error, then this inode
 	 * has been fixed elsewhere.
 	 */
+	flags = fs->flags;
+	fs->flags &= ~EXT2_FLAG_IGNORE_CSUM_ERRORS;
 	retval = ext2fs_read_inode_full(fs, ino, (struct ext2_inode *)&inode,
 					sizeof(inode));
+	fs->flags = (flags & EXT2_FLAG_IGNORE_CSUM_ERRORS) |
+		    (fs->flags & ~EXT2_FLAG_IGNORE_CSUM_ERRORS);
 	if (retval && retval != EXT2_ET_INODE_CSUM_INVALID)
 		return retval;
 	if (!retval)
@@ -588,11 +593,22 @@ static errcode_t recheck_bad_inode_checksum(ext2_filsys fs, ext2_ino_t ino,
 	/*
 	 * Checksum still doesn't match.  That implies that the inode passes
 	 * all the sanity checks, so maybe the checksum is simply corrupt.
-	 * See if the user will go for fixing that.
+	 * See if the user will go for fixing that.  We have to re-try the
+	 * inode read because the inode struct won't be overwritten if there's
+	 * a checksum verify error.
 	 */
 	if (!fix_problem(ctx, PR_1_INODE_ONLY_CSUM_INVALID, pctx))
 		return 0;
 
+	flags = fs->flags;
+	fs->flags |= EXT2_FLAG_IGNORE_CSUM_ERRORS;
+	retval = ext2fs_read_inode_full(fs, ino, (struct ext2_inode *)&inode,
+					sizeof(inode));
+	fs->flags = (flags & EXT2_FLAG_IGNORE_CSUM_ERRORS) |
+		    (fs->flags & ~EXT2_FLAG_IGNORE_CSUM_ERRORS);
+	if (retval)
+		return retval;
+
 	retval = ext2fs_write_inode_full(fs, ino, (struct ext2_inode *)&inode,
 					 sizeof(inode));
 	if (retval)
@@ -970,6 +986,7 @@ void e2fsck_pass1(e2fsck_t ctx)
 
 		if (ino == EXT2_BAD_INO) {
 			struct process_block_struct pb;
+			int flags;
 
 			if ((inode->i_mode || inode->i_uid || inode->i_gid ||
 			     inode->i_links_count || inode->i_file_acl) &&
@@ -996,8 +1013,13 @@ void e2fsck_pass1(e2fsck_t ctx)
 			pb.inode = inode;
 			pb.pctx = &pctx;
 			pb.ctx = ctx;
+			flags = fs->flags;
+			if (failed_csum)
+				fs->flags |= EXT2_FLAG_IGNORE_CSUM_ERRORS;
 			pctx.errcode = ext2fs_block_iterate3(fs, ino, 0,
 				     block_buf, process_bad_block, &pb);
+			fs->flags = (flags & EXT2_FLAG_IGNORE_CSUM_ERRORS) |
+				    (fs->flags & ~EXT2_FLAG_IGNORE_CSUM_ERRORS);
 			ext2fs_free_block_bitmap(pb.fs_meta_blocks);
 			if (pctx.errcode) {
 				fix_problem(ctx, PR_1_BLOCK_ITERATE, &pctx);
@@ -2453,6 +2475,7 @@ static void check_blocks(e2fsck_t ctx, struct problem_context *pctx,
 		if (extent_fs && (inode->i_flags & EXT4_EXTENTS_FL))
 			check_blocks_extents(ctx, pctx, &pb);
 		else {
+			int flags;
 			/*
 			 * If we've modified the inode, write it out before
 			 * iterate() tries to use it.
@@ -2462,6 +2485,7 @@ static void check_blocks(e2fsck_t ctx, struct problem_context *pctx,
 						   "check_blocks");
 				dirty_inode = 0;
 			}
+			flags = fs->flags;
 			fs->flags |= EXT2_FLAG_IGNORE_CSUM_ERRORS;
 			pctx->errcode = ext2fs_block_iterate3(fs, ino,
 						pb.is_dir ? BLOCK_FLAG_HOLE : 0,
@@ -2480,7 +2504,8 @@ static void check_blocks(e2fsck_t ctx, struct problem_context *pctx,
 			if (pb.inode_modified)
 				e2fsck_read_inode(ctx, ino, inode,
 						  "check_blocks");
-			fs->flags &= ~EXT2_FLAG_IGNORE_CSUM_ERRORS;
+			fs->flags = (flags & EXT2_FLAG_IGNORE_CSUM_ERRORS) |
+				    (fs->flags & ~EXT2_FLAG_IGNORE_CSUM_ERRORS);
 		}
 	} else {
 		/* check inline data */
@@ -2546,10 +2571,17 @@ static void check_blocks(e2fsck_t ctx, struct problem_context *pctx,
 	if (pb.is_dir) {
 		int nblock = inode->i_size >> EXT2_BLOCK_SIZE_BITS(fs->super);
 		if (inode->i_flags & EXT4_INLINE_DATA_FL) {
+			int flags;
 			size_t size;
 
+			flags = ctx->fs->flags;
+			ctx->fs->flags |= EXT2_FLAG_IGNORE_CSUM_ERRORS;
 			if (ext2fs_inline_data_size(ctx->fs, pctx->ino, &size))
 				bad_size = 5;
+			ctx->fs->flags = (flags &
+					  EXT2_FLAG_IGNORE_CSUM_ERRORS) |
+					 (ctx->fs->flags &
+					  ~EXT2_FLAG_IGNORE_CSUM_ERRORS);
 			if (size != inode->i_size)
 				bad_size = 5;
 		} else if (inode->i_size & (fs->blocksize - 1))


^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH 13/18] libext2fs: Don't cache inodes that fail checksum verification
  2014-07-26  0:33 [PATCH 00/18] e2fsprogs patchbomb 7/14, part 2 Darrick J. Wong
                   ` (11 preceding siblings ...)
  2014-07-26  0:34 ` [PATCH 12/18] e2fsck: toggle checksum verification error reporting appropriately Darrick J. Wong
@ 2014-07-26  0:34 ` Darrick J. Wong
  2014-07-26  0:35 ` [PATCH 14/18] e2fsck: always recheck an inode checksum failure Darrick J. Wong
                   ` (4 subsequent siblings)
  17 siblings, 0 replies; 46+ messages in thread
From: Darrick J. Wong @ 2014-07-26  0:34 UTC (permalink / raw)
  To: tytso, darrick.wong; +Cc: linux-ext4

If an inode fails checksum verification, don't stuff a copy of it in
the inode cache, because this can cause the library to fail to return
the "corrupt inode" error code.

In general, this happens if ext2fs_read_inode_full() is called twice
on an inode with an incorrect checksum.  If fs->flags has
EXT2_FLAG_IGNORE_CSUM_ERRORS set during the first call and *unset*
during the second call, the cache hit during the second call fails to
return EXT2_ET_INODE_CSUM_INVALID as you'd expect.  This happens
during fsck if strict_csums is not set, because the first read_inode
call happens as part of check_blocks and the second call happens
during inode checksum revalidation.  A file system with a slightly
corrupt non-extent inode will trigger this.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
 lib/ext2fs/inode.c                      |   12 +++++++-----
 tests/f_no_cache_corrupt_inode/expect.1 |   12 ++++++++++++
 tests/f_no_cache_corrupt_inode/expect.2 |    7 +++++++
 tests/f_no_cache_corrupt_inode/image.gz |  Bin
 tests/f_no_cache_corrupt_inode/name     |    1 +
 5 files changed, 27 insertions(+), 5 deletions(-)
 create mode 100644 tests/f_no_cache_corrupt_inode/expect.1
 create mode 100644 tests/f_no_cache_corrupt_inode/expect.2
 create mode 100644 tests/f_no_cache_corrupt_inode/image.gz
 create mode 100644 tests/f_no_cache_corrupt_inode/name


diff --git a/lib/ext2fs/inode.c b/lib/ext2fs/inode.c
index ca97ab8..9cad5c1 100644
--- a/lib/ext2fs/inode.c
+++ b/lib/ext2fs/inode.c
@@ -580,7 +580,7 @@ errcode_t ext2fs_read_inode_full(ext2_filsys fs, ext2_ino_t ino,
 	io_channel	io;
 	int		length = EXT2_INODE_SIZE(fs->super);
 	struct ext2_inode_large	*iptr;
-	int		cache_slot;
+	int		cache_slot, fail_csum;
 
 	EXT2_CHECK_MAGIC(fs, EXT2_ET_MAGIC_EXT2FS_FILSYS);
 
@@ -658,8 +658,8 @@ errcode_t ext2fs_read_inode_full(ext2_filsys fs, ext2_ino_t ino,
 	length = EXT2_INODE_SIZE(fs->super);
 
 	/* Verify the inode checksum. */
-	if (!(fs->flags & EXT2_FLAG_IGNORE_CSUM_ERRORS) &&
-	    !ext2fs_inode_csum_verify(fs, ino, iptr))
+	fail_csum = !ext2fs_inode_csum_verify(fs, ino, iptr);
+	if (!(fs->flags & EXT2_FLAG_IGNORE_CSUM_ERRORS) && fail_csum)
 		return EXT2_ET_INODE_CSUM_INVALID;
 
 #ifdef WORDS_BIGENDIAN
@@ -669,8 +669,10 @@ errcode_t ext2fs_read_inode_full(ext2_filsys fs, ext2_ino_t ino,
 #endif
 
 	/* Update the inode cache bookkeeping */
-	fs->icache->cache_last = cache_slot;
-	fs->icache->cache[cache_slot].ino = ino;
+	if (!fail_csum) {
+		fs->icache->cache_last = cache_slot;
+		fs->icache->cache[cache_slot].ino = ino;
+	}
 	memcpy(inode, iptr, (bufsize > length) ? length : bufsize);
 
 	return 0;
diff --git a/tests/f_no_cache_corrupt_inode/expect.1 b/tests/f_no_cache_corrupt_inode/expect.1
new file mode 100644
index 0000000..94b2cae
--- /dev/null
+++ b/tests/f_no_cache_corrupt_inode/expect.1
@@ -0,0 +1,12 @@
+Pass 1: Checking inodes, blocks, and sizes
+Inode 12 checksum does not match inode.  Running sanity checks.
+Inode 12 passes checks, but checksum does not match inode.  Fix? yes
+
+Pass 2: Checking directory structure
+Pass 3: Checking directory connectivity
+Pass 4: Checking reference counts
+Pass 5: Checking group summary information
+
+test_filesys: ***** FILE SYSTEM WAS MODIFIED *****
+test_filesys: 12/128 files (0.0% non-contiguous), 19/512 blocks
+Exit status is 1
diff --git a/tests/f_no_cache_corrupt_inode/expect.2 b/tests/f_no_cache_corrupt_inode/expect.2
new file mode 100644
index 0000000..1b43315
--- /dev/null
+++ b/tests/f_no_cache_corrupt_inode/expect.2
@@ -0,0 +1,7 @@
+Pass 1: Checking inodes, blocks, and sizes
+Pass 2: Checking directory structure
+Pass 3: Checking directory connectivity
+Pass 4: Checking reference counts
+Pass 5: Checking group summary information
+test_filesys: 12/128 files (0.0% non-contiguous), 19/512 blocks
+Exit status is 0
diff --git a/tests/f_no_cache_corrupt_inode/image.gz b/tests/f_no_cache_corrupt_inode/image.gz
new file mode 100644
index 0000000000000000000000000000000000000000..e17e9216deac289cf5c13a33be87b1810640a66d
GIT binary patch
literal 2606
zcmb2|=3wyobSapL`Ry(14B<o>h7a#A{ZQZy>*8x|QnKm@V3(D-?OG_n)#_yJ!7jME
zccFxPCX1kBt6$TLWsAF;`vY|rJyPTFYwCAAxQJ1dedE2Sil)~bqkGT(t$SEM{r$V5
z^84lefA)JjF<e?ThsQ|q%aO%5HClpdtCq=KWj(d^V#U*s0h)apXU_8O?RMHzA<=uf
zx777(Vfx+;+L7*iCMn+(*tTZBLGiB(^L{U9TlbZ}?oIvWJ@U7?-M#B~ud{9gI$`~{
zxf=I(#nr{Fdot<o+4QW>&#ou0{Bmr4M&rwy6_4VU%juSV`?OqhEd#^L$xHU%EzDW<
z?ElgobJqIRTYdVPt0{6lLy>`jA>!Y&Pd7g-O%|Nb4&?o-KKcItPl425PH*=!{G4#|
z%h#*_4)4}`VkCStd1)?t_wjx?k7unP7OYm*66uzk1JrTANZ|X^`FE>t1i#FA^?(1n
zRbT%zGB7lJd+|SCBD&hV^8Z5bYM?TQ-QWIe?>@c^EbH)J9w?O2|IZ#sfxw16qFqPl
zgh4nsNrku1cFj5kly<oNk%a}PGJ+Bf_YI>eZLW0GZ$0}^eC_)0ZyanQqF4V<vCS&B
zI`rz7QQ+1sU*vpSFG$bhS${9;gxW<5zD&#Ke)-<tP0b7S?tZ;rVtn^&{r{a`e+Sq}
z^1u6{U+}+w-~IppLehgRv$V?J`@E}{SDUrJ{Zi2%^ZQ=kf7bt~+4Z`<dfC_C1?ng3
ze~F*^Z<P3W&(ldyQe0)UA4hC;xm~x@e&@gZIhjwt)_+Y2S$q45|Ihd@&i|%IA2HLr
zyU*zF>dL)|HA^P`SDl!%e<C*%K_8AP7!85Z5Eu=C(GVC7fzc2kF9bgP-p6Po@Th@-
HL4g4PrH?{0

literal 0
HcmV?d00001

diff --git a/tests/f_no_cache_corrupt_inode/name b/tests/f_no_cache_corrupt_inode/name
new file mode 100644
index 0000000..fb213e2
--- /dev/null
+++ b/tests/f_no_cache_corrupt_inode/name
@@ -0,0 +1 @@
+don't cache inodes that fail checksum verification


^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH 14/18] e2fsck: always recheck an inode checksum failure
  2014-07-26  0:33 [PATCH 00/18] e2fsprogs patchbomb 7/14, part 2 Darrick J. Wong
                   ` (12 preceding siblings ...)
  2014-07-26  0:34 ` [PATCH 13/18] libext2fs: Don't cache inodes that fail checksum verification Darrick J. Wong
@ 2014-07-26  0:35 ` Darrick J. Wong
  2014-07-26  0:35 ` [PATCH 15/18] e2fsck: clear badblocks inode when checksum fails Darrick J. Wong
                   ` (3 subsequent siblings)
  17 siblings, 0 replies; 46+ messages in thread
From: Darrick J. Wong @ 2014-07-26  0:35 UTC (permalink / raw)
  To: tytso, darrick.wong; +Cc: linux-ext4

For all inodes that fail the initial checksum verification, always
re-verify the checksum at the end of inode sanity checks to see if the
user wants to simply fix the checksum.  (Obviously, this won't happen
either if the user chose to zap/fix the inode, or specified
strict_csums to wipe anything at the first sign of trouble.)

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
 e2fsck/pass1.c |   48 +++++++++++++++++++++++++++++++++++-------------
 1 file changed, 35 insertions(+), 13 deletions(-)


diff --git a/e2fsck/pass1.c b/e2fsck/pass1.c
index d09b4eb..3734645 100644
--- a/e2fsck/pass1.c
+++ b/e2fsck/pass1.c
@@ -653,6 +653,29 @@ static void reserve_block_for_lnf_repair(e2fsck_t ctx)
 	ctx->lnf_repair_block = blk;
 }
 
+static void finish_processing_inode(e2fsck_t ctx, ext2_ino_t ino,
+				    struct problem_context *pctx,
+				    int failed_csum)
+{
+	if (!failed_csum)
+		return;
+
+	/*
+	 * If the inode failed the checksum and the user didn't
+	 * clear the inode, test the checksum again -- if it still
+	 * fails, ask the user if the checksum should be corrected.
+	 */
+	pctx->errcode = recheck_bad_inode_checksum(ctx->fs, ino, ctx, pctx);
+	if (pctx->errcode)
+		ctx->flags |= E2F_FLAG_ABORT;
+}
+#define FINISH_INODE_LOOP(ctx, ino, pctx, failed_csum) \
+	do { \
+		finish_processing_inode((ctx), (ino), (pctx), (failed_csum)); \
+		if ((ctx)->flags & E2F_FLAG_ABORT) \
+			return; \
+	} while (0)
+
 void e2fsck_pass1(e2fsck_t ctx)
 {
 	int	i;
@@ -865,6 +888,7 @@ void e2fsck_pass1(e2fsck_t ctx)
 				alloc_bb_map(ctx);
 			ext2fs_mark_inode_bitmap2(ctx->inode_bb_map, ino);
 			ext2fs_mark_inode_bitmap2(ctx->inode_used_map, ino);
+			FINISH_INODE_LOOP(ctx, ino, &pctx, failed_csum);
 			continue;
 		}
 		if (pctx.errcode &&
@@ -911,6 +935,8 @@ void e2fsck_pass1(e2fsck_t ctx)
 				inlinedata_fs = 1;
 			} else if (!fix_problem(ctx, PR_1_INLINE_DATA_SET, &pctx)) {
 				e2fsck_clear_inode(ctx, ino, inode, 0, "pass1");
+				FINISH_INODE_LOOP(ctx, ino, &pctx,
+						  failed_csum);
 				continue;
 			}
 		}
@@ -945,6 +971,7 @@ void e2fsck_pass1(e2fsck_t ctx)
 				if (ino == EXT2_BAD_INO)
 					ext2fs_mark_inode_bitmap2(ctx->inode_used_map,
 								 ino);
+				FINISH_INODE_LOOP(ctx, ino, &pctx, failed_csum);
 				continue;
 			}
 		}
@@ -1033,6 +1060,7 @@ void e2fsck_pass1(e2fsck_t ctx)
 			}
 			ext2fs_mark_inode_bitmap2(ctx->inode_used_map, ino);
 			clear_problem_context(&pctx);
+			FINISH_INODE_LOOP(ctx, ino, &pctx, failed_csum);
 			continue;
 		} else if (ino == EXT2_ROOT_INO) {
 			/*
@@ -1071,6 +1099,7 @@ void e2fsck_pass1(e2fsck_t ctx)
 							   "pass1");
 				}
 				check_blocks(ctx, &pctx, block_buf);
+				FINISH_INODE_LOOP(ctx, ino, &pctx, failed_csum);
 				continue;
 			}
 			if ((inode->i_links_count ||
@@ -1098,6 +1127,7 @@ void e2fsck_pass1(e2fsck_t ctx)
 							"pass1");
 				}
 				check_blocks(ctx, &pctx, block_buf);
+				FINISH_INODE_LOOP(ctx, ino, &pctx, failed_csum);
 				continue;
 			}
 			if ((inode->i_links_count ||
@@ -1133,6 +1163,7 @@ void e2fsck_pass1(e2fsck_t ctx)
 				}
 			}
 			check_blocks(ctx, &pctx, block_buf);
+			FINISH_INODE_LOOP(ctx, ino, &pctx, failed_csum);
 			continue;
 		}
 
@@ -1176,6 +1207,7 @@ void e2fsck_pass1(e2fsck_t ctx)
 							   "pass1");
 				}
 			}
+			FINISH_INODE_LOOP(ctx, ino, &pctx, failed_csum);
 			continue;
 		}
 		/*
@@ -1274,6 +1306,7 @@ void e2fsck_pass1(e2fsck_t ctx)
 			} else if (ext2fs_inode_data_blocks(fs, inode) == 0) {
 				ctx->fs_fast_symlinks_count++;
 				check_blocks(ctx, &pctx, block_buf);
+				FINISH_INODE_LOOP(ctx, ino, &pctx, failed_csum);
 				continue;
 			}
 		}
@@ -1310,19 +1343,7 @@ void e2fsck_pass1(e2fsck_t ctx)
 		} else
 			check_blocks(ctx, &pctx, block_buf);
 
-		/*
-		 * If the inode failed the checksum and the user didn't
-		 * clear the inode, test the checksum again -- if it still
-		 * fails, ask the user if the checksum should be corrected.
-		 */
-		if (failed_csum) {
-			pctx.errcode = recheck_bad_inode_checksum(fs, ino, ctx,
-								  &pctx);
-			if (pctx.errcode) {
-				ctx->flags |= E2F_FLAG_ABORT;
-				return;
-			}
-		}
+		FINISH_INODE_LOOP(ctx, ino, &pctx, failed_csum);
 
 		if (ctx->flags & E2F_FLAG_SIGNAL_MASK)
 			goto endit;
@@ -1427,6 +1448,7 @@ endit:
 	if ((ctx->flags & E2F_FLAG_SIGNAL_MASK) == 0)
 		print_resource_track(ctx, _("Pass 1"), &rtrack, ctx->fs->io);
 }
+#undef FINISH_INODE_LOOP
 
 /*
  * When the inode_scan routines call this callback at the end of the


^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH 15/18] e2fsck: clear badblocks inode when checksum fails
  2014-07-26  0:33 [PATCH 00/18] e2fsprogs patchbomb 7/14, part 2 Darrick J. Wong
                   ` (13 preceding siblings ...)
  2014-07-26  0:35 ` [PATCH 14/18] e2fsck: always recheck an inode checksum failure Darrick J. Wong
@ 2014-07-26  0:35 ` Darrick J. Wong
  2014-07-27 23:42   ` Theodore Ts'o
  2014-07-26  0:35 ` [PATCH 16/18] e2fsck: leave room for checksum structure when salvaging a directory Darrick J. Wong
                   ` (2 subsequent siblings)
  17 siblings, 1 reply; 46+ messages in thread
From: Darrick J. Wong @ 2014-07-26  0:35 UTC (permalink / raw)
  To: tytso, darrick.wong; +Cc: linux-ext4

If the badblocks inode fails checksum verification, just clear the
inode and move on.  If we don't do this, we can end up importing a lot
of garbage into the badblocks list, which will then cause fsck to try
to regenerate anything that was sitting atop the supposedly damaged
blocks.  Given that most hardware will remap bad sectors transparently
from ext4, the number of people this could affect adversely is pretty
low.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
 e2fsck/pass1.c |    6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)


diff --git a/e2fsck/pass1.c b/e2fsck/pass1.c
index 3734645..c02f5e9 100644
--- a/e2fsck/pass1.c
+++ b/e2fsck/pass1.c
@@ -1015,8 +1015,10 @@ void e2fsck_pass1(e2fsck_t ctx)
 			struct process_block_struct pb;
 			int flags;
 
-			if ((inode->i_mode || inode->i_uid || inode->i_gid ||
-			     inode->i_links_count || inode->i_file_acl) &&
+			if ((failed_csum || inode->i_mode || inode->i_uid ||
+			     inode->i_gid || inode->i_links_count ||
+			     (inode->i_flags & EXT4_INLINE_DATA_FL) ||
+			     inode->i_file_acl) &&
 			    fix_problem(ctx, PR_1_INVALID_BAD_INODE, &pctx)) {
 				memset(inode, 0, sizeof(struct ext2_inode));
 				e2fsck_write_inode(ctx, ino, inode,


^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH 16/18] e2fsck: leave room for checksum structure when salvaging a directory
  2014-07-26  0:33 [PATCH 00/18] e2fsprogs patchbomb 7/14, part 2 Darrick J. Wong
                   ` (14 preceding siblings ...)
  2014-07-26  0:35 ` [PATCH 15/18] e2fsck: clear badblocks inode when checksum fails Darrick J. Wong
@ 2014-07-26  0:35 ` Darrick J. Wong
  2014-07-27 23:45   ` Theodore Ts'o
  2014-07-26  0:35 ` [PATCH 17/18] e2fsck: make insert_dirent_tail more robust Darrick J. Wong
  2014-07-26  0:35 ` [PATCH 18/18] e2fsck: don't offer to fix the checksum of fixed extents Darrick J. Wong
  17 siblings, 1 reply; 46+ messages in thread
From: Darrick J. Wong @ 2014-07-26  0:35 UTC (permalink / raw)
  To: tytso, darrick.wong; +Cc: linux-ext4

When we're salvaging a directory, leave room at the end of the block
for the checksum entry so that e2fsck can write the checksummed dir
block out later.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
 e2fsck/pass2.c |   19 ++++++++++++-------
 1 file changed, 12 insertions(+), 7 deletions(-)


diff --git a/e2fsck/pass2.c b/e2fsck/pass2.c
index 9394a29..925d1a2 100644
--- a/e2fsck/pass2.c
+++ b/e2fsck/pass2.c
@@ -665,7 +665,8 @@ clear_and_exit:
 static void salvage_directory(ext2_filsys fs,
 			      struct ext2_dir_entry *dirent,
 			      struct ext2_dir_entry *prev,
-			      unsigned int *offset)
+			      unsigned int *offset,
+			      unsigned int block_len)
 {
 	char	*cp = (char *) dirent;
 	int left;
@@ -673,7 +674,7 @@ static void salvage_directory(ext2_filsys fs,
 	unsigned int name_len = ext2fs_dirent_name_len(dirent);
 
 	(void) ext2fs_get_rec_len(fs, dirent, &rec_len);
-	left = fs->blocksize - *offset - rec_len;
+	left = block_len - *offset - rec_len;
 
 	/*
 	 * Special case of directory entry of size 8: copy what's left
@@ -703,7 +704,7 @@ static void salvage_directory(ext2_filsys fs,
 	 * previous directory entry absorb the invalid one.
 	 */
 	if (prev && rec_len && (rec_len % 4) == 0 &&
-	    (*offset + rec_len <= fs->blocksize)) {
+	    (*offset + rec_len <= block_len)) {
 		(void) ext2fs_get_rec_len(fs, prev, &prev_rec_len);
 		prev_rec_len += rec_len;
 		(void) ext2fs_set_rec_len(fs, prev_rec_len, prev);
@@ -718,11 +719,11 @@ static void salvage_directory(ext2_filsys fs,
 	 */
 	if (prev) {
 		(void) ext2fs_get_rec_len(fs, prev, &prev_rec_len);
-		prev_rec_len += fs->blocksize - *offset;
+		prev_rec_len += block_len - *offset;
 		(void) ext2fs_set_rec_len(fs, prev_rec_len, prev);
 		*offset = fs->blocksize;
 	} else {
-		rec_len = fs->blocksize - *offset;
+		rec_len = block_len - *offset;
 		(void) ext2fs_set_rec_len(fs, rec_len, dirent);
 		ext2fs_dirent_set_name_len(dirent, 0);
 		ext2fs_dirent_set_file_type(dirent, EXT2_FT_UNKNOWN);
@@ -995,8 +996,12 @@ skip_checksum:
 			    (rec_len < 12) ||
 			    ((rec_len % 4) != 0) ||
 			    ((ext2fs_dirent_name_len(dirent) + 8) > rec_len)) {
-				if (fix_problem(ctx, PR_2_DIR_CORRUPTED, &cd->pctx)) {
-					salvage_directory(fs, dirent, prev, &offset);
+				if (fix_problem(ctx, PR_2_DIR_CORRUPTED,
+						&cd->pctx)) {
+					salvage_directory(fs, dirent, prev,
+							  &offset,
+							  fs->blocksize -
+							  de_csum_size);
 					dir_modified++;
 					continue;
 				} else


^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH 17/18] e2fsck: make insert_dirent_tail more robust
  2014-07-26  0:33 [PATCH 00/18] e2fsprogs patchbomb 7/14, part 2 Darrick J. Wong
                   ` (15 preceding siblings ...)
  2014-07-26  0:35 ` [PATCH 16/18] e2fsck: leave room for checksum structure when salvaging a directory Darrick J. Wong
@ 2014-07-26  0:35 ` Darrick J. Wong
  2014-07-27 23:48   ` Theodore Ts'o
  2014-07-26  0:35 ` [PATCH 18/18] e2fsck: don't offer to fix the checksum of fixed extents Darrick J. Wong
  17 siblings, 1 reply; 46+ messages in thread
From: Darrick J. Wong @ 2014-07-26  0:35 UTC (permalink / raw)
  To: tytso, darrick.wong; +Cc: linux-ext4

Fix the routine that adds dirent checksum structures to the directory
block to handle oddball situations a bit more robustly.

First, when we're walking the entry array, we might encounter an
entry that ends exactly one byte before where the checksum entry needs
to start, i.e. there's space for the tail entry, but it needs to be
reinitialized.  When that happens, we should proceed until d points to
that space so that the tail entry can be initialized.

Second, it's possible that we've been fed a directory block where the
entries end just short of the end of the block.  In this case, we need
to adjust the size of the last entry to point exactly to where the
dirent tail starts.  The current code requires that entries end
exactly on the block boundary, but this is not always the case with
damaged filesystems.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
 e2fsck/pass2.c                       |   15 ++++++---------
 tests/f_corrupt_dirent_tail/expect.1 |   16 ++++++++++++++++
 tests/f_corrupt_dirent_tail/expect.2 |    7 +++++++
 tests/f_corrupt_dirent_tail/image.gz |  Bin
 tests/f_corrupt_dirent_tail/name     |    1 +
 5 files changed, 30 insertions(+), 9 deletions(-)
 create mode 100644 tests/f_corrupt_dirent_tail/expect.1
 create mode 100644 tests/f_corrupt_dirent_tail/expect.2
 create mode 100644 tests/f_corrupt_dirent_tail/image.gz
 create mode 100644 tests/f_corrupt_dirent_tail/name


diff --git a/e2fsck/pass2.c b/e2fsck/pass2.c
index 925d1a2..f1299ec 100644
--- a/e2fsck/pass2.c
+++ b/e2fsck/pass2.c
@@ -740,6 +740,7 @@ static int is_last_entry(ext2_filsys fs, int inline_data_size,
 		return (offset < fs->blocksize - csum_size);
 }
 
+#define NEXT_DIRENT(d)	((void *)((char *)(d) + (d)->rec_len))
 static errcode_t insert_dirent_tail(ext2_filsys fs, void *dirbuf)
 {
 	struct ext2_dir_entry *d;
@@ -750,20 +751,15 @@ static errcode_t insert_dirent_tail(ext2_filsys fs, void *dirbuf)
 	d = dirbuf;
 	top = EXT2_DIRENT_TAIL(dirbuf, fs->blocksize);
 
-	rec_len = d->rec_len;
-	while (rec_len && !(rec_len & 0x3)) {
-		d = (struct ext2_dir_entry *)(((char *)d) + rec_len);
-		if (((void *)d) + d->rec_len >= top)
-			break;
-		rec_len = d->rec_len;
-	}
+	while (d->rec_len && !(d->rec_len & 0x3) && NEXT_DIRENT(d) <= top)
+		d = NEXT_DIRENT(d);
 
 	if (d != top) {
 		size_t min_size = EXT2_DIR_REC_LEN(
 				ext2fs_dirent_name_len(dirbuf));
-		if (min_size > d->rec_len - sizeof(struct ext2_dir_entry_tail))
+		if (min_size > top - (void *)d)
 			return EXT2_ET_DIR_NO_SPACE_FOR_CSUM;
-		d->rec_len -= sizeof(struct ext2_dir_entry_tail);
+		d->rec_len = top - (void *)d;
 	}
 
 	t = (struct ext2_dir_entry_tail *)top;
@@ -774,6 +770,7 @@ static errcode_t insert_dirent_tail(ext2_filsys fs, void *dirbuf)
 
 	return 0;
 }
+#undef NEXT_DIRENT
 
 static int check_dir_block(ext2_filsys fs,
 			   struct ext2_db_entry2 *db,
diff --git a/tests/f_corrupt_dirent_tail/expect.1 b/tests/f_corrupt_dirent_tail/expect.1
new file mode 100644
index 0000000..0813755
--- /dev/null
+++ b/tests/f_corrupt_dirent_tail/expect.1
@@ -0,0 +1,16 @@
+Pass 1: Checking inodes, blocks, and sizes
+Pass 2: Checking directory structure
+Directory inode 2, block #0, offset 0: directory has no checksum.
+Fix? yes
+
+Directory inode 2, block #0, offset 1012: directory corrupted
+Salvage? yes
+
+Pass 3: Checking directory connectivity
+Pass 3A: Optimizing directories
+Pass 4: Checking reference counts
+Pass 5: Checking group summary information
+
+test_filesys: ***** FILE SYSTEM WAS MODIFIED *****
+test_filesys: 11/128 files (9.1% non-contiguous), 1090/2048 blocks
+Exit status is 1
diff --git a/tests/f_corrupt_dirent_tail/expect.2 b/tests/f_corrupt_dirent_tail/expect.2
new file mode 100644
index 0000000..c42466d
--- /dev/null
+++ b/tests/f_corrupt_dirent_tail/expect.2
@@ -0,0 +1,7 @@
+Pass 1: Checking inodes, blocks, and sizes
+Pass 2: Checking directory structure
+Pass 3: Checking directory connectivity
+Pass 4: Checking reference counts
+Pass 5: Checking group summary information
+test_filesys: 11/128 files (9.1% non-contiguous), 1090/2048 blocks
+Exit status is 0
diff --git a/tests/f_corrupt_dirent_tail/image.gz b/tests/f_corrupt_dirent_tail/image.gz
new file mode 100644
index 0000000000000000000000000000000000000000..f2753080fbe81ae0ae09fb62cc9f177810694f3b
GIT binary patch
literal 2685
zcmb2|=3odBS`^I0{PynoY?)9Qh7a%0<lLT-Gy6!_S;d46a$HjbL|8XAnJMh7$S8X6
z*~MyjLE?kT>f^5*c~+=ToYV5{!NJ)|8!t51H^k}4Et&F7L2&ZqjO*ssyG`uFqC0kF
zKKsmhe`e*$H|Of&pZzZLd+*9ndR5EZS@8Mgce9m}G~aGMtL_te@s(MB@A@<DTW){O
zj(omnRqV^Xc}s3v%q<K5+`--3(YanMEb9E3Gh6%qe0conSbqIK^P4^Iew@9q>~Z#b
z+rO``&p$VBcgfes#qaGuzO9b@mSpmIcO~z7@ryS~>hAg4zu))yqsU{;YeDvBzwiBb
zW7@TpUnhUfxq4!iU8AE_!S6T8{LBA7)|~h1%KxBU*H=q_zVtrcNAKwL_y6u@=f_LT
zz1Ylv3O>BgcrSP};ZpwAg~v|q%$+sy>E*z)#g}_*=FeOA?8DADp6BcJb@o+$`g7=#
z^m-}n#QnAJ`*T+Rk9H0Hf8gKMo*yrN8<($}eq6fsU&F4e|DPV6tbWCQMZLpc=2!9o
z{~La>U$I~CkMS%2iuwl!SO35IeEx40)7`5B&8ylYUG6Vk{d?`>xi5VgDmHB4&&_7u
zn#jU};<|>t&u{!VRipoJefGbPPab^!_jKtCpUN%9{}*kRD>@R8ZoOyTS2fR98tS%I
zyH01{|7nwCw5jv$|Mp#XiskFV-p{Ij9kBOv{kPVt=k>{jALcLo>UV2SmPuJ(eWcIZ
zPkW!g`}qB7_tPKWKeIQ>{I~jfJmSmg{ZhZGAJ=b8{iDBMdfBeu7e$T-_biXlR(faN
z7#@}SuU&ra>mUEW9jv<kf9lPH^=(%_{d^v>|M`3ywSNzPE}!{mVr<;+r|(bw{=ay3
z>E8d0D4s#(H0)jP5xDMcm4fR|mHeGLl{;U(cGheQ?s3_A@AX3w_eD|fD|A+U?ZamT
ms)3`t(GVC7fzc2c4S~@R7!3jXgg}D*Pj0pYQjrV{3Jd^J4vVh<

literal 0
HcmV?d00001

diff --git a/tests/f_corrupt_dirent_tail/name b/tests/f_corrupt_dirent_tail/name
new file mode 100644
index 0000000..08259a3
--- /dev/null
+++ b/tests/f_corrupt_dirent_tail/name
@@ -0,0 +1 @@
+rebuild a directory with corrupt dirent tail


^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH 18/18] e2fsck: don't offer to fix the checksum of fixed extents
  2014-07-26  0:33 [PATCH 00/18] e2fsprogs patchbomb 7/14, part 2 Darrick J. Wong
                   ` (16 preceding siblings ...)
  2014-07-26  0:35 ` [PATCH 17/18] e2fsck: make insert_dirent_tail more robust Darrick J. Wong
@ 2014-07-26  0:35 ` Darrick J. Wong
  2014-07-27 23:52   ` Theodore Ts'o
  17 siblings, 1 reply; 46+ messages in thread
From: Darrick J. Wong @ 2014-07-26  0:35 UTC (permalink / raw)
  To: tytso, darrick.wong; +Cc: linux-ext4

If an extent fails checksum and the sanity checks, and the user elects
to fix the extents, don't bother asking (the second time) if the user
would like to fix the checksum.  Refactor some redundant code to make
what's going on a little cleaner.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
 e2fsck/pass1.c |   31 +++++++++++++------------------
 1 file changed, 13 insertions(+), 18 deletions(-)


diff --git a/e2fsck/pass1.c b/e2fsck/pass1.c
index c02f5e9..f5e3f11 100644
--- a/e2fsck/pass1.c
+++ b/e2fsck/pass1.c
@@ -2056,12 +2056,14 @@ static void scan_extent_node(e2fsck_t ctx, struct problem_context *pctx,
 		last_lblk = extent.e_lblk + extent.e_len - 1;
 
 		problem = 0;
+		pctx->blk = extent.e_pblk;
+		pctx->blk2 = extent.e_lblk;
+		pctx->num = extent.e_len;
+		pctx->blkcount = extent.e_lblk + extent.e_len;
+
 		/* Ask to clear a corrupt extent block */
 		if (try_repairs &&
 		    pctx->errcode == EXT2_ET_EXTENT_CSUM_INVALID) {
-			pctx->blk = extent.e_pblk;
-			pctx->blk2 = extent.e_lblk;
-			pctx->num = extent.e_len;
 			problem = PR_1_EXTENT_CSUM_INVALID;
 			if (fix_problem(ctx, problem, pctx))
 				goto fix_problem_now;
@@ -2105,25 +2107,18 @@ static void scan_extent_node(e2fsck_t ctx, struct problem_context *pctx,
 			failed_csum = 0;
 		}
 
-		/* Corrupt but passes checks?  Ask to fix checksum. */
-		if (try_repairs && failed_csum) {
-			pctx->blk = extent.e_pblk;
-			pctx->blk2 = extent.e_lblk;
-			pctx->num = extent.e_len;
-			problem = 0;
-			if (fix_problem(ctx, PR_1_EXTENT_ONLY_CSUM_INVALID,
-					pctx)) {
-				pb->inode_modified = 1;
-				ext2fs_extent_replace(ehandle, 0, &extent);
-			}
+		/* Failed csum but passes checks?  Ask to fix checksum. */
+		if (try_repairs && failed_csum && problem == 0 &&
+		    fix_problem(ctx, PR_1_EXTENT_ONLY_CSUM_INVALID, pctx)) {
+			pb->inode_modified = 1;
+			pctx->errcode = ext2fs_extent_replace(ehandle,
+							0, &extent);
+			if (pctx->errcode)
+				return;
 		}
 
 		if (try_repairs && problem) {
 report_problem:
-			pctx->blk = extent.e_pblk;
-			pctx->blk2 = extent.e_lblk;
-			pctx->num = extent.e_len;
-			pctx->blkcount = extent.e_lblk + extent.e_len;
 			if (fix_problem(ctx, problem, pctx)) {
 fix_problem_now:
 				if (ctx->invalid_bitmaps) {


^ permalink raw reply related	[flat|nested] 46+ messages in thread

* Re: [PATCH 04/18] e2fsck: fix rule-violating lblk->pblk mappings on bigalloc filesystems
  2014-07-26  0:34 ` [PATCH 04/18] e2fsck: fix rule-violating lblk->pblk mappings on bigalloc filesystems Darrick J. Wong
@ 2014-07-26  6:02   ` Andreas Dilger
  2014-07-26 20:27     ` Theodore Ts'o
  0 siblings, 1 reply; 46+ messages in thread
From: Andreas Dilger @ 2014-07-26  6:02 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: tytso, linux-ext4

Wouldn't it be possible to use this information to determine which
of the inodes sharing a duplicate mapped block is the right one
and which inode is the bad one?

Cheers, Andreas

> On Jul 25, 2014, at 18:34, "Darrick J. Wong" <darrick.wong@oracle.com> wrote:
> 
> As far as I can tell, logical block mappings on a bigalloc filesystem are
> supposed to follow a few constraints:
> 
> * The logical cluster offset must match the physical cluster offset.
> * A logical cluster may not map to multiple physical clusters.
> 
> Since the multiply-claimed block recovery code can be used to fix these
> problems, teach e2fsck to find these transgressions and fix them.
> 
> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> ---
> e2fsck/pass1.c              |   52 +++++++++++
> e2fsck/pass1b.c             |   42 ++++++++-
> e2fsck/problem.c            |    5 +
> e2fsck/problem.h            |    3 +
> tests/f_badcluster/expect   |  198 +++++++++++++++++++++++++++++++++++++++++++
> tests/f_badcluster/image.gz |  Bin
> tests/f_badcluster/name     |    2 
> tests/f_badcluster/script   |   25 +++++
> 8 files changed, 320 insertions(+), 7 deletions(-)
> create mode 100644 tests/f_badcluster/expect
> create mode 100644 tests/f_badcluster/image.gz
> create mode 100644 tests/f_badcluster/name
> create mode 100644 tests/f_badcluster/script
> 
> 
> diff --git a/e2fsck/pass1.c b/e2fsck/pass1.c
> index a6552e5..646ef8a 100644
> --- a/e2fsck/pass1.c
> +++ b/e2fsck/pass1.c
> @@ -1945,6 +1945,40 @@ void e2fsck_clear_inode(e2fsck_t ctx, ext2_ino_t ino,
>    e2fsck_write_inode(ctx, ino, inode, source);
> }
> 
> +/*
> + * Use the multiple-blocks reclamation code to fix alignment problems in
> + * a bigalloc filesystem.  We want a logical cluster to map to *only* one
> + * physical cluster, and we want the block offsets within that cluster to
> + * line up.
> + */
> +static int has_unaligned_cluster_map(e2fsck_t ctx,
> +                     blk64_t last_pblk, e2_blkcnt_t last_lblk,
> +                     blk64_t pblk, blk64_t lblk)
> +{
> +    blk64_t cluster_mask;
> +
> +    if (!ctx->fs->cluster_ratio_bits)
> +        return 0;
> +    cluster_mask = EXT2FS_CLUSTER_MASK(ctx->fs);
> +
> +    /*
> +     * If the block in the logical cluster doesn't align with the block in
> +     * the physical cluster...
> +     */
> +    if ((lblk & cluster_mask) != (pblk & cluster_mask))
> +        return 1;
> +
> +    /*
> +     * If we cross a physical cluster boundary within a logical cluster...
> +     */
> +    if (last_pblk && (lblk & cluster_mask) != 0 &&
> +        EXT2FS_B2C(ctx->fs, lblk) == EXT2FS_B2C(ctx->fs, last_lblk) &&
> +        EXT2FS_B2C(ctx->fs, pblk) != EXT2FS_B2C(ctx->fs, last_pblk))
> +        return 1;
> +
> +    return 0;
> +}
> +
> static void scan_extent_node(e2fsck_t ctx, struct problem_context *pctx,
>                 struct process_block_struct *pb,
>                 blk64_t start_block, blk64_t end_block,
> @@ -2249,7 +2283,16 @@ alloc_later:
>                mark_block_used(ctx, blk);
>                pb->num_blocks++;
>            }
> -
> +            if (has_unaligned_cluster_map(ctx, pb->previous_block,
> +                              pb->last_block, blk,
> +                              blockcnt)) {
> +                pctx->blk = blockcnt;
> +                pctx->blk2 = blk;
> +                fix_problem(ctx, PR_1_MISALIGNED_CLUSTER, pctx);
> +                mark_block_used(ctx, blk);
> +                mark_block_used(ctx, blk);
> +            }
> +            pb->last_block = blockcnt;
>            pb->previous_block = blk;
> 
>            if (is_dir) {
> @@ -2815,6 +2858,13 @@ static int process_block(ext2_filsys fs,
>             ((unsigned) blockcnt & EXT2FS_CLUSTER_MASK(ctx->fs)))) {
>        mark_block_used(ctx, blk);
>        p->num_blocks++;
> +    } else if (has_unaligned_cluster_map(ctx, p->previous_block,
> +                         p->last_block, blk, blockcnt)) {
> +        pctx->blk = blockcnt;
> +        pctx->blk2 = blk;
> +        fix_problem(ctx, PR_1_MISALIGNED_CLUSTER, pctx);
> +        mark_block_used(ctx, blk);
> +        mark_block_used(ctx, blk);
>    }
>    if (blockcnt >= 0)
>        p->last_block = blockcnt;
> diff --git a/e2fsck/pass1b.c b/e2fsck/pass1b.c
> index 8d42d10..c0bfa07 100644
> --- a/e2fsck/pass1b.c
> +++ b/e2fsck/pass1b.c
> @@ -261,7 +261,7 @@ struct process_block_struct {
>    e2fsck_t    ctx;
>    ext2_ino_t    ino;
>    int        dup_blocks;
> -    blk64_t        cur_cluster;
> +    blk64_t        cur_cluster, phys_cluster;
>    blk64_t        last_blk;
>    struct ext2_inode *inode;
>    struct problem_context *pctx;
> @@ -317,6 +317,7 @@ static void pass1b(e2fsck_t ctx, char *block_buf)
>        pb.dup_blocks = 0;
>        pb.inode = &inode;
>        pb.cur_cluster = ~0;
> +        pb.phys_cluster = ~0;
>        pb.last_blk = 0;
>        pb.pctx->blk = pb.pctx->blk2 = 0;
> 
> @@ -360,7 +361,7 @@ static int process_pass1b_block(ext2_filsys fs EXT2FS_ATTR((unused)),
> {
>    struct process_block_struct *p;
>    e2fsck_t ctx;
> -    blk64_t    lc;
> +    blk64_t    lc, pc;
>    problem_t op;
> 
>    if (HOLE_BLKADDR(*block_nr))
> @@ -368,6 +369,7 @@ static int process_pass1b_block(ext2_filsys fs EXT2FS_ATTR((unused)),
>    p = (struct process_block_struct *) priv_data;
>    ctx = p->ctx;
>    lc = EXT2FS_B2C(fs, blockcnt);
> +    pc = EXT2FS_B2C(fs, *block_nr);
> 
>    if (!ext2fs_test_block_bitmap2(ctx->block_dup_map, *block_nr))
>        goto finish;
> @@ -389,11 +391,19 @@ static int process_pass1b_block(ext2_filsys fs EXT2FS_ATTR((unused)),
>    p->dup_blocks++;
>    ext2fs_mark_inode_bitmap2(inode_dup_map, p->ino);
> 
> -    if (blockcnt < 0 || lc != p->cur_cluster)
> +    /*
> +     * Qualifications for submitting a block for duplicate processing:
> +     * It's an extent/indirect block (and has a negative logical offset);
> +     * we've crossed a logical cluster boundary; or the physical cluster
> +     * suddenly changed, which indicates that blocks in a logical cluster
> +     * are mapped to multiple physical clusters.
> +     */
> +    if (blockcnt < 0 || lc != p->cur_cluster || pc != p->phys_cluster)
>        add_dupe(ctx, p->ino, EXT2FS_B2C(fs, *block_nr), p->inode);
> 
> finish:
>    p->cur_cluster = lc;
> +    p->phys_cluster = pc;
>    return 0;
> }
> 
> @@ -563,7 +573,11 @@ static void pass1d(e2fsck_t ctx, char *block_buf)
>            pctx.dir = t->dir;
>            fix_problem(ctx, PR_1D_DUP_FILE_LIST, &pctx);
>        }
> -        if (file_ok) {
> +        /*
> +         * Even if the file shares blocks with itself, we still need to
> +         * clone the blocks.
> +         */
> +        if (file_ok && (meta_data ? shared_len+1 : shared_len) != 0) {
>            fix_problem(ctx, PR_1D_DUP_BLOCKS_DEALT, &pctx);
>            continue;
>        }
> @@ -706,9 +720,10 @@ struct clone_struct {
>    errcode_t    errcode;
>    blk64_t        dup_cluster;
>    blk64_t        alloc_block;
> -    ext2_ino_t    dir;
> +    ext2_ino_t    dir, ino;
>    char    *buf;
>    e2fsck_t ctx;
> +    struct ext2_inode    *inode;
> };
> 
> static int clone_file_block(ext2_filsys fs,
> @@ -756,13 +771,26 @@ static int clone_file_block(ext2_filsys fs,
>            decrement_badcount(ctx, *block_nr, p);
> 
>        cs->dup_cluster = c;
> -
> +        /*
> +         * Let's try an implied cluster allocation.  If we get the same
> +         * cluster back, then we need to find a new block; otherwise,
> +         * we're merely fixing the problem of one logical cluster being
> +         * mapped to multiple physical clusters.
> +         */
> +        new_block = 0;
> +        retval = ext2fs_map_cluster_block(fs, cs->ino, cs->inode,
> +                          blockcnt, &new_block);
> +        if (retval == 0 && new_block != 0 &&
> +            EXT2FS_B2C(ctx->fs, new_block) !=
> +            EXT2FS_B2C(ctx->fs, *block_nr))
> +            goto cluster_alloc_ok;
>        retval = ext2fs_new_block2(fs, 0, ctx->block_found_map,
>                       &new_block);
>        if (retval) {
>            cs->errcode = retval;
>            return BLOCK_ABORT;
>        }
> +cluster_alloc_ok:
>        cs->alloc_block = new_block;
> 
>    got_block:
> @@ -817,6 +845,8 @@ static errcode_t clone_file(e2fsck_t ctx, ext2_ino_t ino,
>    cs.dup_cluster = ~0;
>    cs.alloc_block = 0;
>    cs.ctx = ctx;
> +    cs.ino = ino;
> +    cs.inode = &dp->inode;
>    retval = ext2fs_get_mem(fs->blocksize, &cs.buf);
>    if (retval)
>        return retval;
> diff --git a/e2fsck/problem.c b/e2fsck/problem.c
> index 60c02af..4da8ba8 100644
> --- a/e2fsck/problem.c
> +++ b/e2fsck/problem.c
> @@ -1048,6 +1048,11 @@ static struct e2fsck_problem problem_table[] = {
>      N_("@d @i %i has @x marked uninitialized at @b %c.  "),
>      PROMPT_FIX, PR_PREEN_OK },
> 
> +    /* Inode logical block (physical block ) is misaligned. */
> +    { PR_1_MISALIGNED_CLUSTER,
> +      N_("@i %i logical @b %b (physical @b %c) violates cluster allocation rules.\nWill fix in pass 1B.\n"),
> +      PROMPT_NONE, 0 },
> +
>    /* Pass 1b errors */
> 
>    /* Pass 1B: Rescan for duplicate/bad blocks */
> diff --git a/e2fsck/problem.h b/e2fsck/problem.h
> index 6cd3d50..80ef4a2 100644
> --- a/e2fsck/problem.h
> +++ b/e2fsck/problem.h
> @@ -609,6 +609,9 @@ struct problem_context {
> /* uninit directory block */
> #define PR_1_UNINIT_DBLOCK        0x010073
> 
> +/* Inode logical block is misaligned */
> +#define PR_1_MISALIGNED_CLUSTER        0x010074
> +
> /*
>  * Pass 1b errors
>  */
> diff --git a/tests/f_badcluster/expect b/tests/f_badcluster/expect
> new file mode 100644
> index 0000000..eb3bcf0
> --- /dev/null
> +++ b/tests/f_badcluster/expect
> @@ -0,0 +1,198 @@
> +Pass 1: Checking inodes, blocks, and sizes
> +Inode 12 logical block 2 (physical block 1154) violates cluster allocation rules.
> +Will fix in pass 1B.
> +Inode 12, i_blocks is 32, should be 64.  Fix? yes
> +
> +Inode 16 logical block 5 (physical block 1173) violates cluster allocation rules.
> +Will fix in pass 1B.
> +Inode 16, i_size is 3072, should be 6144.  Fix? yes
> +
> +Inode 16, i_blocks is 32, should be 64.  Fix? yes
> +
> +Inode 17 logical block 0 (physical block 1186) violates cluster allocation rules.
> +Will fix in pass 1B.
> +Inode 17 logical block 2 (physical block 1184) violates cluster allocation rules.
> +Will fix in pass 1B.
> +Inode 17, i_blocks is 32, should be 64.  Fix? yes
> +
> +Inode 18 logical block 3 (physical block 1201) violates cluster allocation rules.
> +Will fix in pass 1B.
> +Inode 18, i_blocks is 32, should be 64.  Fix? yes
> +
> +
> +Running additional passes to resolve blocks claimed by more than one inode...
> +Pass 1B: Rescanning for multiply-claimed blocks
> +Multiply-claimed block(s) in inode 12: 1154
> +Multiply-claimed block(s) in inode 13: 1152--1154
> +Multiply-claimed block(s) in inode 14: 1648--1650
> +Multiply-claimed block(s) in inode 15: 1650
> +Multiply-claimed block(s) in inode 16: 1173
> +Multiply-claimed block(s) in inode 17: 1186 1185 1184
> +Multiply-claimed block(s) in inode 18: 1201
> +Pass 1C: Scanning directories for inodes with multiply-claimed blocks
> +Pass 1D: Reconciling multiply-claimed blocks
> +(There are 7 inodes containing multiply-claimed blocks.)
> +
> +File /a (inode #12, mod time Tue Jun 17 08:00:50 2014) 
> +  has 1 multiply-claimed block(s), shared with 1 file(s):
> +    /b (inode #13, mod time Tue Jun 17 08:00:50 2014)
> +Clone multiply-claimed blocks? yes
> +
> +File /b (inode #13, mod time Tue Jun 17 08:00:50 2014) 
> +  has 1 multiply-claimed block(s), shared with 1 file(s):
> +    /a (inode #12, mod time Tue Jun 17 08:00:50 2014)
> +Multiply-claimed blocks already reassigned or cloned.
> +
> +File /c (inode #14, mod time Tue Jun 17 08:00:50 2014) 
> +  has 1 multiply-claimed block(s), shared with 1 file(s):
> +    /d (inode #15, mod time Tue Jun 17 08:00:50 2014)
> +Clone multiply-claimed blocks? yes
> +
> +File /d (inode #15, mod time Tue Jun 17 08:00:50 2014) 
> +  has 1 multiply-claimed block(s), shared with 1 file(s):
> +    /c (inode #14, mod time Tue Jun 17 08:00:50 2014)
> +Multiply-claimed blocks already reassigned or cloned.
> +
> +File /e (inode #16, mod time Tue Jun 17 08:00:50 2014) 
> +  has 1 multiply-claimed block(s), shared with 0 file(s):
> +Clone multiply-claimed blocks? yes
> +
> +File /f (inode #17, mod time Tue Jun 17 08:00:50 2014) 
> +  has 1 multiply-claimed block(s), shared with 0 file(s):
> +Clone multiply-claimed blocks? yes
> +
> +File /g (inode #18, mod time Tue Jun 17 08:00:50 2014) 
> +  has 1 multiply-claimed block(s), shared with 0 file(s):
> +Clone multiply-claimed blocks? yes
> +
> +Pass 2: Checking directory structure
> +Pass 3: Checking directory connectivity
> +Pass 4: Checking reference counts
> +Pass 5: Checking group summary information
> +Free blocks count wrong for group #0 (50, counted=47).
> +Fix? yes
> +
> +Free blocks count wrong (800, counted=752).
> +Fix? yes
> +
> +
> +test_fs: ***** FILE SYSTEM WAS MODIFIED *****
> +test_fs: 18/128 files (22.2% non-contiguous), 1296/2048 blocks
> +Pass 1: Checking inodes, blocks, and sizes
> +Inode 12, i_blocks is 64, should be 32.  Fix? yes
> +
> +Inode 16, i_blocks is 64, should be 32.  Fix? yes
> +
> +Inode 17, i_blocks is 64, should be 32.  Fix? yes
> +
> +Inode 18, i_blocks is 64, should be 32.  Fix? yes
> +
> +Pass 2: Checking directory structure
> +Pass 3: Checking directory connectivity
> +Pass 4: Checking reference counts
> +Pass 5: Checking group summary information
> +Block bitmap differences:  -(1168--1200)
> +Fix? yes
> +
> +Free blocks count wrong for group #0 (47, counted=50).
> +Fix? yes
> +
> +Free blocks count wrong (752, counted=800).
> +Fix? yes
> +
> +
> +test_fs: ***** FILE SYSTEM WAS MODIFIED *****
> +test_fs: 18/128 files (5.6% non-contiguous), 1248/2048 blocks
> +Pass 1: Checking inodes, blocks, and sizes
> +Pass 2: Checking directory structure
> +Pass 3: Checking directory connectivity
> +Pass 4: Checking reference counts
> +Pass 5: Checking group summary information
> +test_fs: 18/128 files (5.6% non-contiguous), 1248/2048 blocks
> +debugfs:  stat /a
> +Inode: 12   Type: regular    Mode:  0644   Flags: 0x80000
> +Generation: 1117152157    Version: 0x00000001
> +User:     0   Group:     0   Size: 3072
> +File ACL: 0    Directory ACL: 0
> +Links: 1   Blockcount: 32
> +Fragment:  Address: 0    Number: 0    Size: 0
> +ctime: 0x539ff5b2 -- Tue Jun 17 08:00:50 2014
> +atime: 0x539ff5b2 -- Tue Jun 17 08:00:50 2014
> +mtime: 0x539ff5b2 -- Tue Jun 17 08:00:50 2014
> +EXTENTS:
> +(0-1):1136-1137, (2):1138
> +debugfs:  stat /b
> +Inode: 13   Type: regular    Mode:  0644   Flags: 0x80000
> +Generation: 1117152158    Version: 0x00000001
> +User:     0   Group:     0   Size: 3072
> +File ACL: 0    Directory ACL: 0
> +Links: 1   Blockcount: 32
> +Fragment:  Address: 0    Number: 0    Size: 0
> +ctime: 0x539ff5b2 -- Tue Jun 17 08:00:50 2014
> +atime: 0x539ff5b2 -- Tue Jun 17 08:00:50 2014
> +mtime: 0x539ff5b2 -- Tue Jun 17 08:00:50 2014
> +EXTENTS:
> +(0-2):1152-1154
> +debugfs:  stat /c
> +Inode: 14   Type: regular    Mode:  0644   Flags: 0x80000
> +Generation: 1117152159    Version: 0x00000001
> +User:     0   Group:     0   Size: 3072
> +File ACL: 0    Directory ACL: 0
> +Links: 1   Blockcount: 32
> +Fragment:  Address: 0    Number: 0    Size: 0
> +ctime: 0x539ff5b2 -- Tue Jun 17 08:00:50 2014
> +atime: 0x539ff5b2 -- Tue Jun 17 08:00:50 2014
> +mtime: 0x539ff5b2 -- Tue Jun 17 08:00:50 2014
> +EXTENTS:
> +(0-1):1216-1217, (2):1218
> +debugfs:  stat /d
> +Inode: 15   Type: regular    Mode:  0644   Flags: 0x0
> +Generation: 1117152160    Version: 0x00000001
> +User:     0   Group:     0   Size: 3072
> +File ACL: 0    Directory ACL: 0
> +Links: 1   Blockcount: 32
> +Fragment:  Address: 0    Number: 0    Size: 0
> +ctime: 0x539ff5b2 -- Tue Jun 17 08:00:50 2014
> +atime: 0x539ff5b2 -- Tue Jun 17 08:00:50 2014
> +mtime: 0x539ff5b2 -- Tue Jun 17 08:00:50 2014
> +BLOCKS:
> +(TIND):1650
> +TOTAL: 1
> +
> +debugfs:  stat /e
> +Inode: 16   Type: regular    Mode:  0644   Flags: 0x80000
> +Generation: 1117152161    Version: 0x00000001
> +User:     0   Group:     0   Size: 6144
> +File ACL: 0    Directory ACL: 0
> +Links: 1   Blockcount: 32
> +Fragment:  Address: 0    Number: 0    Size: 0
> +ctime: 0x539ff5b2 -- Tue Jun 17 08:00:50 2014
> +atime: 0x539ff5b2 -- Tue Jun 17 08:00:50 2014
> +mtime: 0x539ff5b2 -- Tue Jun 17 08:00:50 2014
> +EXTENTS:
> +(0-2):1664-1666, (5):1669
> +debugfs:  stat /f
> +Inode: 17   Type: regular    Mode:  0644   Flags: 0x80000
> +Generation: 1117152162    Version: 0x00000001
> +User:     0   Group:     0   Size: 3072
> +File ACL: 0    Directory ACL: 0
> +Links: 1   Blockcount: 32
> +Fragment:  Address: 0    Number: 0    Size: 0
> +ctime: 0x539ff5b2 -- Tue Jun 17 08:00:50 2014
> +atime: 0x539ff5b2 -- Tue Jun 17 08:00:50 2014
> +mtime: 0x539ff5b2 -- Tue Jun 17 08:00:50 2014
> +EXTENTS:
> +(0):1232, (1):1233, (2):1234
> +debugfs:  stat /g
> +Inode: 18   Type: regular    Mode:  0644   Flags: 0x80000
> +Generation: 1117152163    Version: 0x00000001
> +User:     0   Group:     0   Size: 3072
> +File ACL: 0    Directory ACL: 0
> +Links: 1   Blockcount: 32
> +Fragment:  Address: 0    Number: 0    Size: 0
> +ctime: 0x539ff5b2 -- Tue Jun 17 08:00:50 2014
> +atime: 0x539ff5b2 -- Tue Jun 17 08:00:50 2014
> +mtime: 0x539ff5b2 -- Tue Jun 17 08:00:50 2014
> +EXTENTS:
> +(0-2):1680-1682, (3):1683
> +debugfs:  
> \ No newline at end of file
> diff --git a/tests/f_badcluster/image.gz b/tests/f_badcluster/image.gz
> new file mode 100644
> index 0000000000000000000000000000000000000000..e02ee1866868361e6e82bfbe29982365b2e3aa55
> GIT binary patch
> literal 3131
> zcmeH`eNfVO9LF)YGArHAl6flRwbV_eOvC0A4VPY_dFBj}kaU;3`An&ypmc4h9VhIf
> zG;+!qGBs1@gsG+2gU*y0Y50JcUV;y)D1xAjhu^x}b(h}$+@E~@_&j|7`F=jH&*u%L
> z*?=Z=?ARUF%2(nv2aF3y#X5Z+g%u<)LiS(y@}tx>cPz)&G0Qe&A*OK@Raq{8^q_F-
> zbB{#>iF|Dz=b)|E490O%W`;sOzn*Dtd$W+q>f!h2aWn3xA)rK!kjYY12Cz2VS^ia?
> zJKzn{oZ*j8$vKNhzKW^O#IziXdU8FzD*|<eD@NYnUd~^Sa_N2j2lXf(hj_l!WF(Xz
> z^@k=+XFN;5_JiKlfolqV3uj7S$m~OhGOU?ARvy2Z3xiBLmvwf9s?~oQZlIxv;1!zu
> zbzM+@IZwGj2G3`h`BuLoys*(veF#j!uuCmX7XnXQP{(9>l8|3~Qb0L3W_@O8Bou2S
> z#LwPyeSHU4SGh|_3O@<jVw-ulMAQSCYtw-BmUoczrKImN`3b_5sj-MBHFstycPr9Q
> zyDc_LT!%)FD#_s5hLIl5UXnDPD@{u}7ViyT0SH2%u*0u$j#2PSivDR6be36cLR7JI
> z>P9Io$LN#}Hy;+P)k<}%cC(5=1%>EC`mJ`K9DhLsMXNnTY7b5%xHS-GMy*X>nRI$s
> zn4{EJv>{B*zOB8Z6U4SCeQg`RoRS%(Z&ez2MRsCB5pII*p9!eO5eo6@SX?5Fv~;uk
> zMYgdE|JNdavOxx(M+a9JGE#v?A50YtD7>vh#2-d%9rOK)Okcl?dlmE3Udha-PBwlx
> zw<*_LcZb-D6dfZP2dfe{<%37d#Ki?z2z!rK7Yx~>r3c$JM1<s%-6TO8VzA^PvIA2p
> zr@^=X@pVnNrWr5+)70e+;E{U_fA>jX*E;3D-245!%MWU2YQwru2>?*S{bGVA^+1FB
> z8-;3y!|ILos18QJB|+U;YN8OP0+aw@s+)wy9IvNaJ!a&tRX-rM!{|#hNL**Yt+QAN
> zJh9_Kysv$_Bh~mb8`5JR&ddJMoa1}#T#i6&vzy$;rqF-rzqsGWd$8<0N-IB0;jPp~
> z22J;hV9a6Hf&ScQfGR<EF*<B2wZ4daZ;nx^i0)Iqacq`d{&rb6l&4^4_yVusv+Swi
> z1ak*`wtFPmFfS9Bu<1ZooEm+61;F<$9g=?5M=jQYQ<c*bH6}vQ?^dJ9a3$sY-ekEM
> zucY8tA{+GF>N<>h`Nd1)Ci8h@bC`iSpmOlA7MwV}^GPJS^lC&^e~*O6GRb2rQ1h+d
> zB*q7XIxlAAlVK8>hUhH`Qg4!ZahPpe$wQ2fuN{FV4;STob=J3cl1ux;sphvTO-et5
> za=9(yX3Y%vwwp~zE$z|267_p4(cyK{(@S=d<tY<GLiz(@c9y?d_xCfA^QuQ;p;6U-
> zgkHj3QM@#;1&o%EeQo}R%l?CeiQ{1XPRZF1iUXvrAI<|wei)IyMDNH*+K}a%dA%B_
> zeZmRAs#0yds@p(lDKZT^e7?L|o5?bXMz^W(-3eHtg@A>Cg@A>Cg@A>?|4(4iWQ{Gq
> K#Rh>6g8l&!3vo*T
> 
> literal 0
> HcmV?d00001
> 
> diff --git a/tests/f_badcluster/name b/tests/f_badcluster/name
> new file mode 100644
> index 0000000..266f81c
> --- /dev/null
> +++ b/tests/f_badcluster/name
> @@ -0,0 +1,2 @@
> +test alignment problems with bigalloc clusters
> +
> diff --git a/tests/f_badcluster/script b/tests/f_badcluster/script
> new file mode 100644
> index 0000000..ba6b248
> --- /dev/null
> +++ b/tests/f_badcluster/script
> @@ -0,0 +1,25 @@
> +if test -x $DEBUGFS_EXE; then
> +    IMAGE=$test_dir/../f_badcluster/image.gz
> +    OUT=$test_name.log
> +    EXP=$test_dir/expect
> +    gzip -d < $IMAGE > $TMPFILE
> +    ../misc/tune2fs -L test_fs $TMPFILE
> +    ../e2fsck/e2fsck -fy $TMPFILE > $OUT
> +    ../e2fsck/e2fsck -fy $TMPFILE >> $OUT
> +    ../e2fsck/e2fsck -fy $TMPFILE >> $OUT
> +    for i in a b c d e f g; do echo "stat /$i"; done | $DEBUGFS_EXE $TMPFILE >> $OUT
> +
> +    cmp -s $OUT $EXP
> +    status=$?
> +
> +    if [ "$status" = 0 ]; then
> +        echo "$test_name: $test_description: ok"
> +        touch $test_name.ok
> +    else
> +        echo "$test_name: $test_description: failed"
> +        diff $DIFF_OPTS $EXP $OUT > $test_name.failed
> +        rm -f $test_name.tmp
> +    fi
> +else
> +    echo "$test_name: skipped"
> +fi
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 01/18] e2fsck: reserve blocks for root/lost+found directory repair
  2014-07-26  0:33 ` [PATCH 01/18] e2fsck: reserve blocks for root/lost+found directory repair Darrick J. Wong
@ 2014-07-26 19:47   ` Theodore Ts'o
  2014-07-28  7:27     ` Darrick J. Wong
  0 siblings, 1 reply; 46+ messages in thread
From: Theodore Ts'o @ 2014-07-26 19:47 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: linux-ext4

On Fri, Jul 25, 2014 at 05:33:45PM -0700, Darrick J. Wong wrote:
> +static void reserve_block_for_lnf_repair(e2fsck_t ctx)
> +{
> +	blk64_t		blk = 0;
> +	errcode_t	err;
> +	ext2_filsys	fs = ctx->fs;
> +	const char	*name = "lost+found";
> +	ext2_ino_t	ino;
> +
> +	ctx->lnf_repair_block = 0;
> +	if (!ext2fs_lookup(fs, EXT2_ROOT_INO, name, sizeof(name)-1, 0, &ino))
> +		return;

Let me guess... this originally read:

	const char	name[] = "lost+found";

But you changed it without rerunning the regression tests.  :-(

Another reason why there is no such thing as not running the
regression tests too many times, even after the most trivial changes.

I'll fix this up and commit it, with the following comment added:

    [ Fixed up an obvious C trap: const char * and const char [] are not
      the same thing when you are taking the size of the parameter.
      People, run your regression tests!  Like spinache, it's good for you.  :-)
      -- tytso ]

						- Ted


^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 02/18] e2fsck: fix merge error in "clear uninit flag on directory extents"
  2014-07-26  0:33 ` [PATCH 02/18] e2fsck: fix merge error in "clear uninit flag on directory extents" Darrick J. Wong
@ 2014-07-26 20:04   ` Theodore Ts'o
  0 siblings, 0 replies; 46+ messages in thread
From: Theodore Ts'o @ 2014-07-26 20:04 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: linux-ext4

On Fri, Jul 25, 2014 at 05:33:51PM -0700, Darrick J. Wong wrote:
> In the original patch (against -next), the hunk to fix uninit dirs was
> just prior to the hunk labelled "Corrupt but passes checks?".  The
> hunks are ordered this way so that if e2fsck obtains permission to fix
> a failed-csum extent (which in turn fixes the checksum), it will not
> subsequently ask to (re)fix the checksum.
> 
> Due to a merge error the hunk moved to the wrong place, so put it
> back.
> 
> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>

Thanks, applied.

					- Ted

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 03/18] e2fsck: perform implied cluster allocations when filling a directory hole
  2014-07-26  0:33 ` [PATCH 03/18] e2fsck: perform implied cluster allocations when filling a directory hole Darrick J. Wong
@ 2014-07-26 20:08   ` Theodore Ts'o
  0 siblings, 0 replies; 46+ messages in thread
From: Theodore Ts'o @ 2014-07-26 20:08 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: linux-ext4

On Fri, Jul 25, 2014 at 05:33:57PM -0700, Darrick J. Wong wrote:
> If we're filling a directory hole, we need to perform an implied
> cluster allocation to satisfy the bigalloc rule of mapping only one
> pblk to a logical cluster.
> 
> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>

Thanks, applied.

						- Ted

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 04/18] e2fsck: fix rule-violating lblk->pblk mappings on bigalloc filesystems
  2014-07-26  6:02   ` Andreas Dilger
@ 2014-07-26 20:27     ` Theodore Ts'o
  2014-07-28  8:28       ` Darrick J. Wong
  2014-07-28 17:55       ` Darrick J. Wong
  0 siblings, 2 replies; 46+ messages in thread
From: Theodore Ts'o @ 2014-07-26 20:27 UTC (permalink / raw)
  To: Andreas Dilger; +Cc: Darrick J. Wong, linux-ext4

On Sat, Jul 26, 2014 at 12:02:10AM -0600, Andreas Dilger wrote:
> Wouldn't it be possible to use this information to determine which
> of the inodes sharing a duplicate mapped block is the right one
> and which inode is the bad one?

Yes, although I'm not sure it's worth the effort.  Putting in more
intelligent hueristics for using the metadata checksums will help all
file systems, and not just bigalloc file systems --- and I think it
would be easier.

> On Jul 25, 2014, at 18:34, "Darrick J. Wong" <darrick.wong@oracle.com> wrote:
> 
> As far as I can tell, logical block mappings on a bigalloc filesystem are
> supposed to follow a few constraints:
> 
> * The logical cluster offset must match the physical cluster offset.
> * A logical cluster may not map to multiple physical clusters.
> 
> Since the multiply-claimed block recovery code can be used to fix these
> problems, teach e2fsck to find these transgressions and fix them.
> 
> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>

Thanks, applied, but a couple of tips about writing test scripts.

You originally had this:

	../misc/tune2fs -L test_fs $TMPFILE
	../e2fsck/e2fsck -fy $TMPFILE > $OUT
	../e2fsck/e2fsck -fy $TMPFILE >> $OUT
	../e2fsck/e2fsck -fy $TMPFILE >> $OUT

This spits out the version numbers to stderr, which is ugly.  I fixed
up the binary image file to have the test_fs label, and I changed the
lines above to:

	$FSCK -fy $TMPFILE 2>&1 | sed -f $cmd_dir/filter.sed > $OUT
	$FSCK -fy $TMPFILE 2>&1 | sed -f $cmd_dir/filter.sed >> $OUT
	$FSCK -fy $TMPFILE 2>&1 | sed -f $cmd_dir/filter.sed >> $OUT

The advantage of doing things this way is that any errors get captured
in the log file.  Also, if the user has requested valgrind be used,
that gets reflected in the value of $FSCK.

Cheers,

					- Ted

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 05/18] e2fsck: during pass1b delete_file, only free a cluster once
  2014-07-26  0:34 ` [PATCH 05/18] e2fsck: during pass1b delete_file, only free a cluster once Darrick J. Wong
@ 2014-07-26 20:30   ` Theodore Ts'o
  0 siblings, 0 replies; 46+ messages in thread
From: Theodore Ts'o @ 2014-07-26 20:30 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: linux-ext4

On Fri, Jul 25, 2014 at 05:34:10PM -0700, Darrick J. Wong wrote:
> If we're forced to delete a crosslinked file, only call
> ext2fs_block_alloc_stats2() on cluster boundaries, since the block
> bitmaps are all cluster bitmaps at this point.  It's safe to do this
> only once per cluster since we know all the blocks are going away.
> 
> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>

Thanks, applied.

						- Ted

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 07/18] e2fsck: verify checksums after checking everything else
  2014-07-26  0:34 ` [PATCH 07/18] e2fsck: verify checksums after checking everything else Darrick J. Wong
@ 2014-07-26 20:53   ` Theodore Ts'o
  2014-07-28  8:27     ` Darrick J. Wong
  0 siblings, 1 reply; 46+ messages in thread
From: Theodore Ts'o @ 2014-07-26 20:53 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: linux-ext4

On Fri, Jul 25, 2014 at 05:34:22PM -0700, Darrick J. Wong wrote:
> There's a particular problem with e2fsck's user interface where
> checksum errors are concerned:  Fixing the first complaint about
> a checksum problem results in the inode being cleared even if e2fsck
> could otherwise have recovered it.  While this mode is useful for
> cleaning the remaining broken crud off the filesystem, we could at
> least default to checking everything /else/ and only complaining about
> the incorrect checksum if fsck finds nothing else wrong.
> 
> So, plumb in a config option.  We default to "verify and checksum"
> unless the user tell us otherwise.

I'm not convinced this is the right way to go.  Telling the user that
they need to muck with the config file depending on what sort of file
system corruption they have seems rather unsatisfying.

This is what I'd much rather do.  Add a "sanity checking" mode to the
inode scanning functions which gets enabled when EXT2_SF_SANITY_CHECK
is set via ext2fs_inode_scan_flags().  What the sanity check mode does
is every time the inode scan functions read in a new inode table
block, it performs a "sanity check" on the inode table block.  

The sanity check is carried out as follows.  If a majority of the
inodes in the inode table block are "insane" then set the
EXT2_SF_INSANE_ITABLE_BLOCK flag in scan flags, if not, clear this
flag.  If checksum is incorrect, the inode is considered insane.  If
the extent flag is set, and the extent header looks insane, then the
inode is considered insane.  For indirect blocks, if more than 50% of
the blocks in i_blocks[] are invalid, then inode is considered insane.

This is basically a simiplified version of an algorithm which Andreas
has been carrying in Lustre's e2fsprogs for a while, which tries to
apply a hueristic check over multiple inodes to decide whether if we
would be better off just zapping all of the inodes in an inode table
block.  The reason why I never integrated that change into mainline is
that in order to make it work, it violated a large number of
abstractions, and so I considered too ugly to live.

The advantage of doing this all inside lib/ext2fs/inode.c's inode
scanning function is that it's much cleaner.  We can't do as many
checks as Andreas did, but for the rough hueristic of deciding whether
we have a minor problem in a single inode, or a massive problem caused
by garbage written into the inode table or another inode table block
getting written into the wrong place on disk (which we can only do if
metadata checksums are enabled, but that's OK), we can get away with
doing only the obvious "local" checks.

After all, in practice, it's usually either problems in a single inode
(usually caused by a kernel bug or a memory bit flip), or complete
garbage written into the inode table block, or an inode table block
written to wrong place on disk, on top of another inode table block.
So we just need a rough hueristic to distinguish between these cases.

Once we've decided whether the entire inode table block is insane or
not, then what we do is if an inode has any problems at all during the
pass1 scan, we check to see if the inode table block is marked insane.
If it is considered insane, then we just clear the i_links_count and
set dtime, effectively zapping the inode, no questions asked.
Otherwise, we proceed doing the individual fix ups of each inode field.

Does that make sense?

					- Ted

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 06/18] dumpe2fs: add switch to disable checksum verification
  2014-07-26  0:34 ` [PATCH 06/18] dumpe2fs: add switch to disable checksum verification Darrick J. Wong
@ 2014-07-26 20:58   ` Theodore Ts'o
  2014-07-28  7:48     ` Darrick J. Wong
  0 siblings, 1 reply; 46+ messages in thread
From: Theodore Ts'o @ 2014-07-26 20:58 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: linux-ext4

On Fri, Jul 25, 2014 at 05:34:16PM -0700, Darrick J. Wong wrote:
> Add a -n switch to turn off checksum verification.

Instead of adding a -n flag, I wonder if the better thing to do is if
the various functions that might return a checksum error error out, we
print a warning message indicating checksum failure occured, and then
retry with EXT2_FLAG_IGNORE_CSUM_ERRORS.  That is, either retry the
ext2fs_open with the IGNORE_CSUM_ERRORS, or if the file system is
already open, or in EXT2_FLAG_IGNORE_CSUM_ERRORS into fs->flags and
then retry the ext2fs_read_bitmaps() or whatever.

What do you think?

					- Ted

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 08/18] e2fsck: fix the various checksum error messages
  2014-07-26  0:34 ` [PATCH 08/18] e2fsck: fix the various checksum error messages Darrick J. Wong
@ 2014-07-26 21:09   ` Theodore Ts'o
  2014-07-28  7:57     ` Darrick J. Wong
  0 siblings, 1 reply; 46+ messages in thread
From: Theodore Ts'o @ 2014-07-26 21:09 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: linux-ext4

On Fri, Jul 25, 2014 at 05:34:28PM -0700, Darrick J. Wong wrote:
> Make the "EA block passes checks but fails checksum" message less
> strange, and make the other checksum error messages actually print a
> period at the end of the sentence.
> 
> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>

Applied (with some slight adjustments since I haven't pulled in the
previous patch pending discussion).  I note that there was no need to
adjust any of the expected messages in the e2fsprogs regression tests.

This indicates that we don't have enough (well, any) regression tests
covering the metadata checksum feature.  :-(

						- Ted

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 09/18] e2fsck: insert a missing dirent tail for checksums if possible
  2014-07-26  0:34 ` [PATCH 09/18] e2fsck: insert a missing dirent tail for checksums if possible Darrick J. Wong
@ 2014-07-26 21:13   ` Theodore Ts'o
  0 siblings, 0 replies; 46+ messages in thread
From: Theodore Ts'o @ 2014-07-26 21:13 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: linux-ext4

On Fri, Jul 25, 2014 at 05:34:35PM -0700, Darrick J. Wong wrote:
> If e2fsck is writing a block of directory entries to disk, it should
> adjust the dirents to add the dirent tail if one is missing.  It's not
> a big deal if there's no space to do this since rehash (pass 3A) will
> reconstruct directories for us.  However, we may as well avoid
> unnecessary work.
> 
> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>

Thanks, applied.

					- Ted

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 10/18] e2fsck: write dir blocks after new inode when reconstructing root/lost+found
  2014-07-26  0:34 ` [PATCH 10/18] e2fsck: write dir blocks after new inode when reconstructing root/lost+found Darrick J. Wong
@ 2014-07-26 21:18   ` Theodore Ts'o
  0 siblings, 0 replies; 46+ messages in thread
From: Theodore Ts'o @ 2014-07-26 21:18 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: linux-ext4

On Fri, Jul 25, 2014 at 05:34:41PM -0700, Darrick J. Wong wrote:
> If we trash the root directory block, e2fsck will find inode 11 (the
> old lost+found) and try to attach it to l+f.  The lost+found checker
> also fails to find l+f and tries to add one to the root dir.  The root
> dir is not found but is recreated with incorrect checksums, so linking
> in the l+f dir fails and the l+f '..' entry isn't set.  Since both
> dirs now fail checksum verification, they're both referred to rehash
> to have that fixed, but because l+f doesn't have a '..' entry, rehash
> crashes because l+f has < 2 entries.
> 
> On a checksumming filesystem, the routines in e2fsck that recreate
> /lost+found and / must write the new directory block *after* the inode
> has been written to disk because the checksum depends on i_generation.
> Add a regression test while we're at it.
> 
> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>

Thanks, applied.

						- Ted

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 11/18] libext2/fsck: correctly preserve fs flags when modifying ignore-csum-error flag
  2014-07-26  0:34 ` [PATCH 11/18] libext2/fsck: correctly preserve fs flags when modifying ignore-csum-error flag Darrick J. Wong
@ 2014-07-27 23:27   ` Theodore Ts'o
  2014-07-28  8:06     ` Darrick J. Wong
  0 siblings, 1 reply; 46+ messages in thread
From: Theodore Ts'o @ 2014-07-27 23:27 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: linux-ext4

On Fri, Jul 25, 2014 at 05:34:47PM -0700, Darrick J. Wong wrote:
> nor is it correct
> to assume that we can unconditionally set (or clear) the "ignore csum
> error" flag bit.

For functions inside e2fsck, why is this true?  All of the places
where we are are EXT2_FLAG_IGNORE_CSUM_ERRORS are places where we set
it, and then clear it after an ext2 call.  So as near as I can tell it
shouldn't matter.

I can understand why we need to be careful for functions inside
libext2fs, but in that particular case, none of the downstream
functions of ext2fs_read_inode_full() modify fs->flags.

So I'm really puzzled what problem this patch actually solves.  Was
this a theoretical concern, or was there something I missed?

       		   	       	   	 - Ted

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 12/18] e2fsck: toggle checksum verification error reporting appropriately
  2014-07-26  0:34 ` [PATCH 12/18] e2fsck: toggle checksum verification error reporting appropriately Darrick J. Wong
@ 2014-07-27 23:37   ` Theodore Ts'o
  2014-07-28  7:38     ` Darrick J. Wong
  0 siblings, 1 reply; 46+ messages in thread
From: Theodore Ts'o @ 2014-07-27 23:37 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: linux-ext4

On Fri, Jul 25, 2014 at 05:34:53PM -0700, Darrick J. Wong wrote:
> There are a few mistakes in the checksum verification error reporting
> logic.  First, when we're doing a re-check of an inode that failed
> earlier, we must never ignore checksum errors.  Second, if we're
> performing sanity checks after an initial checksum verification
> failure, then we /should/ disable checksum error reporting for
> block_iterate because that function will re-read the inode from disk.
> This fixes the numerous "inode checksum failure" problems that cause
> e2fsck to abort.

I'm starting to wonder if we just set IGNORE_CSUM_ERRORS when we open
the file system, and explicitly check the checksums in e2fsck.  It
might make the logic clearer, especially when we start trying to be a
bit more sophisticated in handling checksum errors.

Otherwise, if we get a checksum error, we would have to set the flag
and then retry the read.

					- Ted

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 15/18] e2fsck: clear badblocks inode when checksum fails
  2014-07-26  0:35 ` [PATCH 15/18] e2fsck: clear badblocks inode when checksum fails Darrick J. Wong
@ 2014-07-27 23:42   ` Theodore Ts'o
  0 siblings, 0 replies; 46+ messages in thread
From: Theodore Ts'o @ 2014-07-27 23:42 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: linux-ext4

On Fri, Jul 25, 2014 at 05:35:12PM -0700, Darrick J. Wong wrote:
> If the badblocks inode fails checksum verification, just clear the
> inode and move on.  If we don't do this, we can end up importing a lot
> of garbage into the badblocks list, which will then cause fsck to try
> to regenerate anything that was sitting atop the supposedly damaged
> blocks.  Given that most hardware will remap bad sectors transparently
> from ext4, the number of people this could affect adversely is pretty
> low.
> 
> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>

Thanks, applied.

					- Ted

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 16/18] e2fsck: leave room for checksum structure when salvaging a directory
  2014-07-26  0:35 ` [PATCH 16/18] e2fsck: leave room for checksum structure when salvaging a directory Darrick J. Wong
@ 2014-07-27 23:45   ` Theodore Ts'o
  0 siblings, 0 replies; 46+ messages in thread
From: Theodore Ts'o @ 2014-07-27 23:45 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: linux-ext4

On Fri, Jul 25, 2014 at 05:35:18PM -0700, Darrick J. Wong wrote:
> When we're salvaging a directory, leave room at the end of the block
> for the checksum entry so that e2fsck can write the checksummed dir
> block out later.
> 
> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>

Applied, thanks.

					- Ted

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 17/18] e2fsck: make insert_dirent_tail more robust
  2014-07-26  0:35 ` [PATCH 17/18] e2fsck: make insert_dirent_tail more robust Darrick J. Wong
@ 2014-07-27 23:48   ` Theodore Ts'o
  0 siblings, 0 replies; 46+ messages in thread
From: Theodore Ts'o @ 2014-07-27 23:48 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: linux-ext4

On Fri, Jul 25, 2014 at 05:35:25PM -0700, Darrick J. Wong wrote:
> Fix the routine that adds dirent checksum structures to the directory
> block to handle oddball situations a bit more robustly.
> 
> First, when we're walking the entry array, we might encounter an
> entry that ends exactly one byte before where the checksum entry needs
> to start, i.e. there's space for the tail entry, but it needs to be
> reinitialized.  When that happens, we should proceed until d points to
> that space so that the tail entry can be initialized.
> 
> Second, it's possible that we've been fed a directory block where the
> entries end just short of the end of the block.  In this case, we need
> to adjust the size of the last entry to point exactly to where the
> dirent tail starts.  The current code requires that entries end
> exactly on the block boundary, but this is not always the case with
> damaged filesystems.
> 
> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>

Applied, thanks.

					- Ted

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 18/18] e2fsck: don't offer to fix the checksum of fixed extents
  2014-07-26  0:35 ` [PATCH 18/18] e2fsck: don't offer to fix the checksum of fixed extents Darrick J. Wong
@ 2014-07-27 23:52   ` Theodore Ts'o
  0 siblings, 0 replies; 46+ messages in thread
From: Theodore Ts'o @ 2014-07-27 23:52 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: linux-ext4

On Fri, Jul 25, 2014 at 05:35:31PM -0700, Darrick J. Wong wrote:
> If an extent fails checksum and the sanity checks, and the user elects
> to fix the extents, don't bother asking (the second time) if the user
> would like to fix the checksum.  Refactor some redundant code to make
> what's going on a little cleaner.
> 
> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>

Applied, thanks.

					- Ted

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 01/18] e2fsck: reserve blocks for root/lost+found directory repair
  2014-07-26 19:47   ` Theodore Ts'o
@ 2014-07-28  7:27     ` Darrick J. Wong
  0 siblings, 0 replies; 46+ messages in thread
From: Darrick J. Wong @ 2014-07-28  7:27 UTC (permalink / raw)
  To: Theodore Ts'o; +Cc: linux-ext4

On Sat, Jul 26, 2014 at 03:47:03PM -0400, Theodore Ts'o wrote:
> On Fri, Jul 25, 2014 at 05:33:45PM -0700, Darrick J. Wong wrote:
> > +static void reserve_block_for_lnf_repair(e2fsck_t ctx)
> > +{
> > +	blk64_t		blk = 0;
> > +	errcode_t	err;
> > +	ext2_filsys	fs = ctx->fs;
> > +	const char	*name = "lost+found";
> > +	ext2_ino_t	ino;
> > +
> > +	ctx->lnf_repair_block = 0;
> > +	if (!ext2fs_lookup(fs, EXT2_ROOT_INO, name, sizeof(name)-1, 0, &ino))
> > +		return;
> 
> Let me guess... this originally read:
> 
> 	const char	name[] = "lost+found";
> 
> But you changed it without rerunning the regression tests.  :-(

Oops.

Actually, I /did/ rerun the regression tests, and nothing blew up.  Now I'm
puzzling over why it worked.  I'll look into that tomorrow.

Thanks for catching that.

--D
> 
> Another reason why there is no such thing as not running the
> regression tests too many times, even after the most trivial changes.
> 
> I'll fix this up and commit it, with the following comment added:
> 
>     [ Fixed up an obvious C trap: const char * and const char [] are not
>       the same thing when you are taking the size of the parameter.
>       People, run your regression tests!  Like spinache, it's good for you.  :-)
>       -- tytso ]
> 
> 						- Ted
> 

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 12/18] e2fsck: toggle checksum verification error reporting appropriately
  2014-07-27 23:37   ` Theodore Ts'o
@ 2014-07-28  7:38     ` Darrick J. Wong
  2014-07-28 11:41       ` Theodore Ts'o
  0 siblings, 1 reply; 46+ messages in thread
From: Darrick J. Wong @ 2014-07-28  7:38 UTC (permalink / raw)
  To: Theodore Ts'o; +Cc: linux-ext4

On Sun, Jul 27, 2014 at 07:37:22PM -0400, Theodore Ts'o wrote:
> On Fri, Jul 25, 2014 at 05:34:53PM -0700, Darrick J. Wong wrote:
> > There are a few mistakes in the checksum verification error reporting
> > logic.  First, when we're doing a re-check of an inode that failed
> > earlier, we must never ignore checksum errors.  Second, if we're
> > performing sanity checks after an initial checksum verification
> > failure, then we /should/ disable checksum error reporting for
> > block_iterate because that function will re-read the inode from disk.
> > This fixes the numerous "inode checksum failure" problems that cause
> > e2fsck to abort.
> 
> I'm starting to wonder if we just set IGNORE_CSUM_ERRORS when we open
> the file system, and explicitly check the checksums in e2fsck.  It
> might make the logic clearer, especially when we start trying to be a
> bit more sophisticated in handling checksum errors.

We could make e2fsck verify the checksums itself, though we'd have to provide a
way for either (a) libext2fs to provide the raw buffer data to e2fsck or (b)
e2fsck to find the block number in question and (re)read the raw buffer, since
the checksums are computed against the on-disk structures.

This could get a bit nasty for EA and extent handling, since e2fsck doesn't
touch the underlying blocks directly, and the checksums are per-block, not per
FS object.

> Otherwise, if we get a checksum error, we would have to set the flag
> and then retry the read.

Also doable.

--D
> 
> 					- Ted

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 06/18] dumpe2fs: add switch to disable checksum verification
  2014-07-26 20:58   ` Theodore Ts'o
@ 2014-07-28  7:48     ` Darrick J. Wong
  0 siblings, 0 replies; 46+ messages in thread
From: Darrick J. Wong @ 2014-07-28  7:48 UTC (permalink / raw)
  To: Theodore Ts'o; +Cc: linux-ext4

On Sat, Jul 26, 2014 at 04:58:08PM -0400, Theodore Ts'o wrote:
> On Fri, Jul 25, 2014 at 05:34:16PM -0700, Darrick J. Wong wrote:
> > Add a -n switch to turn off checksum verification.
> 
> Instead of adding a -n flag, I wonder if the better thing to do is if
> the various functions that might return a checksum error error out, we
> print a warning message indicating checksum failure occured, and then
> retry with EXT2_FLAG_IGNORE_CSUM_ERRORS.  That is, either retry the
> ext2fs_open with the IGNORE_CSUM_ERRORS, or if the file system is
> already open, or in EXT2_FLAG_IGNORE_CSUM_ERRORS into fs->flags and
> then retry the ext2fs_read_bitmaps() or whatever.
> 
> What do you think?

My reason for this approach is that forcing the user to tack on "-n" makes it
less likely that a checksum error will be buried in the output and forgotten.

That said I don't have any objection to this approach either.  The checksum
complaint could be printed at the end.  I think the error message should also
tell the user to run e2fsck.

--D
> 
> 					- Ted

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 08/18] e2fsck: fix the various checksum error messages
  2014-07-26 21:09   ` Theodore Ts'o
@ 2014-07-28  7:57     ` Darrick J. Wong
  0 siblings, 0 replies; 46+ messages in thread
From: Darrick J. Wong @ 2014-07-28  7:57 UTC (permalink / raw)
  To: Theodore Ts'o; +Cc: linux-ext4

On Sat, Jul 26, 2014 at 05:09:53PM -0400, Theodore Ts'o wrote:
> On Fri, Jul 25, 2014 at 05:34:28PM -0700, Darrick J. Wong wrote:
> > Make the "EA block passes checks but fails checksum" message less
> > strange, and make the other checksum error messages actually print a
> > period at the end of the sentence.
> > 
> > Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> 
> Applied (with some slight adjustments since I haven't pulled in the
> previous patch pending discussion).  I note that there was no need to
> adjust any of the expected messages in the e2fsprogs regression tests.
> 
> This indicates that we don't have enough (well, any) regression tests
> covering the metadata checksum feature.  :-(

That's what the metadata checksum test script was supposed to be about.  I'm
not sure if it's actually getting through to anyone -- Google doesn't seem to
have any links to it.

It's a big script that takes a bunch of mke2fs flags and runs through a bunch
of corruption detection tests to see if the running kernel and e2fsck notice.
The nice thing about the script is that it can cycle through a lot of different
features, block sizes, etc. pretty quickly, which seems difficult to do with
the plain 'make check' tests.

I could look into turning them into a bunch of ext4 xfstests.

--D
> 
> 						- Ted
> --
> To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 11/18] libext2/fsck: correctly preserve fs flags when modifying ignore-csum-error flag
  2014-07-27 23:27   ` Theodore Ts'o
@ 2014-07-28  8:06     ` Darrick J. Wong
  0 siblings, 0 replies; 46+ messages in thread
From: Darrick J. Wong @ 2014-07-28  8:06 UTC (permalink / raw)
  To: Theodore Ts'o; +Cc: linux-ext4

On Sun, Jul 27, 2014 at 07:27:46PM -0400, Theodore Ts'o wrote:
> On Fri, Jul 25, 2014 at 05:34:47PM -0700, Darrick J. Wong wrote:
> > nor is it correct
> > to assume that we can unconditionally set (or clear) the "ignore csum
> > error" flag bit.
> 
> For functions inside e2fsck, why is this true?  All of the places
> where we are are EXT2_FLAG_IGNORE_CSUM_ERRORS are places where we set
> it, and then clear it after an ext2 call.  So as near as I can tell it
> shouldn't matter.
> 
> I can understand why we need to be careful for functions inside
> libext2fs, but in that particular case, none of the downstream
> functions of ext2fs_read_inode_full() modify fs->flags.
> 
> So I'm really puzzled what problem this patch actually solves.  Was
> this a theoretical concern, or was there something I missed?

It's mostly a theoretical concern -- rather than having to decide where it's ok
to set the flag and unset it later vs. where we have to set the flag and then
restore it to the previous value, I'd rather just be careful every time, so
that the next time I read through this code I'll know that IGNORE_CSUM_ERRORS
is always put back to its previous value and the other flags are left alone, no
matter how much e2fsck or libext2fs might get refactored.

One of those sites where we modify IGNORE_CSUM_ERRORS actually would
incorrectly clobber the other flag bits that were set inside the library (I
think it's the write_dir_block4 in pass2.c), so I decided to fix all of the
sites.

--D
> 
>        		   	       	   	 - Ted
> --
> To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 07/18] e2fsck: verify checksums after checking everything else
  2014-07-26 20:53   ` Theodore Ts'o
@ 2014-07-28  8:27     ` Darrick J. Wong
  0 siblings, 0 replies; 46+ messages in thread
From: Darrick J. Wong @ 2014-07-28  8:27 UTC (permalink / raw)
  To: Theodore Ts'o; +Cc: linux-ext4

On Sat, Jul 26, 2014 at 04:53:16PM -0400, Theodore Ts'o wrote:
> On Fri, Jul 25, 2014 at 05:34:22PM -0700, Darrick J. Wong wrote:
> > There's a particular problem with e2fsck's user interface where
> > checksum errors are concerned:  Fixing the first complaint about
> > a checksum problem results in the inode being cleared even if e2fsck
> > could otherwise have recovered it.  While this mode is useful for
> > cleaning the remaining broken crud off the filesystem, we could at
> > least default to checking everything /else/ and only complaining about
> > the incorrect checksum if fsck finds nothing else wrong.
> > 
> > So, plumb in a config option.  We default to "verify and checksum"
> > unless the user tell us otherwise.
> 
> I'm not convinced this is the right way to go.  Telling the user that
> they need to muck with the config file depending on what sort of file
> system corruption they have seems rather unsatisfying.
> 
> This is what I'd much rather do.  Add a "sanity checking" mode to the
> inode scanning functions which gets enabled when EXT2_SF_SANITY_CHECK
> is set via ext2fs_inode_scan_flags().  What the sanity check mode does
> is every time the inode scan functions read in a new inode table
> block, it performs a "sanity check" on the inode table block.  
> 
> The sanity check is carried out as follows.  If a majority of the
> inodes in the inode table block are "insane" then set the
> EXT2_SF_INSANE_ITABLE_BLOCK flag in scan flags, if not, clear this
> flag.  If checksum is incorrect, the inode is considered insane.  If
> the extent flag is set, and the extent header looks insane, then the
> inode is considered insane.  For indirect blocks, if more than 50% of
> the blocks in i_blocks[] are invalid, then inode is considered insane.
> 
> This is basically a simiplified version of an algorithm which Andreas
> has been carrying in Lustre's e2fsprogs for a while, which tries to
> apply a hueristic check over multiple inodes to decide whether if we
> would be better off just zapping all of the inodes in an inode table
> block.  The reason why I never integrated that change into mainline is
> that in order to make it work, it violated a large number of
> abstractions, and so I considered too ugly to live.
> 
> The advantage of doing this all inside lib/ext2fs/inode.c's inode
> scanning function is that it's much cleaner.  We can't do as many
> checks as Andreas did, but for the rough hueristic of deciding whether
> we have a minor problem in a single inode, or a massive problem caused
> by garbage written into the inode table or another inode table block
> getting written into the wrong place on disk (which we can only do if
> metadata checksums are enabled, but that's OK), we can get away with
> doing only the obvious "local" checks.
> 
> After all, in practice, it's usually either problems in a single inode
> (usually caused by a kernel bug or a memory bit flip), or complete
> garbage written into the inode table block, or an inode table block
> written to wrong place on disk, on top of another inode table block.
> So we just need a rough hueristic to distinguish between these cases.
> 
> Once we've decided whether the entire inode table block is insane or
> not, then what we do is if an inode has any problems at all during the
> pass1 scan, we check to see if the inode table block is marked insane.
> If it is considered insane, then we just clear the i_links_count and
> set dtime, effectively zapping the inode, no questions asked.
> Otherwise, we proceed doing the individual fix ups of each inode field.
> 
> Does that make sense?

Yes, that makes sense for dealing with the inodes.  What about the other FS
object blocks, such as directories, EAs, and extents?

Perhaps I'll try to define some insane heuristics:

For EA and extent blocks we could declare the block insane if the checksum
fails and the magic number is missing.  Seems pretty straightforward.

For classic directory blocks, we could declare the block insane if the checksum
fails and the end of the block is not {00 00 00 00 0C 00 00 DE XX XX XX XX}.

For htree directory blocks, we could similarly declare insanity if the checksum
fails and the beginning of the block are not the required fake dir entries.

...and if it's insane, zap it immediately; otherwise, run the usual checks and
fix the checksum if the other checks pass.

Hmm, that doesn't seem so bad.  What do people think?

--D
> 
> 					- Ted
> --
> To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 04/18] e2fsck: fix rule-violating lblk->pblk mappings on bigalloc filesystems
  2014-07-26 20:27     ` Theodore Ts'o
@ 2014-07-28  8:28       ` Darrick J. Wong
  2014-07-28 17:55       ` Darrick J. Wong
  1 sibling, 0 replies; 46+ messages in thread
From: Darrick J. Wong @ 2014-07-28  8:28 UTC (permalink / raw)
  To: Theodore Ts'o; +Cc: Andreas Dilger, linux-ext4

On Sat, Jul 26, 2014 at 04:27:28PM -0400, Theodore Ts'o wrote:
> On Sat, Jul 26, 2014 at 12:02:10AM -0600, Andreas Dilger wrote:
> > Wouldn't it be possible to use this information to determine which
> > of the inodes sharing a duplicate mapped block is the right one
> > and which inode is the bad one?
> 
> Yes, although I'm not sure it's worth the effort.  Putting in more
> intelligent hueristics for using the metadata checksums will help all
> file systems, and not just bigalloc file systems --- and I think it
> would be easier.
> 
> > On Jul 25, 2014, at 18:34, "Darrick J. Wong" <darrick.wong@oracle.com> wrote:
> > 
> > As far as I can tell, logical block mappings on a bigalloc filesystem are
> > supposed to follow a few constraints:
> > 
> > * The logical cluster offset must match the physical cluster offset.
> > * A logical cluster may not map to multiple physical clusters.
> > 
> > Since the multiply-claimed block recovery code can be used to fix these
> > problems, teach e2fsck to find these transgressions and fix them.
> > 
> > Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> 
> Thanks, applied, but a couple of tips about writing test scripts.
> 
> You originally had this:
> 
> 	../misc/tune2fs -L test_fs $TMPFILE
> 	../e2fsck/e2fsck -fy $TMPFILE > $OUT
> 	../e2fsck/e2fsck -fy $TMPFILE >> $OUT
> 	../e2fsck/e2fsck -fy $TMPFILE >> $OUT
> 
> This spits out the version numbers to stderr, which is ugly.  I fixed
> up the binary image file to have the test_fs label, and I changed the
> lines above to:
> 
> 	$FSCK -fy $TMPFILE 2>&1 | sed -f $cmd_dir/filter.sed > $OUT
> 	$FSCK -fy $TMPFILE 2>&1 | sed -f $cmd_dir/filter.sed >> $OUT
> 	$FSCK -fy $TMPFILE 2>&1 | sed -f $cmd_dir/filter.sed >> $OUT
> 
> The advantage of doing things this way is that any errors get captured
> in the log file.  Also, if the user has requested valgrind be used,
> that gets reflected in the value of $FSCK.

Got it, thanks.  I'll do that next time.

--D
> 
> Cheers,
> 
> 					- Ted
> --
> To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 12/18] e2fsck: toggle checksum verification error reporting appropriately
  2014-07-28  7:38     ` Darrick J. Wong
@ 2014-07-28 11:41       ` Theodore Ts'o
  0 siblings, 0 replies; 46+ messages in thread
From: Theodore Ts'o @ 2014-07-28 11:41 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: linux-ext4

On Mon, Jul 28, 2014 at 12:38:26AM -0700, Darrick J. Wong wrote:
> 
> We could make e2fsck verify the checksums itself, though we'd have to provide a
> way for either (a) libext2fs to provide the raw buffer data to e2fsck or (b)
> e2fsck to find the block number in question and (re)read the raw buffer, since
> the checksums are computed against the on-disk structures.

Hmm.  Another possibility is we could change the functions so that if
there is a checksum error, we return the error code, but we also
return the contents of the inode, directory entry, etc. to the caller.

       	   	       	   	  	    	   - Ted

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 04/18] e2fsck: fix rule-violating lblk->pblk mappings on bigalloc filesystems
  2014-07-26 20:27     ` Theodore Ts'o
  2014-07-28  8:28       ` Darrick J. Wong
@ 2014-07-28 17:55       ` Darrick J. Wong
  2014-07-28 19:32         ` Theodore Ts'o
  1 sibling, 1 reply; 46+ messages in thread
From: Darrick J. Wong @ 2014-07-28 17:55 UTC (permalink / raw)
  To: Theodore Ts'o; +Cc: Andreas Dilger, linux-ext4

On Sat, Jul 26, 2014 at 04:27:28PM -0400, Theodore Ts'o wrote:
> On Sat, Jul 26, 2014 at 12:02:10AM -0600, Andreas Dilger wrote:
> > Wouldn't it be possible to use this information to determine which
> > of the inodes sharing a duplicate mapped block is the right one
> > and which inode is the bad one?
> 
> Yes, although I'm not sure it's worth the effort.  Putting in more
> intelligent hueristics for using the metadata checksums will help all
> file systems, and not just bigalloc file systems --- and I think it
> would be easier.
> 
> > On Jul 25, 2014, at 18:34, "Darrick J. Wong" <darrick.wong@oracle.com> wrote:
> > 
> > As far as I can tell, logical block mappings on a bigalloc filesystem are
> > supposed to follow a few constraints:
> > 
> > * The logical cluster offset must match the physical cluster offset.
> > * A logical cluster may not map to multiple physical clusters.
> > 
> > Since the multiply-claimed block recovery code can be used to fix these
> > problems, teach e2fsck to find these transgressions and fix them.
> > 
> > Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> 
> Thanks, applied, but a couple of tips about writing test scripts.
> 
> You originally had this:
> 
> 	../misc/tune2fs -L test_fs $TMPFILE
> 	../e2fsck/e2fsck -fy $TMPFILE > $OUT
> 	../e2fsck/e2fsck -fy $TMPFILE >> $OUT
> 	../e2fsck/e2fsck -fy $TMPFILE >> $OUT
> 
> This spits out the version numbers to stderr, which is ugly.  I fixed
> up the binary image file to have the test_fs label, and I changed the
> lines above to:
> 
> 	$FSCK -fy $TMPFILE 2>&1 | sed -f $cmd_dir/filter.sed > $OUT
> 	$FSCK -fy $TMPFILE 2>&1 | sed -f $cmd_dir/filter.sed >> $OUT
> 	$FSCK -fy $TMPFILE 2>&1 | sed -f $cmd_dir/filter.sed >> $OUT
> 
> The advantage of doing things this way is that any errors get captured
> in the log file.  Also, if the user has requested valgrind be used,
> that gets reflected in the value of $FSCK.

Hmm... the f_badcluster test seems to be missing from -maint and -next.
Did it get lost in the merge?

--D
> 
> Cheers,
> 
> 					- Ted
> --
> To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 04/18] e2fsck: fix rule-violating lblk->pblk mappings on bigalloc filesystems
  2014-07-28 17:55       ` Darrick J. Wong
@ 2014-07-28 19:32         ` Theodore Ts'o
  0 siblings, 0 replies; 46+ messages in thread
From: Theodore Ts'o @ 2014-07-28 19:32 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: Andreas Dilger, linux-ext4

On Mon, Jul 28, 2014 at 10:55:54AM -0700, Darrick J. Wong wrote:
> 
> Hmm... the f_badcluster test seems to be missing from -maint and -next.
> Did it get lost in the merge?

Oh, whoops.  It's in my tree, but I forgot to "git add" the directory.
(Because of some patch conflicts that caused "git am" to be unhappy, I
ended up having to apply the patch manually.)

I'll fix it up and push it out.

						- Ted

^ permalink raw reply	[flat|nested] 46+ messages in thread

end of thread, other threads:[~2014-07-28 19:32 UTC | newest]

Thread overview: 46+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-07-26  0:33 [PATCH 00/18] e2fsprogs patchbomb 7/14, part 2 Darrick J. Wong
2014-07-26  0:33 ` [PATCH 01/18] e2fsck: reserve blocks for root/lost+found directory repair Darrick J. Wong
2014-07-26 19:47   ` Theodore Ts'o
2014-07-28  7:27     ` Darrick J. Wong
2014-07-26  0:33 ` [PATCH 02/18] e2fsck: fix merge error in "clear uninit flag on directory extents" Darrick J. Wong
2014-07-26 20:04   ` Theodore Ts'o
2014-07-26  0:33 ` [PATCH 03/18] e2fsck: perform implied cluster allocations when filling a directory hole Darrick J. Wong
2014-07-26 20:08   ` Theodore Ts'o
2014-07-26  0:34 ` [PATCH 04/18] e2fsck: fix rule-violating lblk->pblk mappings on bigalloc filesystems Darrick J. Wong
2014-07-26  6:02   ` Andreas Dilger
2014-07-26 20:27     ` Theodore Ts'o
2014-07-28  8:28       ` Darrick J. Wong
2014-07-28 17:55       ` Darrick J. Wong
2014-07-28 19:32         ` Theodore Ts'o
2014-07-26  0:34 ` [PATCH 05/18] e2fsck: during pass1b delete_file, only free a cluster once Darrick J. Wong
2014-07-26 20:30   ` Theodore Ts'o
2014-07-26  0:34 ` [PATCH 06/18] dumpe2fs: add switch to disable checksum verification Darrick J. Wong
2014-07-26 20:58   ` Theodore Ts'o
2014-07-28  7:48     ` Darrick J. Wong
2014-07-26  0:34 ` [PATCH 07/18] e2fsck: verify checksums after checking everything else Darrick J. Wong
2014-07-26 20:53   ` Theodore Ts'o
2014-07-28  8:27     ` Darrick J. Wong
2014-07-26  0:34 ` [PATCH 08/18] e2fsck: fix the various checksum error messages Darrick J. Wong
2014-07-26 21:09   ` Theodore Ts'o
2014-07-28  7:57     ` Darrick J. Wong
2014-07-26  0:34 ` [PATCH 09/18] e2fsck: insert a missing dirent tail for checksums if possible Darrick J. Wong
2014-07-26 21:13   ` Theodore Ts'o
2014-07-26  0:34 ` [PATCH 10/18] e2fsck: write dir blocks after new inode when reconstructing root/lost+found Darrick J. Wong
2014-07-26 21:18   ` Theodore Ts'o
2014-07-26  0:34 ` [PATCH 11/18] libext2/fsck: correctly preserve fs flags when modifying ignore-csum-error flag Darrick J. Wong
2014-07-27 23:27   ` Theodore Ts'o
2014-07-28  8:06     ` Darrick J. Wong
2014-07-26  0:34 ` [PATCH 12/18] e2fsck: toggle checksum verification error reporting appropriately Darrick J. Wong
2014-07-27 23:37   ` Theodore Ts'o
2014-07-28  7:38     ` Darrick J. Wong
2014-07-28 11:41       ` Theodore Ts'o
2014-07-26  0:34 ` [PATCH 13/18] libext2fs: Don't cache inodes that fail checksum verification Darrick J. Wong
2014-07-26  0:35 ` [PATCH 14/18] e2fsck: always recheck an inode checksum failure Darrick J. Wong
2014-07-26  0:35 ` [PATCH 15/18] e2fsck: clear badblocks inode when checksum fails Darrick J. Wong
2014-07-27 23:42   ` Theodore Ts'o
2014-07-26  0:35 ` [PATCH 16/18] e2fsck: leave room for checksum structure when salvaging a directory Darrick J. Wong
2014-07-27 23:45   ` Theodore Ts'o
2014-07-26  0:35 ` [PATCH 17/18] e2fsck: make insert_dirent_tail more robust Darrick J. Wong
2014-07-27 23:48   ` Theodore Ts'o
2014-07-26  0:35 ` [PATCH 18/18] e2fsck: don't offer to fix the checksum of fixed extents Darrick J. Wong
2014-07-27 23:52   ` Theodore Ts'o

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.