linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: chrisl@gnuchina.org
To: "Theodore Ts'o" <tytso@mit.edu>,
	Ryan Cumming <ryan@completely.kicks-ass.org>,
	Andreas Dilger <adilger@clusterfs.com>,
	linux-kernel@vger.kernel.org, ext2-devel@lists.sourceforge.net
Subject: [PATCH] fix htree dir corrupt after fsck -fD
Date: Sun, 29 Sep 2002 00:03:15 -0700	[thread overview]
Message-ID: <20020929070315.GA6876@vmware.com> (raw)
In-Reply-To: <20020928141330.GA653@think.thunk.org>

OK, I believe this two patch will fix the problem.

When e2fsck rehash the directory index. it will left one empty
dir entry at the end of block if next dir entry can not fit in.
I think it is OK to do so but I also make a patch to make the empty
space merge with previous entry to consistent with the kernel.

dx_make_map need to skip the empty entry. So here is the patch for
review before I push it back to htree BK repository.

I already do the initial test and it fix the problem in kernel and
e2fsck. 

Chris

PS,

Ted, is e2fsck endian free now? I notice some code in
copy_dir_entries() do not have cpu_to_le32 for rec_len etc.



===== namei.c 1.8 vs edited =====
--- 1.8/fs/ext3/namei.c Fri Sep 20 00:40:12 2002
+++ edited/namei.c      Sat Sep 28 23:43:56 2002
@@ -601,11 +601,13 @@

        while ((char *) de < base + size)
        {
-               ext3fs_dirhash(de->name, de->name_len, &h);
-               map[count].hash = h.hash;
-               map[count].offs = (u32) ((char *) de - base);
+               if (de->name_len && de->inode) {
+                       ext3fs_dirhash(de->name, de->name_len, &h);
+                       map[count].hash = h.hash;
+                       map[count].offs = (u32) ((char *) de - base);
+                       count++;
+               }
                de = (ext3_dirent *) ((char *) de + le16_to_cpu(de->rec_len));
-               count++;
        }
        return count;
 }

===== rehash.c 1.3 vs edited =====
--- 1.3/e2fsck/rehash.c Fri Sep  6 07:14:11 2002
+++ edited/rehash.c     Sat Sep 28 23:19:16 2002
@@ -278,22 +278,16 @@
        for (i=0; i < fd->num_array; i++) {
                ent = fd->harray + i;
                rec_len = EXT2_DIR_REC_LEN(ent->dir->name_len & 0xFF);
-               dirent = (struct ext2_dir_entry *) (block_start + offset);
-               left = fs->blocksize - offset;
                if (rec_len > left) {
-                       if (left) {
-                               dirent->rec_len = left;
-                               dirent->inode = 0;
-                               dirent->name_len = 0;
-                               offset += left;
-                               left = 0;
-                       }
+                       if (left)
+                               dirent->rec_len += left;
                        if ((retval = get_next_block(fs, outdir,
                                                      &block_start)))
                                return retval;
-                       offset = 0; left = fs->blocksize;
-                       dirent = (struct ext2_dir_entry *) block_start;
+                       offset = 0;
                }
+               left = fs->blocksize - offset;
+               dirent = (struct ext2_dir_entry *) (block_start + offset);
                if (offset == 0) {
                        if (ent->hash == prev_hash)
                                outdir->hashes[outdir->num-1] = ent->hash | 1;

On Sat, Sep 28, 2002 at 10:13:30AM -0400, Theodore Ts'o wrote:
> On Fri, Sep 27, 2002 at 06:20:27PM -0700, Ryan Cumming wrote:
> >
> > This is while deleteing an old fsstress directory (a full fsck had been 
> > performed since the last time the fsstress directory had been touched) while 
> > running a few instances of the attached program.
> > 
> > You guys have any idea what's going on yet? 
> 
> Some ideas yes, although I don't have a complete solution yet.
> 
> I've been able to replicate it now fairly reliably, with the attached
> shell script and 2.4.19 with the 2.4.19-2 dxdir patch.  It appears to
> be somewhat timing dependent, as where the directory corruption occurs
> is not consistent, but I believe it is in the split code.  Since
> e2fsck -fD packs all of the directories completely, it means that any
> attempt to add a file to directory will guarantee at least one split,
> and possibly two levels of tree splits.  Since the -D option to e2fsck
> has only been relatively recently been available, I believe this is
> why it hasn't been noticed up until now in the testing; directories
> which are indexed "naturally" as they grow don't appear to trigger the
> problem, or are very, very unlikely to trigger the problem.  (One
> potential avenue for exploration is that -D option perfectly sorts all
> of the directory entries in hash order, which doesn't normally occur
> for naturally grown directories, and this may be triggering a
> fencepost error in the split code.)
> 
> The other thing which I've developed is a patch to e2fsprogs 1.29
> (also attached) which fixes the directory corruption without causing
> files to end up in lost+found.  I'm using the dxdir patch in
> production, and I was first able to replicate your problem after I ran
> e2fsck -fD on my /usr partition.  At that point, /usr/bin and
> /usr/share/man/man8 got corrupted.  So I modified e2fsck to be able to
> correct the problem without throwing a directory block's worth of
> dirents into lost+found.
> 
> The nature of the corruption is that a directory entry of size 8
> (which is enough room for a zero-length name) is left in the
> directory.  This is harmless, but it should never happen normally, and
> so the ext3 sanity-checking code flags it as an error.  With this
> patch, e2fsck is much smarter about salvaging corrupt directories, and
> so it can do so without causing any directory entries to be lost.
> (This corrupted, too-small directory entry appears at the beginning of
> the directory block, which is another reason why I strongly suspect
> the dx_split code.)
> 
> BTW, Andreas, I've tried your stack-usage reduction patch (modified to
> sanely deal with out of memory conditions instead of panicking), but
> that doesn't seem to fix the problem.  So whatever it is, it's
> something else, although your patch is still a good one and I've
> commited it to the 2.4 ext3-dxdir BK tree.
> 
> 							- Ted
> 

Content-Description: gen-test-cdir
> #!/bin/sh
> #dd if=/dev/zero of=test.img bs=8k count=2000
> mke2fs -j -F -b 1024 -N 24000 test.img 
> mount -t ext3 -o loop test.img /mnt
> pushd /mnt
> mkdir test 
> cd test
> time seq -f "gabby-qf%f" 1 10000  | xargs touch
> popd
> umount /mnt
> e2fsck -fD test.img
> mount -t ext3 -o loop test.img /mnt
> pushd /mnt/test
> time seq -f "%f" 1 62  | xargs touch
> popd
> umount /mnt
> e2fsck -f test.img
> 

Content-Description: e2fsprogs-patch
> # This is a BitKeeper generated patch for the following project:
> # Project Name: Ext2 filesystem utilities
> # This patch format is intended for GNU patch command version 2.5 or higher.
> # This patch includes the following deltas:
> #	           ChangeSet	1.1020  -> 1.1021 
> #	      e2fsck/pass2.c	1.45    -> 1.46   
> #	    e2fsck/ChangeLog	1.281   -> 1.282  
> #	       e2fsck/unix.c	1.66    -> 1.67   
> #	     e2fsck/rehash.c	1.3     -> 1.4    
> #
> # The following is the BitKeeper ChangeSet Log
> # --------------------------------------------
> # 02/09/28	tytso@think.thunk.org	1.1021
> # Add a more sophisticated algorithm to e2fsck to salvage corrupted
> # directories.
> # 
> # Speed up e2fsck slightly by only updating the master superblock;
> # there is no point to update the backup superblocks.
> # 
> # Fix a small bug in the rehashing code which could leave the indexed
> # flag set even after the directory was compressed instead of indexed.
> # (Not fatal, since the kernel will deal with this, but technically
> # it filesystem isn't consistent, and the filesystem will be marked
> # as being in error when the kernel comes across the directory.  It
> # should also never happen in real life, since directories that small
> # will never be indexed, but better safe than sorry.)
> # 
> # Also change the threshold of when directories are indexed, so that
> # directories of size 2 blocks will be indexed.  Otherwise they will
> # never be indexed by the kernel when they grow.
> # --------------------------------------------
> #
> diff -Nru a/e2fsck/ChangeLog b/e2fsck/ChangeLog
> --- a/e2fsck/ChangeLog	Sat Sep 28 09:50:43 2002
> +++ b/e2fsck/ChangeLog	Sat Sep 28 09:50:43 2002
> @@ -1,3 +1,20 @@
> +2002-09-28  Theodore Ts'o  <tytso@mit.edu>
> +
> +	* rehash.c (write_directory): Clear the index flag if by
> +		reoptimizing the directory, we bring it back into a
> +		non-indexed state.
> +		(e2fsck_rehash_dir): Allow directories that contain two
> +		blocks to be indexed.  Otherwise when they grow, they
> +		never will be indexed by the kernel.
> +
> +	* unix.c (main): Only update the master superblock; there's no
> +		point updating the backup superblocks, and it speeds up
> +		fsck slightly.
> +
> +	* pass2.c (salvage_directory): New function called by
> +		check_dir_block() which is much more sophisticated about
> +		how it salvages corrupted filesystems.
> +
>  2001-09-24  Theodore Tso  <tytso@mit.edu>
>  
>  	* Release of E2fsprogs 1.29
> diff -Nru a/e2fsck/pass2.c b/e2fsck/pass2.c
> --- a/e2fsck/pass2.c	Sat Sep 28 09:50:43 2002
> +++ b/e2fsck/pass2.c	Sat Sep 28 09:50:43 2002
> @@ -574,6 +574,55 @@
>  }
>  #endif /* ENABLE_HTREE */
>  
> +/*
> + * Given a busted directory, try to salvage it somehow.
> + * 
> + */
> +static int salvage_directory(ext2_filsys fs,
> +			      struct ext2_dir_entry *dirent,
> +			      struct ext2_dir_entry *prev,
> +			      int offset)
> +{
> +	char	*cp = (char *) dirent;
> +	int left = fs->blocksize - offset - dirent->rec_len;
> +	int prev_offset = offset - ((char *) dirent - (char *) prev);
> +
> +	/*
> +	 * Special case of directory entry of size 8: copy what's left
> +	 * of the directory block up to cover up the invalid hole.
> +	 */
> +	if ((left >= 12) && (dirent->rec_len == 8)) {
> +		memmove(cp, cp+8, left);
> +		memset(cp + left, 0, 8);
> +		return offset;
> +	}
> +	/*
> +	 * If the directory entry is a multiple of four, so it is
> +	 * valid, let the previous directory entry absorb the invalid
> +	 * one. 
> +	 */
> +	if (prev && dirent->rec_len && (dirent->rec_len % 4) == 0) {
> +		prev->rec_len += dirent->rec_len;
> +		return prev_offset;
> +	}
> +	/*
> +	 * Default salvage method --- kill all of the directory
> +	 * entries for the rest of the block.  We will either try to
> +	 * absorb it into the previous directory entry, or create a
> +	 * new empty directory entry the rest of the directory block.
> +	 */
> +	if (prev) {
> +		prev->rec_len += fs->blocksize - offset;
> +		return prev_offset;
> +	} else {
> +		dirent->rec_len = fs->blocksize - offset;
> +		dirent->name_len = 0;
> +		dirent->inode = 0;
> +		return offset;
> +	}
> +	
> +}
> +
>  static int check_dir_block(ext2_filsys fs,
>  			   struct ext2_db_entry *db,
>  			   void *priv_data)
> @@ -583,7 +632,7 @@
>  #ifdef ENABLE_HTREE
>  	struct dx_dirblock_info	*dx_db = 0;
>  #endif /* ENABLE_HTREE */
> -	struct ext2_dir_entry 	*dirent;
> +	struct ext2_dir_entry 	*dirent, *prev;
>  	ext2_dirhash_t		hash;
>  	int			offset = 0;
>  	int			dir_modified = 0;
> @@ -680,8 +729,8 @@
>  	}
>  #endif /* ENABLE_HTREE */
>  
> +	prev = 0;
>  	do {
> -		dot_state++;
>  		problem = 0;
>  		dirent = (struct ext2_dir_entry *) (buf + offset);
>  		cd->pctx.dirent = dirent;
> @@ -691,10 +740,10 @@
>  		    ((dirent->rec_len % 4) != 0) ||
>  		    (((dirent->name_len & 0xFF)+8) > dirent->rec_len)) {
>  			if (fix_problem(ctx, PR_2_DIR_CORRUPTED, &cd->pctx)) {
> -				dirent->rec_len = fs->blocksize - offset;
> -				dirent->name_len = 0;
> -				dirent->inode = 0;
> +				offset = salvage_directory(fs, dirent,
> +							   prev, offset);
>  				dir_modified++;
> +				continue;
>  			} else
>  				return DIRENT_ABORT;
>  		}
> @@ -705,10 +754,10 @@
>  			}
>  		}
>  
> -		if (dot_state == 1) {
> +		if (dot_state == 0) {
>  			if (check_dot(ctx, dirent, ino, &cd->pctx))
>  				dir_modified++;
> -		} else if (dot_state == 2) {
> +		} else if (dot_state == 1) {
>  			dir = e2fsck_get_dir_info(ctx, ino);
>  			if (!dir) {
>  				fix_problem(ctx, PR_2_NO_DIRINFO, &cd->pctx);
> @@ -749,7 +798,7 @@
>  			 * clear it.
>  			 */
>  			problem = PR_2_BB_INODE;
> -		} else if ((dot_state > 2) &&
> +		} else if ((dot_state > 1) &&
>  			   ((dirent->name_len & 0xFF) == 1) &&
>  			   (dirent->name[0] == '.')) {
>  			/*
> @@ -758,7 +807,7 @@
>  			 * duplicate entry that should be removed.
>  			 */
>  			problem = PR_2_DUP_DOT;
> -		} else if ((dot_state > 2) &&
> +		} else if ((dot_state > 1) &&
>  			   ((dirent->name_len & 0xFF) == 2) &&
>  			   (dirent->name[0] == '.') && 
>  			   (dirent->name[1] == '.')) {
> @@ -768,7 +817,7 @@
>  			 * duplicate entry that should be removed.
>  			 */
>  			problem = PR_2_DUP_DOT_DOT;
> -		} else if ((dot_state > 2) &&
> +		} else if ((dot_state > 1) &&
>  			   (dirent->inode == EXT2_ROOT_INO)) {
>  			/*
>  			 * Don't allow links to the root directory.
> @@ -777,7 +826,7 @@
>  			 * directory hasn't been created yet.
>  			 */
>  			problem = PR_2_LINK_ROOT;
> -		} else if ((dot_state > 2) &&
> +		} else if ((dot_state > 1) &&
>  			   (dirent->name_len & 0xFF) == 0) {
>  			/*
>  			 * Don't allow zero-length directory names.
> @@ -842,7 +891,7 @@
>  		 * hard link.  We assume the first link is correct,
>  		 * and ask the user if he/she wants to clear this one.
>  		 */
> -		if ((dot_state > 2) &&
> +		if ((dot_state > 1) &&
>  		    (ext2fs_test_inode_bitmap(ctx->inode_dir_map,
>  					      dirent->inode))) {
>  			subdir = e2fsck_get_dir_info(ctx, dirent->inode);
> @@ -871,7 +920,9 @@
>  			ctx->fs_links_count++;
>  		ctx->fs_total_count++;
>  	next:
> +		prev = dirent;
>  		offset += dirent->rec_len;
> +		dot_state++;
>  	} while (offset < fs->blocksize);
>  #if 0
>  	printf("\n");
> diff -Nru a/e2fsck/rehash.c b/e2fsck/rehash.c
> --- a/e2fsck/rehash.c	Sat Sep 28 09:50:43 2002
> +++ b/e2fsck/rehash.c	Sat Sep 28 09:50:43 2002
> @@ -522,7 +522,9 @@
>  		return wd.err;
>  
>  	e2fsck_read_inode(ctx, ino, &inode, "rehash_dir");
> -	if (!compress)
> +	if (compress)
> +		inode.i_flags &= ~EXT2_INDEX_FL;
> +	else
>  		inode.i_flags |= EXT2_INDEX_FL;
>  	inode.i_size = outdir->num * fs->blocksize;
>  	inode.i_blocks -= (fs->blocksize / 512) * wd.cleared;
> @@ -561,7 +563,7 @@
>  	fd.dir_size = 0;
>  	fd.compress = 0;
>  	if (!(fs->super->s_feature_compat & EXT2_FEATURE_COMPAT_DIR_INDEX) ||
> -	    (inode.i_size / fs->blocksize) < 3)
> +	    (inode.i_size / fs->blocksize) < 2)
>  		fd.compress = 1;
>  	fd.parent = 0;
>  
> diff -Nru a/e2fsck/unix.c b/e2fsck/unix.c
> --- a/e2fsck/unix.c	Sat Sep 28 09:50:43 2002
> +++ b/e2fsck/unix.c	Sat Sep 28 09:50:43 2002
> @@ -974,8 +974,11 @@
>  		ext2fs_mark_super_dirty(fs);
>  
>  	/*
> -	 * Don't overwrite the backup superblock and block
> -	 * descriptors, until we're sure the filesystem is OK....
> +	 * We only update the master superblock because (a) paranoia;
> +	 * we don't want to corrupt the backup superblocks, and (b) we
> +	 * don't need to update the mount count and last checked
> +	 * fields in the backup superblock (the kernel doesn't
> +	 * update the backup superblocks anyway).
>  	 */
>  	fs->flags |= EXT2_FLAG_MASTER_SB_ONLY;
>  
> @@ -1058,9 +1061,7 @@
>  			exit_value |= FSCK_REBOOT;
>  		}
>  	}
> -	if (ext2fs_test_valid(fs))
> -		fs->flags &= ~EXT2_FLAG_MASTER_SB_ONLY;
> -	else {
> +	if (!ext2fs_test_valid(fs)) {
>  		printf(_("\n%s: ********** WARNING: Filesystem still has "
>  			 "errors **********\n\n"), ctx->device_name);
>  		exit_value |= FSCK_UNCORRECTED;


  parent reply	other threads:[~2002-09-29  6:59 UTC|newest]

Thread overview: 47+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2002-09-25 20:03 [BK PATCH] Add ext3 indexed directory (htree) support tytso
2002-09-25 20:34 ` Andreas Dilger
2002-09-25 20:41 ` Dave Jones
2002-09-25 21:08   ` Andreas Dilger
2002-09-25 21:34   ` Theodore Ts'o
2002-09-25 22:54 ` Jeff Garzik
2002-09-25 23:29   ` Theodore Ts'o
2002-09-25 23:45     ` Ryan Cumming
2002-09-26  3:27       ` Theodore Ts'o
2002-09-26  5:23         ` Ryan Cumming
2002-09-26  5:57           ` Theodore Ts'o
2002-09-26  6:22             ` Ryan Cumming
2002-09-26 14:05               ` Theodore Ts'o
2002-09-26  6:25             ` Ryan Cumming
2002-09-26 11:25               ` Daniel Egger
2002-09-26  7:41             ` Ryan Cumming
2002-09-26 13:23               ` Theodore Ts'o
2002-09-26 15:42               ` Theodore Ts'o
2002-09-26 19:08                 ` Ryan Cumming
2002-09-26 19:51                   ` Horst von Brand
2002-09-26 19:59                     ` Ryan Cumming
2002-09-26 22:04                   ` Theodore Ts'o
2002-09-26 22:53                     ` Ryan Cumming
2002-09-26 23:57                       ` Theodore Ts'o
2002-09-27  1:00                         ` Ryan Cumming
2002-09-27  3:24                           ` Theodore Ts'o
2002-09-27  4:12                         ` Andreas Dilger
2002-09-27  7:55                           ` Ryan Cumming
2002-09-28  1:20                           ` Ryan Cumming
2002-09-28  1:46                             ` Ryan Cumming
2002-09-28 14:13                             ` Theodore Ts'o
2002-09-28 14:18                               ` Theodore Ts'o
2002-09-28 22:35                                 ` Ryan Cumming
2002-09-28 17:27                               ` [Ext2-devel] " Andreas Dilger
2002-09-28 18:43                                 ` chrisl
2002-09-28 19:45                                 ` chrisl
2002-09-28 22:30                               ` Ryan Cumming
2002-09-29  7:03                               ` chrisl [this message]
2002-09-29  8:16                                 ` [PATCH] fix htree dir corrupt after fsck -fD Ryan Cumming
2002-09-29  8:36                                   ` Ryan Cumming
2002-09-30  2:46                                   ` Ryan Cumming
2002-09-29 14:13                                 ` Theodore Ts'o
2002-09-25 23:31 ` [BK PATCH] Add ext3 indexed directory (htree) support Daniel Egger
2002-09-26  0:32   ` Randy.Dunlap
2002-09-26  0:50 ` Aaron Lehmann
2002-09-26  3:28   ` Theodore Ts'o
2002-09-29 17:12 [PATCH] fix htree dir corrupt after fsck -fD Marc-Christian Petersen

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20020929070315.GA6876@vmware.com \
    --to=chrisl@gnuchina.org \
    --cc=adilger@clusterfs.com \
    --cc=ext2-devel@lists.sourceforge.net \
    --cc=linux-kernel@vger.kernel.org \
    --cc=ryan@completely.kicks-ass.org \
    --cc=tytso@mit.edu \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).