linux-ext4.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/7 v2] e2fsprogs: Better handling of indexed directories
@ 2020-02-13 10:15 Jan Kara
  2020-02-13 10:15 ` [PATCH 1/7] e2fsck: Clarify overflow link count error message Jan Kara
                   ` (6 more replies)
  0 siblings, 7 replies; 28+ messages in thread
From: Jan Kara @ 2020-02-13 10:15 UTC (permalink / raw)
  To: Ted Tso; +Cc: linux-ext4, Jan Kara

Hello,

Currently, libext2fs does not implement adding entry into htree directory. It
just bluntly clears EXT2_INDEX_FL and then treats the directory as non-indexed.
This breaks when metadata checksums are enabled and although ext2fs_link()
tries to fixup the directory, it doesn't always fixup all the checksums and
I have some doubts about practicality of just discarding htree information for
really large directories. This patch series implements full support for adding
entry into htree directory and some tests to test the functionality.

The first patch in the series is somewhat unrelated, it just clarifies handling
of overflown directory i_nlink handling in e2fsck which confused me initially
when analyzing the issue.

The second patch fixes a bug in e2fsck when rehashing indexed directories which
I've found during testing my series.

The third patch prepares ext2fs_mkdir() and ext2fs_symlink() to safely work
with ext2fs_link() - this demonstrates there's a breakage potential in the
following changes to ext2fs_link() for external applications using
ext2fs_link() because it can now modify the directory inode and allocate
blocks. If people are concerned about this, we could create ext2fs_link2()
with the new behavior and just restrict ext2fs_link() to bail with error
when called on indexed directory with metadata_csum enabled.

Next three patches implement the support for linking into indexed directories
and tests.

The last patch then fixes tune2fs to properly recompute directory checksums
when disabling dir_index feature.

Changes since v1:
* Fixed growing of h-tree to 3-levels
* Added e2fsck fix
* Added tune2fs fix
* Fixed ext2fs_mkdir() and ext2fs_symlink() to work with new ext2fs_link()
* Reworked tests

								Honza

^ permalink raw reply	[flat|nested] 28+ messages in thread

* [PATCH 1/7] e2fsck: Clarify overflow link count error message
  2020-02-13 10:15 [PATCH 0/7 v2] e2fsprogs: Better handling of indexed directories Jan Kara
@ 2020-02-13 10:15 ` Jan Kara
  2020-02-14 19:27   ` Andreas Dilger
  2020-03-07 18:52   ` Theodore Y. Ts'o
  2020-02-13 10:15 ` [PATCH 2/7] e2fsck: Fix indexed dir rehash failure with metadata_csum enabled Jan Kara
                   ` (5 subsequent siblings)
  6 siblings, 2 replies; 28+ messages in thread
From: Jan Kara @ 2020-02-13 10:15 UTC (permalink / raw)
  To: Ted Tso; +Cc: linux-ext4, Jan Kara

When directory link count is set to overflow value (1) but during pass 4
we find out the exact link count would fit, we either silently fix this
(which is not great because e2fsck then reports the fs was modified but
output doesn't indicate why in any way), or we report that link count is
wrong and ask whether we should fix it (in case -n option was
specified). The second case is even more misleading because it suggests
non-trivial fs corruption which then gets silently fixed on the next
run. Similarly to how we fix up other non-problems, just create a new
error message for the case directory link count is not overflown anymore
and always report it to clarify what is going on.

Signed-off-by: Jan Kara <jack@suse.cz>
---
 e2fsck/pass4.c   | 20 ++++++++++++++++----
 e2fsck/problem.c |  5 +++++
 e2fsck/problem.h |  3 +++
 3 files changed, 24 insertions(+), 4 deletions(-)

diff --git a/e2fsck/pass4.c b/e2fsck/pass4.c
index 10be7f87180d..8c2d2f1fca12 100644
--- a/e2fsck/pass4.c
+++ b/e2fsck/pass4.c
@@ -237,6 +237,8 @@ void e2fsck_pass4(e2fsck_t ctx)
 			link_counted = 1;
 		}
 		if (link_counted != link_count) {
+			int fix_nlink = 0;
+
 			e2fsck_read_inode_full(ctx, i, EXT2_INODE(inode),
 					       inode_size, "pass4");
 			pctx.ino = i;
@@ -250,10 +252,20 @@ void e2fsck_pass4(e2fsck_t ctx)
 			pctx.num = link_counted;
 			/* i_link_count was previously exceeded, but no longer
 			 * is, fix this but don't consider it an error */
-			if ((isdir && link_counted > 1 &&
-			     (inode->i_flags & EXT2_INDEX_FL) &&
-			     link_count == 1 && !(ctx->options & E2F_OPT_NO)) ||
-			    fix_problem(ctx, PR_4_BAD_REF_COUNT, &pctx)) {
+			if (isdir && link_counted > 1 &&
+			    (inode->i_flags & EXT2_INDEX_FL) &&
+			    link_count == 1) {
+				if ((ctx->options & E2F_OPT_READONLY) == 0) {
+					fix_nlink =
+						fix_problem(ctx,
+							PR_4_DIR_OVERFLOW_REF_COUNT,
+							&pctx);
+				}
+			} else {
+				fix_nlink = fix_problem(ctx, PR_4_BAD_REF_COUNT,
+						&pctx);
+			}
+			if (fix_nlink) {
 				inode->i_links_count = link_counted;
 				e2fsck_write_inode_full(ctx, i,
 							EXT2_INODE(inode),
diff --git a/e2fsck/problem.c b/e2fsck/problem.c
index c7c0ba986006..e79c853b2096 100644
--- a/e2fsck/problem.c
+++ b/e2fsck/problem.c
@@ -2035,6 +2035,11 @@ static struct e2fsck_problem problem_table[] = {
 	  N_("@d exceeds max links, but no DIR_NLINK feature in @S.\n"),
 	  PROMPT_FIX, 0, 0, 0, 0 },
 
+	/* Directory inode ref count set to overflow but could be exact value */
+	{ PR_4_DIR_OVERFLOW_REF_COUNT,
+	  N_("@d @i %i ref count set to overflow but could be exact value %N.  "),
+	  PROMPT_FIX, PR_PREEN_OK, 0, 0, 0 },
+
 	/* Pass 5 errors */
 
 	/* Pass 5: Checking group summary information */
diff --git a/e2fsck/problem.h b/e2fsck/problem.h
index c7f65f6dee0f..4185e5175cab 100644
--- a/e2fsck/problem.h
+++ b/e2fsck/problem.h
@@ -1164,6 +1164,9 @@ struct problem_context {
 /* directory exceeds max links, but no DIR_NLINK feature in superblock */
 #define PR_4_DIR_NLINK_FEATURE		0x040006
 
+/* Directory ref count set to overflow but it doesn't have to be */
+#define PR_4_DIR_OVERFLOW_REF_COUNT	0x040007
+
 /*
  * Pass 5 errors
  */
-- 
2.16.4


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH 2/7] e2fsck: Fix indexed dir rehash failure with metadata_csum enabled
  2020-02-13 10:15 [PATCH 0/7 v2] e2fsprogs: Better handling of indexed directories Jan Kara
  2020-02-13 10:15 ` [PATCH 1/7] e2fsck: Clarify overflow link count error message Jan Kara
@ 2020-02-13 10:15 ` Jan Kara
  2020-02-14 19:28   ` Andreas Dilger
  2020-03-07 23:17   ` Theodore Y. Ts'o
  2020-02-13 10:15 ` [PATCH 3/7] ext2fs: Update allocation info earlier in ext2fs_mkdir() and ext2fs_symlink() Jan Kara
                   ` (4 subsequent siblings)
  6 siblings, 2 replies; 28+ messages in thread
From: Jan Kara @ 2020-02-13 10:15 UTC (permalink / raw)
  To: Ted Tso; +Cc: linux-ext4, Jan Kara

E2fsck directory rehashing code can fail with ENOSPC due to a bug in
ext2fs_htree_intnode_maxrecs() which fails to take metadata checksum
into account and thus e.g. e2fsck can decide to create 1 indirect level
of index tree when two are actually needed. Fix the logic to account for
metadata checksum.

Signed-off-by: Jan Kara <jack@suse.cz>
---
 lib/ext2fs/ext2fs.h | 10 +++++++---
 1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/lib/ext2fs/ext2fs.h b/lib/ext2fs/ext2fs.h
index 93ecf29c568d..5fde3343b1f1 100644
--- a/lib/ext2fs/ext2fs.h
+++ b/lib/ext2fs/ext2fs.h
@@ -1783,7 +1783,6 @@ extern blk_t ext2fs_group_first_block(ext2_filsys fs, dgrp_t group);
 extern blk_t ext2fs_group_last_block(ext2_filsys fs, dgrp_t group);
 extern blk_t ext2fs_inode_data_blocks(ext2_filsys fs,
 				      struct ext2_inode *inode);
-extern int ext2fs_htree_intnode_maxrecs(ext2_filsys fs, int blocks);
 extern unsigned int ext2fs_div_ceil(unsigned int a, unsigned int b);
 extern __u64 ext2fs_div64_ceil(__u64 a, __u64 b);
 extern int ext2fs_dirent_name_len(const struct ext2_dir_entry *entry);
@@ -2015,9 +2014,14 @@ _INLINE_ blk_t ext2fs_inode_data_blocks(ext2_filsys fs,
 	return (blk_t) ext2fs_inode_data_blocks2(fs, inode);
 }
 
-_INLINE_ int ext2fs_htree_intnode_maxrecs(ext2_filsys fs, int blocks)
+static inline int ext2fs_htree_intnode_maxrecs(ext2_filsys fs, int blocks)
 {
-	return blocks * ((fs->blocksize - 8) / sizeof(struct ext2_dx_entry));
+	int csum_size = 0;
+
+	if (ext2fs_has_feature_metadata_csum(fs->super))
+		csum_size = sizeof(struct ext2_dx_tail);
+	return blocks * ((fs->blocksize - (8 + csum_size)) /
+						sizeof(struct ext2_dx_entry));
 }
 
 /*
-- 
2.16.4


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH 3/7] ext2fs: Update allocation info earlier in ext2fs_mkdir() and ext2fs_symlink()
  2020-02-13 10:15 [PATCH 0/7 v2] e2fsprogs: Better handling of indexed directories Jan Kara
  2020-02-13 10:15 ` [PATCH 1/7] e2fsck: Clarify overflow link count error message Jan Kara
  2020-02-13 10:15 ` [PATCH 2/7] e2fsck: Fix indexed dir rehash failure with metadata_csum enabled Jan Kara
@ 2020-02-13 10:15 ` Jan Kara
  2020-02-14 19:37   ` Andreas Dilger
  2020-03-08  0:02   ` Theodore Y. Ts'o
  2020-02-13 10:15 ` [PATCH 4/7] ext2fs: Implement dir entry creation in htree directories Jan Kara
                   ` (3 subsequent siblings)
  6 siblings, 2 replies; 28+ messages in thread
From: Jan Kara @ 2020-02-13 10:15 UTC (permalink / raw)
  To: Ted Tso; +Cc: linux-ext4, Jan Kara

Currently, ext2fs_mkdir() and ext2fs_symlink() update allocation bitmaps
and other information only close to the end of the function, in
particular after calling to ext2fs_link(). When ext2fs_link() will
support indexed directories, it will also need to allocate blocks and
that would cause filesystem corruption in case allocation info isn't
properly updated. So make sure ext2fs_mkdir() and ext2fs_symlink()
update allocation info before calling into ext2fs_link().

Signed-off-by: Jan Kara <jack@suse.cz>
---
 lib/ext2fs/mkdir.c   | 14 +++++++-------
 lib/ext2fs/symlink.c | 14 +++++++-------
 2 files changed, 14 insertions(+), 14 deletions(-)

diff --git a/lib/ext2fs/mkdir.c b/lib/ext2fs/mkdir.c
index 2a63aad16715..947003ebf309 100644
--- a/lib/ext2fs/mkdir.c
+++ b/lib/ext2fs/mkdir.c
@@ -143,6 +143,13 @@ errcode_t ext2fs_mkdir(ext2_filsys fs, ext2_ino_t parent, ext2_ino_t inum,
 		}
 	}
 
+	/*
+	 * Update accounting....
+	 */
+	if (!inline_data)
+		ext2fs_block_alloc_stats2(fs, blk, +1);
+	ext2fs_inode_alloc_stats2(fs, ino, +1, 1);
+
 	/*
 	 * Link the directory into the filesystem hierarchy
 	 */
@@ -175,13 +182,6 @@ errcode_t ext2fs_mkdir(ext2_filsys fs, ext2_ino_t parent, ext2_ino_t inum,
 			goto cleanup;
 	}
 
-	/*
-	 * Update accounting....
-	 */
-	if (!inline_data)
-		ext2fs_block_alloc_stats2(fs, blk, +1);
-	ext2fs_inode_alloc_stats2(fs, ino, +1, 1);
-
 cleanup:
 	if (block)
 		ext2fs_free_mem(&block);
diff --git a/lib/ext2fs/symlink.c b/lib/ext2fs/symlink.c
index 7f78c5f75549..3e07a539daf3 100644
--- a/lib/ext2fs/symlink.c
+++ b/lib/ext2fs/symlink.c
@@ -162,6 +162,13 @@ need_block:
 			goto cleanup;
 	}
 
+	/*
+	 * Update accounting....
+	 */
+	if (!fastlink && !inlinelink)
+		ext2fs_block_alloc_stats2(fs, blk, +1);
+	ext2fs_inode_alloc_stats2(fs, ino, +1, 0);
+
 	/*
 	 * Link the symlink into the filesystem hierarchy
 	 */
@@ -179,13 +186,6 @@ need_block:
 			goto cleanup;
 	}
 
-	/*
-	 * Update accounting....
-	 */
-	if (!fastlink && !inlinelink)
-		ext2fs_block_alloc_stats2(fs, blk, +1);
-	ext2fs_inode_alloc_stats2(fs, ino, +1, 0);
-
 cleanup:
 	if (block_buf)
 		ext2fs_free_mem(&block_buf);
-- 
2.16.4


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH 4/7] ext2fs: Implement dir entry creation in htree directories
  2020-02-13 10:15 [PATCH 0/7 v2] e2fsprogs: Better handling of indexed directories Jan Kara
                   ` (2 preceding siblings ...)
  2020-02-13 10:15 ` [PATCH 3/7] ext2fs: Update allocation info earlier in ext2fs_mkdir() and ext2fs_symlink() Jan Kara
@ 2020-02-13 10:15 ` Jan Kara
  2020-03-15 16:43   ` Theodore Y. Ts'o
  2020-02-13 10:16 ` [PATCH 5/7] tests: Modify f_large_dir test to excercise indexed dir handling Jan Kara
                   ` (2 subsequent siblings)
  6 siblings, 1 reply; 28+ messages in thread
From: Jan Kara @ 2020-02-13 10:15 UTC (permalink / raw)
  To: Ted Tso; +Cc: linux-ext4, Jan Kara

Implement proper creation of new directory entries in htree directories
in ext2fs_link(). So far we just cleared EXT2_INDEX_FL and treated
directory as unindexed however this results in mismatched checksums if
metadata checksums are in use because checksums are placed in different
places depending on htree node type.

Signed-off-by: Jan Kara <jack@suse.cz>
---
 lib/ext2fs/link.c | 549 ++++++++++++++++++++++++++++++++++++++++++++++++------
 1 file changed, 497 insertions(+), 52 deletions(-)

diff --git a/lib/ext2fs/link.c b/lib/ext2fs/link.c
index 65dc8877a5c7..6f523aee718c 100644
--- a/lib/ext2fs/link.c
+++ b/lib/ext2fs/link.c
@@ -18,6 +18,155 @@
 
 #include "ext2_fs.h"
 #include "ext2fs.h"
+#include "ext2fsP.h"
+
+#define EXT2_DX_ROOT_OFF 24
+
+struct dx_frame {
+	void *buf;
+	blk64_t pblock;
+	struct ext2_dx_countlimit *head;
+	struct ext2_dx_entry *entries;
+	struct ext2_dx_entry *at;
+};
+
+struct dx_lookup_info {
+	const char *name;
+	int namelen;
+	int hash_alg;
+	__u32 hash;
+	int levels;
+	struct dx_frame frames[EXT4_HTREE_LEVEL];
+};
+
+static errcode_t alloc_dx_frame(ext2_filsys fs, struct dx_frame *frame)
+{
+	return ext2fs_get_mem(fs->blocksize, &frame->buf);
+}
+
+static void dx_release(struct dx_lookup_info *info)
+{
+	struct ext2_dx_root_info *root;
+	int level;
+
+	for (level = 0; level < info->levels; level++) {
+		if (info->frames[level].buf == NULL)
+			break;
+		ext2fs_free_mem(&(info->frames[level].buf));
+	}
+	info->levels = 0;
+}
+
+static void dx_search_entry(struct dx_frame *frame, int count, __u32 hash)
+{
+	struct ext2_dx_entry *p, *q, *m;
+
+	p = frame->entries + 1;
+	q = frame->entries + count - 1;
+	while (p <= q) {
+		m = p + (q - p) / 2;
+		if (ext2fs_le32_to_cpu(m->hash) > hash)
+			q = m - 1;
+		else
+			p = m + 1;
+	}
+	frame->at = p - 1;
+}
+
+static errcode_t load_logical_dir_block(ext2_filsys fs, ext2_ino_t dir,
+					struct ext2_inode *diri, blk64_t block,
+					blk64_t *pblk, void *buf)
+{
+	errcode_t errcode;
+	int ret_flags;
+
+	errcode = ext2fs_bmap2(fs, dir, diri, NULL, 0, block, &ret_flags,
+			       pblk);
+	if (errcode)
+		return errcode;
+	if (ret_flags & BMAP_RET_UNINIT)
+		return EXT2_ET_DIR_CORRUPTED;
+	return ext2fs_read_dir_block4(fs, *pblk, buf, 0, dir);
+}
+
+static errcode_t dx_lookup(ext2_filsys fs, ext2_ino_t dir,
+			   struct ext2_inode *diri, struct dx_lookup_info *info)
+{
+	struct ext2_dx_root_info *root;
+	errcode_t errcode;
+	int level = 0;
+	int count, limit;
+	int hash_alg;
+	int hash_flags = diri->i_flags & EXT4_CASEFOLD_FL;
+	__u32 hash, minor_hash;
+	struct dx_frame *frame;
+
+	errcode = alloc_dx_frame(fs, &(info->frames[0]));
+	if (errcode)
+		return errcode;
+	info->levels = 1;
+
+	errcode = load_logical_dir_block(fs, dir, diri, 0,
+					 &(info->frames[0].pblock),
+					 info->frames[0].buf);
+	if (errcode)
+		goto out_err;
+	root = info->frames[0].buf + EXT2_DX_ROOT_OFF;
+	hash_alg = root->hash_version;
+	if (hash_alg != EXT2_HASH_TEA && hash_alg != EXT2_HASH_HALF_MD4 &&
+	    hash_alg != EXT2_HASH_LEGACY) {
+		errcode = EXT2_ET_DIRHASH_UNSUPP;
+		goto out_err;
+	}
+	if (hash_alg <= EXT2_HASH_TEA &&
+	    fs->super->s_flags & EXT2_FLAGS_UNSIGNED_HASH)
+		hash_alg += 3;
+	if (root->indirect_levels >= ext2_dir_htree_level(fs)) {
+		errcode = EXT2_ET_DIR_CORRUPTED;
+		goto out_err;
+	}
+	info->hash_alg = hash_alg;
+
+	errcode = ext2fs_dirhash2(hash_alg, info->name, info->namelen,
+				  fs->encoding, hash_flags,
+				  fs->super->s_hash_seed, &info->hash,
+				  &minor_hash);
+	if (errcode)
+		goto out_err;
+
+	for (level = 0; level <= root->indirect_levels; level++) {
+		frame = &(info->frames[level]);
+		if (level > 0) {
+			errcode = alloc_dx_frame(fs, frame);
+			if (errcode)
+				goto out_err;
+			info->levels++;
+
+			errcode = load_logical_dir_block(fs, dir, diri,
+				ext2fs_le32_to_cpu(info->frames[level-1].at->block) & 0x0fffffff,
+				&(frame->pblock), frame->buf);
+			if (errcode)
+				goto out_err;
+		}
+		errcode = ext2fs_get_dx_countlimit(fs, frame->buf,
+						   &(frame->head), NULL);
+		if (errcode)
+			goto out_err;
+		count = ext2fs_le16_to_cpu(frame->head->count);
+		limit = ext2fs_le16_to_cpu(frame->head->limit);
+		frame->entries = (struct ext2_dx_entry *)(frame->head);
+		if (!count || count > limit) {
+			errcode = EXT2_ET_DIR_CORRUPTED;
+			goto out_err;
+		}
+
+		dx_search_entry(frame, count, info->hash);
+	}
+	return 0;
+out_err:
+	dx_release(info);
+	return errcode;
+}
 
 struct link_struct  {
 	ext2_filsys	fs;
@@ -31,7 +180,9 @@ struct link_struct  {
 	struct ext2_super_block *sb;
 };
 
-static int link_proc(struct ext2_dir_entry *dirent,
+static int link_proc(ext2_ino_t dir EXT2FS_ATTR((unused)),
+		     int entru EXT2FS_ATTR((unused)),
+		     struct ext2_dir_entry *dirent,
 		     int	offset,
 		     int	blocksize,
 		     char	*buf,
@@ -70,40 +221,6 @@ static int link_proc(struct ext2_dir_entry *dirent,
 		ret = DIRENT_CHANGED;
 	}
 
-	/*
-	 * Since ext2fs_link blows away htree data, we need to be
-	 * careful -- if metadata_csum is enabled and we're passed in
-	 * a dirent that contains htree data, we need to create the
-	 * fake entry at the end of the block that hides the checksum.
-	 */
-
-	/* De-convert a dx_node block */
-	if (csum_size &&
-	    curr_rec_len == ls->fs->blocksize &&
-	    !dirent->inode) {
-		curr_rec_len -= csum_size;
-		ls->err = ext2fs_set_rec_len(ls->fs, curr_rec_len, dirent);
-		if (ls->err)
-			return DIRENT_ABORT;
-		t = EXT2_DIRENT_TAIL(buf, ls->fs->blocksize);
-		ext2fs_initialize_dirent_tail(ls->fs, t);
-		ret = DIRENT_CHANGED;
-	}
-
-	/* De-convert a dx_root block */
-	if (csum_size &&
-	    curr_rec_len == ls->fs->blocksize - EXT2_DIR_REC_LEN(1) &&
-	    offset == EXT2_DIR_REC_LEN(1) &&
-	    dirent->name[0] == '.' && dirent->name[1] == '.') {
-		curr_rec_len -= csum_size;
-		ls->err = ext2fs_set_rec_len(ls->fs, curr_rec_len, dirent);
-		if (ls->err)
-			return DIRENT_ABORT;
-		t = EXT2_DIRENT_TAIL(buf, ls->fs->blocksize);
-		ext2fs_initialize_dirent_tail(ls->fs, t);
-		ret = DIRENT_CHANGED;
-	}
-
 	/*
 	 * If the directory entry is used, see if we can split the
 	 * directory entry to make room for the new name.  If so,
@@ -144,6 +261,343 @@ static int link_proc(struct ext2_dir_entry *dirent,
 	return DIRENT_ABORT|DIRENT_CHANGED;
 }
 
+static errcode_t add_dirent_to_buf(ext2_filsys fs, e2_blkcnt_t blockcnt,
+				   char *buf, ext2_ino_t dir,
+				   struct ext2_inode *diri, const char *name,
+				   ext2_ino_t ino, int flags, blk64_t *pblkp)
+{
+	struct dir_context ctx;
+	struct link_struct ls;
+	errcode_t retval;
+
+	retval = load_logical_dir_block(fs, dir, diri, blockcnt, pblkp, buf);
+	if (retval)
+		return retval;
+	ctx.errcode = 0;
+	ctx.func = link_proc;
+	ctx.dir = dir;
+	ctx.flags = DIRENT_FLAG_INCLUDE_EMPTY;
+	ctx.buf = buf;
+	ctx.priv_data = &ls;
+
+	ls.fs = fs;
+	ls.name = name;
+	ls.namelen = strlen(name);
+	ls.inode = ino;
+	ls.flags = flags;
+	ls.done = 0;
+	ls.sb = fs->super;
+	ls.blocksize = fs->blocksize;
+	ls.err = 0;
+
+	ext2fs_process_dir_block(fs, pblkp, blockcnt, 0, 0, &ctx);
+	if (ctx.errcode)
+		return ctx.errcode;
+	if (ls.err)
+		return ls.err;
+	if (!ls.done)
+		return EXT2_ET_DIR_NO_SPACE;
+	return 0;
+}
+
+struct dx_hash_map {
+	__u32 hash;
+	int size;
+	int off;
+};
+
+static EXT2_QSORT_TYPE dx_hash_map_cmp(const void *ap, const void *bp)
+{
+	const struct dx_hash_map *a = ap, *b = bp;
+
+	if (a->hash < b->hash)
+		return -1;
+	if (a->hash > b->hash)
+		return 1;
+	return 0;
+}
+
+static errcode_t dx_move_dirents(ext2_filsys fs, struct dx_hash_map *map,
+				 int count, void *from, void *to)
+{
+	struct ext2_dir_entry *de;
+	int i;
+	int rec_len;
+	errcode_t retval;
+	int csum_size = 0;
+	void *base = to;
+
+	if (ext2fs_has_feature_metadata_csum(fs->super))
+		csum_size = sizeof(struct ext2_dir_entry_tail);
+
+	for (i = 0; i < count; i++) {
+		de = from + map[i].off;
+		rec_len = EXT2_DIR_REC_LEN(ext2fs_dirent_name_len(de));
+		memcpy(to, de, rec_len);
+		retval = ext2fs_set_rec_len(fs, rec_len, to);
+		if (retval)
+			return retval;
+		to += rec_len;
+	}
+	/*
+	 * Update rec_len of the last dir entry to stretch to the end of block
+	 */
+	to -= rec_len;
+	rec_len = fs->blocksize - (to - base) - csum_size;
+	retval = ext2fs_set_rec_len(fs, rec_len, to);
+	if (retval)
+		return retval;
+	if (csum_size)
+		ext2fs_initialize_dirent_tail(fs,
+				EXT2_DIRENT_TAIL(base, fs->blocksize));
+	return 0;
+}
+
+static errcode_t dx_insert_entry(ext2_filsys fs, ext2_ino_t dir,
+				 struct dx_lookup_info *info, int level,
+				 __u32 hash, blk64_t lblk)
+{
+	int pcount;
+	struct ext2_dx_entry *top, *new;
+
+	pcount = ext2fs_le16_to_cpu(info->frames[level].head->count);
+	top = info->frames[level].entries + pcount;
+	new = info->frames[level].at + 1;
+	memmove(new + 1, new, (char *)top - (char *)new);
+	new->hash = ext2fs_cpu_to_le32(hash);
+	new->block = ext2fs_cpu_to_le32(lblk);
+	info->frames[level].head->count = ext2fs_cpu_to_le16(pcount + 1);
+	return ext2fs_write_dir_block4(fs, info->frames[level].pblock,
+				       info->frames[level].buf, 0, dir);
+}
+
+static errcode_t dx_split_leaf(ext2_filsys fs, ext2_ino_t dir,
+			       struct ext2_inode *diri,
+			       struct dx_lookup_info *info, void *buf,
+			       blk64_t leaf_pblk, blk64_t new_lblk,
+			       blk64_t new_pblk)
+{
+	int hash_flags = diri->i_flags & EXT4_CASEFOLD_FL;
+	struct ext2_dir_entry *de;
+	void *buf2;
+	errcode_t retval = 0;
+	int rec_len;
+	int offset, move_size;
+	int i, count = 0;
+	struct dx_hash_map *map;
+	int continued;
+	__u32 minor_hash;
+
+	retval = ext2fs_get_mem(fs->blocksize, &buf2);
+	if (retval)
+		return retval;
+	retval = ext2fs_get_array(fs->blocksize / 12,
+				  sizeof(struct dx_hash_map), &map);
+	if (retval) {
+		ext2fs_free_mem(&buf2);
+		return retval;
+	}
+	for (offset = 0; offset < fs->blocksize; offset += rec_len) {
+		de = buf + offset;
+		retval = ext2fs_get_rec_len(fs, de, &rec_len);
+		if (retval)
+			goto out;
+		if (ext2fs_dirent_name_len(de) > 0 && de->inode) {
+			map[count].off = offset;
+			map[count].size = rec_len;
+			retval = ext2fs_dirhash2(info->hash_alg, de->name,
+					ext2fs_dirent_name_len(de),
+					fs->encoding, hash_flags,
+					fs->super->s_hash_seed,
+					&(map[count].hash),
+					&minor_hash);
+			if (retval)
+				goto out;
+			count++;
+		}
+	}
+	qsort(map, count, sizeof(struct dx_hash_map), dx_hash_map_cmp);
+	move_size = 0;
+	/* Find place to split block */
+	for (i = count - 1; i >= 0; i--) {
+		if (move_size + map[i].size / 2 > fs->blocksize / 2)
+			break;
+		move_size += map[i].size;
+	}
+	/* Let i be the first entry to move */
+	i++;
+	/* Move selected directory entries to new block */
+	retval = dx_move_dirents(fs, map + i, count - i, buf, buf2);
+	if (retval)
+		goto out;
+	retval = ext2fs_write_dir_block4(fs, new_pblk, buf2, 0, dir);
+	if (retval)
+		goto out;
+	/* Repack remaining entries in the old block */
+	retval = dx_move_dirents(fs, map, i, buf, buf2);
+	if (retval)
+		goto out;
+	retval = ext2fs_write_dir_block4(fs, leaf_pblk, buf2, 0, dir);
+	if (retval)
+		goto out;
+	/* Update parent node */
+	continued = map[i].hash == map[i-1].hash;
+	retval = dx_insert_entry(fs, dir, info, info->levels - 1,
+				 map[i].hash + continued, new_lblk);
+out:
+	ext2fs_free_mem(&buf2);
+	ext2fs_free_mem(&map);
+	return retval;
+}
+
+static errcode_t dx_grow_tree(ext2_filsys fs, ext2_ino_t dir,
+			      struct ext2_inode *diri,
+			      struct dx_lookup_info *info, void *buf,
+			      blk64_t leaf_pblk)
+{
+	int i;
+	errcode_t retval;
+	ext2_off64_t size = EXT2_I_SIZE(diri);
+	blk64_t lblk, pblk;
+	struct ext2_dir_entry *de;
+	struct ext2_dx_countlimit *head;
+	int csum_size = 0;
+	int count;
+
+	if (ext2fs_has_feature_metadata_csum(fs->super))
+		csum_size = sizeof(struct ext2_dx_tail);
+
+	/* Find level which can accommodate new child */
+	for (i = info->levels - 1; i >= 0; i--)
+		if (ext2fs_le16_to_cpu(info->frames[i].head->count) <
+		    ext2fs_le16_to_cpu(info->frames[i].head->limit))
+			break;
+	/* Need to grow tree depth? */
+	if (i < 0 && info->levels > ext2_dir_htree_level(fs))
+		return EXT2_ET_DIR_NO_SPACE;
+	lblk = size / fs->blocksize;
+	size += fs->blocksize;
+	retval = ext2fs_inode_size_set(fs, diri, size);
+	if (retval)
+		return retval;
+	retval = ext2fs_fallocate(fs,
+			EXT2_FALLOCATE_FORCE_INIT | EXT2_FALLOCATE_ZERO_BLOCKS,
+			dir, diri, 0, lblk, 1);
+	if (retval)
+		return retval;
+	retval = ext2fs_write_inode(fs, dir, diri);
+	if (retval)
+		return retval;
+	retval = ext2fs_bmap2(fs, dir, diri, NULL, 0, lblk, NULL, &pblk);
+	if (retval)
+		return retval;
+	/* Only leaf addition needed? */
+	if (i == info->levels - 1)
+		return dx_split_leaf(fs, dir, diri, info, buf, leaf_pblk,
+				     lblk, pblk);
+
+	de = buf;
+	de->inode = 0;
+	ext2fs_dirent_set_name_len(de, 0);
+	ext2fs_dirent_set_file_type(de, 0);
+	retval = ext2fs_set_rec_len(fs, fs->blocksize, de);
+	if (retval)
+		return retval;
+	head = buf + 8;
+	count = ext2fs_le16_to_cpu(info->frames[i+1].head->count);
+	/* Growing tree depth? */
+	if (i < 0) {
+		struct ext2_dx_root_info *root;
+
+		memcpy(head, info->frames[0].entries,
+		       count * sizeof(struct ext2_dx_entry));
+		head->limit = ext2fs_cpu_to_le16(
+				(fs->blocksize - (8 + csum_size)) /
+				sizeof(struct ext2_dx_entry));
+		/* head->count gets set by memcpy above to correct value */
+
+		/* Now update tree root */
+		info->frames[0].head->count = ext2fs_cpu_to_le16(1);
+		info->frames[0].entries[0].block = ext2fs_cpu_to_le32(lblk);
+		root = info->frames[0].buf + EXT2_DX_ROOT_OFF;
+		root->indirect_levels++;
+	} else {
+		/* Splitting internal node in two */
+		int count1 = count / 2;
+		int count2 = count - count1;
+		__u32 split_hash = ext2fs_le32_to_cpu(info->frames[i+1].entries[count1].hash);
+
+		memcpy(head, info->frames[i+1].entries + count1,
+		       count2 * sizeof(struct ext2_dx_entry));
+		head->count = ext2fs_cpu_to_le16(count2);
+		head->limit = ext2fs_cpu_to_le16(
+				(fs->blocksize - (8 + csum_size)) /
+				sizeof(struct ext2_dx_entry));
+		info->frames[i+1].head->count = ext2fs_cpu_to_le16(count1);
+
+		/* Update parent node */
+		retval = dx_insert_entry(fs, dir, info, i, split_hash, lblk);
+		if (retval)
+			return retval;
+
+	}
+	/* Writeout split block / updated root */
+	retval = ext2fs_write_dir_block4(fs, info->frames[i+1].pblock,
+					 info->frames[i+1].buf, 0, dir);
+	if (retval)
+		return retval;
+	/* Writeout new tree block */
+	retval = ext2fs_write_dir_block4(fs, pblk, buf, 0, dir);
+	if (retval)
+		return retval;
+	return 0;
+}
+
+static errcode_t dx_link(ext2_filsys fs, ext2_ino_t dir,
+			 struct ext2_inode *diri, const char *name,
+			 ext2_ino_t ino, int flags)
+{
+	struct dx_lookup_info dx_info;
+	errcode_t retval;
+	void *blockbuf;
+	int restart = 0;
+	blk64_t leaf_pblk;
+
+	retval = ext2fs_get_mem(fs->blocksize, &blockbuf);
+	if (retval)
+		return retval;
+
+	dx_info.name = name;
+	dx_info.namelen = strlen(name);
+again:
+	retval = dx_lookup(fs, dir, diri, &dx_info);
+	if (retval < 0)
+		goto free_buf;
+
+	retval = add_dirent_to_buf(fs,
+		ext2fs_le32_to_cpu(dx_info.frames[dx_info.levels-1].at->block) & 0x0fffffff,
+		blockbuf, dir, diri, name, ino, flags, &leaf_pblk);
+	/*
+	 * Success or error other than ENOSPC...? We are done. We may need upto
+	 * two tries to add entry. One to split htree node and another to add
+	 * new leaf block.
+	 */
+	if (restart >= dx_info.levels || retval != EXT2_ET_DIR_NO_SPACE)
+		goto free_frames;
+	retval = dx_grow_tree(fs, dir, diri, &dx_info, blockbuf, leaf_pblk);
+	if (retval)
+		goto free_frames;
+	/* Restart everything now that the tree is larger */
+	restart++;
+	dx_release(&dx_info);
+	goto again;
+free_frames:
+	dx_release(&dx_info);
+free_buf:
+	ext2fs_free_mem(&blockbuf);
+	return retval;
+}
+
 /*
  * Note: the low 3 bits of the flags field are used as the directory
  * entry filetype.
@@ -163,6 +617,12 @@ errcode_t ext2fs_link(ext2_filsys fs, ext2_ino_t dir, const char *name,
 	if (!(fs->flags & EXT2_FLAG_RW))
 		return EXT2_ET_RO_FILSYS;
 
+	if ((retval = ext2fs_read_inode(fs, dir, &inode)) != 0)
+		return retval;
+
+	if (inode.i_flags & EXT2_INDEX_FL)
+		return dx_link(fs, dir, &inode, name, ino, flags);
+
 	ls.fs = fs;
 	ls.name = name;
 	ls.namelen = name ? strlen(name) : 0;
@@ -173,8 +633,8 @@ errcode_t ext2fs_link(ext2_filsys fs, ext2_ino_t dir, const char *name,
 	ls.blocksize = fs->blocksize;
 	ls.err = 0;
 
-	retval = ext2fs_dir_iterate(fs, dir, DIRENT_FLAG_INCLUDE_EMPTY,
-				    0, link_proc, &ls);
+	retval = ext2fs_dir_iterate2(fs, dir, DIRENT_FLAG_INCLUDE_EMPTY,
+				     NULL, link_proc, &ls);
 	if (retval)
 		return retval;
 	if (ls.err)
@@ -182,20 +642,5 @@ errcode_t ext2fs_link(ext2_filsys fs, ext2_ino_t dir, const char *name,
 
 	if (!ls.done)
 		return EXT2_ET_DIR_NO_SPACE;
-
-	if ((retval = ext2fs_read_inode(fs, dir, &inode)) != 0)
-		return retval;
-
-	/*
-	 * If this function changes to preserve the htree, remove the
-	 * two hunks in link_proc that shove checksum tails into the
-	 * former dx_root/dx_node blocks.
-	 */
-	if (inode.i_flags & EXT2_INDEX_FL) {
-		inode.i_flags &= ~EXT2_INDEX_FL;
-		if ((retval = ext2fs_write_inode(fs, dir, &inode)) != 0)
-			return retval;
-	}
-
 	return 0;
 }
-- 
2.16.4


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH 5/7] tests: Modify f_large_dir test to excercise indexed dir handling
  2020-02-13 10:15 [PATCH 0/7 v2] e2fsprogs: Better handling of indexed directories Jan Kara
                   ` (3 preceding siblings ...)
  2020-02-13 10:15 ` [PATCH 4/7] ext2fs: Implement dir entry creation in htree directories Jan Kara
@ 2020-02-13 10:16 ` Jan Kara
  2020-02-18 20:29   ` Andreas Dilger
  2020-03-15 16:43   ` Theodore Y. Ts'o
  2020-02-13 10:16 ` [PATCH 6/7] tests: Add test to excercise indexed directories with metadata_csum Jan Kara
  2020-02-13 10:16 ` [PATCH 7/7] tune2fs: Update dir checksums when clearing dir_index feature Jan Kara
  6 siblings, 2 replies; 28+ messages in thread
From: Jan Kara @ 2020-02-13 10:16 UTC (permalink / raw)
  To: Ted Tso; +Cc: linux-ext4, Jan Kara

Modify f_large_dir test to create indexed directory and create entries
in it. That way the new code in ext2fs_link() for addition of entries
into indexed directories gets executed including various special cases
when growing htree.

Signed-off-by: Jan Kara <jack@suse.cz>
---
 tests/f_large_dir/expect | 10 ++++++++++
 tests/f_large_dir/script | 27 ++++++++++++++++++++++-----
 2 files changed, 32 insertions(+), 5 deletions(-)

diff --git a/tests/f_large_dir/expect b/tests/f_large_dir/expect
index 8f7d99dc1ee7..028234cc6bb5 100644
--- a/tests/f_large_dir/expect
+++ b/tests/f_large_dir/expect
@@ -6,6 +6,16 @@ Allocating group tables:      \b\b\b\b\bdone
 Writing inode tables:      \b\b\b\b\bdone                            
 Writing superblocks and filesystem accounting information:      \b\b\b\b\bdone
 
+Pass 1: Checking inodes, blocks, and sizes
+Pass 2: Checking directory structure
+Pass 3: Checking directory connectivity
+Pass 3A: Optimizing directories
+Pass 4: Checking reference counts
+Pass 5: Checking group summary information
+
+test.img: ***** FILE SYSTEM WAS MODIFIED *****
+test.img: 17/65072 files (5.9% non-contiguous), 9732/108341 blocks
+Exit status is 0
 Pass 1: Checking inodes, blocks, and sizes
 Pass 2: Checking directory structure
 Pass 3: Checking directory connectivity
diff --git a/tests/f_large_dir/script b/tests/f_large_dir/script
index 9af042ca6ca8..e3235836f997 100644
--- a/tests/f_large_dir/script
+++ b/tests/f_large_dir/script
@@ -26,17 +26,33 @@ $MKE2FS -b 1024 -O large_dir,uninit_bg -N $((ENTRIES + 50)) \
 RC=$?
 if [ $RC -eq 0 ]; then
 {
-	START=$SECONDS
+	# First some initial fs setup to create indexed dir
 	echo "mkdir /foo"
 	echo "cd /foo"
 	touch $TMPFILE.tmp
 	echo "write $TMPFILE.tmp foofile"
 	i=0
-	last=0
+	while test $i -lt $DIRENT_PER_LEAF ; do
+		printf "mkdir d%0254u\n" $i
+		i=$((i + 1));
+	done
+	echo "expand ./"
+	printf "mkdir d%0254u\n" $i
+} | $DEBUGFS -w $TMPFILE > /dev/null 2>> $OUT.new
+	RC=$?
+	# e2fsck should optimize the dir to become indexed
+	$E2FSCK -yfD $TMPFILE >> $OUT.new 2>&1
+	status=$?
+	echo Exit status is $status >> $OUT.new
+fi
+
+if [ $RC -eq 0 ]; then
+{
+	START=$SECONDS
+	i=$(($DIRENT_PER_LEAF+1))
+	last=$i
+	echo "cd /foo"
 	while test $i -lt $ENTRIES ; do
-	    if test $((i % DIRENT_PER_LEAF)) -eq 0; then
-	    	echo "expand ./"
-	    fi
 	    ELAPSED=$((SECONDS - START))
 	    if test $((i % 5000)) -eq 0 -a $ELAPSED -gt 10; then
 		RATE=$(((i - last) / ELAPSED))
@@ -54,6 +70,7 @@ if [ $RC -eq 0 ]; then
 } | $DEBUGFS -w $TMPFILE > /dev/null 2>> $OUT.new
 	RC=$?
 fi
+
 if [ $RC -eq 0 ]; then
 	$E2FSCK -yfD $TMPFILE >> $OUT.new 2>&1
 	status=$?
-- 
2.16.4


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH 6/7] tests: Add test to excercise indexed directories with metadata_csum
  2020-02-13 10:15 [PATCH 0/7 v2] e2fsprogs: Better handling of indexed directories Jan Kara
                   ` (4 preceding siblings ...)
  2020-02-13 10:16 ` [PATCH 5/7] tests: Modify f_large_dir test to excercise indexed dir handling Jan Kara
@ 2020-02-13 10:16 ` Jan Kara
  2020-02-18 20:34   ` Andreas Dilger
  2020-02-13 10:16 ` [PATCH 7/7] tune2fs: Update dir checksums when clearing dir_index feature Jan Kara
  6 siblings, 1 reply; 28+ messages in thread
From: Jan Kara @ 2020-02-13 10:16 UTC (permalink / raw)
  To: Ted Tso; +Cc: linux-ext4, Jan Kara

Indexed directories have somewhat different format when metadata_csum is
enabled. Add test to excercise linking in indexed directories and e2fsck
rehash code in this case.

Signed-off-by: Jan Kara <jack@suse.cz>
---
 tests/f_large_dir_csum/expect       | 32 ++++++++++++++
 tests/f_large_dir_csum/is_slow_test |  0
 tests/f_large_dir_csum/name         |  1 +
 tests/f_large_dir_csum/script       | 84 +++++++++++++++++++++++++++++++++++++
 4 files changed, 117 insertions(+)
 create mode 100644 tests/f_large_dir_csum/expect
 create mode 100644 tests/f_large_dir_csum/is_slow_test
 create mode 100644 tests/f_large_dir_csum/name
 create mode 100644 tests/f_large_dir_csum/script

diff --git a/tests/f_large_dir_csum/expect b/tests/f_large_dir_csum/expect
new file mode 100644
index 000000000000..aa9f33f1d25d
--- /dev/null
+++ b/tests/f_large_dir_csum/expect
@@ -0,0 +1,32 @@
+Creating filesystem with 31002 1k blocks and 64 inodes
+Superblock backups stored on blocks: 
+	8193, 24577
+
+Allocating group tables:    \b\b\bdone                            
+Writing inode tables:    \b\b\bdone                            
+Writing superblocks and filesystem accounting information:    \b\b\bdone
+
+Pass 1: Checking inodes, blocks, and sizes
+Pass 2: Checking directory structure
+Pass 3: Checking directory connectivity
+Pass 3A: Optimizing directories
+Pass 4: Checking reference counts
+Inode 13 ref count is 1, should be 5.  Fix? yes
+
+Pass 5: Checking group summary information
+
+test.img: ***** FILE SYSTEM WAS MODIFIED *****
+test.img: 13/64 files (0.0% non-contiguous), 766/31002 blocks
+Exit status is 1
+Pass 1: Checking inodes, blocks, and sizes
+Pass 2: Checking directory structure
+Pass 3: Checking directory connectivity
+Pass 3A: Optimizing directories
+Pass 4: Checking reference counts
+Inode 13 ref count is 5, should be 46504.  Fix? yes
+
+Pass 5: Checking group summary information
+
+test.img: ***** FILE SYSTEM WAS MODIFIED *****
+test.img: 13/64 files (0.0% non-contiguous), 16390/31002 blocks
+Exit status is 1
diff --git a/tests/f_large_dir_csum/is_slow_test b/tests/f_large_dir_csum/is_slow_test
new file mode 100644
index 000000000000..e69de29bb2d1
diff --git a/tests/f_large_dir_csum/name b/tests/f_large_dir_csum/name
new file mode 100644
index 000000000000..2b37c8c21f79
--- /dev/null
+++ b/tests/f_large_dir_csum/name
@@ -0,0 +1 @@
+optimize 3 level htree directories with metadata checksums
diff --git a/tests/f_large_dir_csum/script b/tests/f_large_dir_csum/script
new file mode 100644
index 000000000000..286a965d5e6a
--- /dev/null
+++ b/tests/f_large_dir_csum/script
@@ -0,0 +1,84 @@
+OUT=$test_name.log
+EXP=$test_dir/expect
+E2FSCK=../e2fsck/e2fsck
+
+NAMELEN=255
+DIRENT_SZ=8
+BLOCKSZ=1024
+INODESZ=128
+CSUM_SZ=8
+CSUM_TAIL_SZ=12
+DIRENT_PER_LEAF=$(((BLOCKSZ - CSUM_TAIL_SZ) / (NAMELEN + DIRENT_SZ)))
+HEADER=32
+INDEX_SZ=8
+INDEX_L1=$(((BLOCKSZ - HEADER - CSUM_SZ) / INDEX_SZ))
+INDEX_L2=$(((BLOCKSZ - DIRENT_SZ - CSUM_SZ) / INDEX_SZ))
+DIRBLK=$((3 + INDEX_L1 * INDEX_L2))
+ENTRIES=$((DIRBLK * DIRENT_PER_LEAF))
+# directory leaf blocks - get twice as much because the leaves won't be full
+# and there are also other filesystem blocks.
+FSIZE=$((DIRBLK * 2))
+
+$MKE2FS -b 1024 -O extents,64bit,large_dir,uninit_bg,metadata_csum -N 50 \
+	-I $INODESZ -F $TMPFILE $FSIZE > $OUT.new 2>&1
+RC=$?
+if [ $RC -eq 0 ]; then
+{
+	# First some initial fs setup to create indexed dir
+	echo "mkdir /foo"
+	echo "cd /foo"
+	touch $TMPFILE.tmp
+	echo "write $TMPFILE.tmp foofile"
+	i=0
+	while test $i -lt $DIRENT_PER_LEAF ; do
+		printf "ln foofile f%0254u\n" $i
+		i=$((i + 1));
+	done
+	echo "expand ./"
+	printf "ln foofile f%0254u\n" $i
+} | $DEBUGFS -w $TMPFILE > /dev/null 2>> $OUT.new
+	RC=$?
+	# e2fsck should optimize the dir to become indexed
+	$E2FSCK -yfD $TMPFILE >> $OUT.new 2>&1
+	status=$?
+	echo Exit status is $status >> $OUT.new
+fi
+
+if [ $RC -eq 0 ]; then
+{
+	START=$SECONDS
+	i=$(($DIRENT_PER_LEAF+1))
+	last=$i
+	echo "cd /foo"
+	while test $i -lt $ENTRIES ; do
+	    ELAPSED=$((SECONDS - START))
+	    if test $((i % 5000)) -eq 0 -a $ELAPSED -gt 10; then
+		RATE=$(((i - last) / ELAPSED))
+		echo "$test_name: $i/$ENTRIES links, ${ELAPSED}s @ $RATE/s" >&2
+		START=$SECONDS
+		last=$i
+	    fi
+	    printf "ln foofile f%0254u\n" $i
+	    i=$((i + 1))
+	done
+} | $DEBUGFS -w $TMPFILE > /dev/null 2>> $OUT.new
+	RC=$?
+fi
+
+if [ $RC -eq 0 ]; then
+	$E2FSCK -yfD $TMPFILE >> $OUT.new 2>&1
+	status=$?
+	echo Exit status is $status >> $OUT.new
+	sed -f $cmd_dir/filter.sed -e "s;$TMPFILE;test.img;" $OUT.new > $OUT
+	rm -f $OUT.new
+
+	cmp -s $OUT $EXP
+	RC=$?
+fi
+if [ $RC -eq 0 ]; then
+	echo "$test_name: $test_description: ok"
+	touch $test_name.ok
+else
+	echo "$test_name: $test_description: failed"
+	diff -u $EXP $OUT > $test_name.failed
+fi
-- 
2.16.4


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH 7/7] tune2fs: Update dir checksums when clearing dir_index feature
  2020-02-13 10:15 [PATCH 0/7 v2] e2fsprogs: Better handling of indexed directories Jan Kara
                   ` (5 preceding siblings ...)
  2020-02-13 10:16 ` [PATCH 6/7] tests: Add test to excercise indexed directories with metadata_csum Jan Kara
@ 2020-02-13 10:16 ` Jan Kara
  2020-02-18 20:50   ` Andreas Dilger
  2020-03-15 17:15   ` Theodore Y. Ts'o
  6 siblings, 2 replies; 28+ messages in thread
From: Jan Kara @ 2020-02-13 10:16 UTC (permalink / raw)
  To: Ted Tso; +Cc: linux-ext4, Jan Kara

When clearing dir_index feature while metadata_csum is enabled, we have
to rewrite checksums of all indexed directories to update checksums of
internal tree nodes.

Signed-off-by: Jan Kara <jack@suse.cz>
---
 misc/tune2fs.c | 143 ++++++++++++++++++++++++++++++++++++++-------------------
 1 file changed, 95 insertions(+), 48 deletions(-)

diff --git a/misc/tune2fs.c b/misc/tune2fs.c
index a0448f63d1d5..254246fd858b 100644
--- a/misc/tune2fs.c
+++ b/misc/tune2fs.c
@@ -508,7 +508,8 @@ struct rewrite_dir_context {
 	char *buf;
 	errcode_t errcode;
 	ext2_ino_t dir;
-	int is_htree;
+	int is_htree:1;
+	int clear_htree:1;
 };
 
 static int rewrite_dir_block(ext2_filsys fs,
@@ -527,8 +528,13 @@ static int rewrite_dir_block(ext2_filsys fs,
 	if (ctx->errcode)
 		return BLOCK_ABORT;
 
-	/* if htree node... */
-	if (ctx->is_htree)
+	/*
+	 * if htree node... Note that if we are clearing htree structures from
+	 * the directory, we treat the htree internal block as an ordinary leaf.
+	 * The code below will do the right thing and make space for checksum
+	 * there.
+	 */
+	if (ctx->is_htree && !ctx->clear_htree)
 		ext2fs_get_dx_countlimit(fs, (struct ext2_dir_entry *)ctx->buf,
 					 &dcl, &dcl_offset);
 	if (dcl) {
@@ -657,7 +663,8 @@ static errcode_t rewrite_directory(ext2_filsys fs, ext2_ino_t dir,
 	if (retval)
 		return retval;
 
-	ctx.is_htree = (inode->i_flags & EXT2_INDEX_FL);
+	ctx.is_htree = !!(inode->i_flags & EXT2_INDEX_FL);
+	ctx.clear_htree = !ext2fs_has_feature_dir_index(fs->super);
 	ctx.dir = dir;
 	ctx.errcode = 0;
 	retval = ext2fs_block_iterate3(fs, dir, BLOCK_FLAG_READ_ONLY |
@@ -668,6 +675,13 @@ static errcode_t rewrite_directory(ext2_filsys fs, ext2_ino_t dir,
 	if (retval)
 		return retval;
 
+	if (ctx.is_htree && ctx.clear_htree) {
+		inode->i_flags &= ~EXT2_INDEX_FL;
+		retval = ext2fs_write_inode(fs, dir, inode);
+		if (retval)
+			return retval;
+	}
+
 	return ctx.errcode;
 }
 
@@ -822,28 +836,67 @@ static void rewrite_one_inode(struct rewrite_context *ctx, ext2_ino_t ino,
 		fatal_err(retval, "while rewriting extended attribute");
 }
 
-/*
- * Forcibly set checksums in all inodes.
- */
-static void rewrite_inodes(ext2_filsys fs)
+#define REWRITE_EA_FL		0x01	/* Rewrite EA inodes */
+#define REWRITE_DIR_FL		0x02	/* Rewrite directories */
+#define REWRITE_NONDIR_FL	0x04	/* Rewrite other inodes */
+#define REWRITE_ALL (REWRITE_EA_FL | REWRITE_DIR_FL | REWRITE_NONDIR_FL)
+
+static void rewrite_inodes_pass(struct rewrite_context *ctx, unsigned int flags)
 {
 	ext2_inode_scan	scan;
 	errcode_t	retval;
 	ext2_ino_t	ino;
 	struct ext2_inode *inode;
-	int pass;
+	int rewrite;
+
+	retval = ext2fs_get_mem(ctx->inode_size, &inode);
+	if (retval)
+		fatal_err(retval, "while allocating memory");
+
+	retval = ext2fs_open_inode_scan(ctx->fs, 0, &scan);
+	if (retval)
+		fatal_err(retval, "while opening inode scan");
+
+	do {
+		retval = ext2fs_get_next_inode_full(scan, &ino, inode,
+						    ctx->inode_size);
+		if (retval)
+			fatal_err(retval, "while getting next inode");
+		if (!ino)
+			break;
+
+		rewrite = 0;
+		if (inode->i_flags & EXT4_EA_INODE_FL) {
+			if (flags & REWRITE_EA_FL)
+				rewrite = 1;
+		} else if (LINUX_S_ISDIR(inode->i_mode)) {
+			if (flags & REWRITE_DIR_FL)
+				rewrite = 1;
+		} else {
+			if (flags & REWRITE_NONDIR_FL)
+				rewrite = 1;
+		}
+		if (rewrite)
+			rewrite_one_inode(ctx, ino, inode);
+	} while (ino);
+	ext2fs_close_inode_scan(scan);
+	ext2fs_free_mem(&inode);
+}
+
+/*
+ * Forcibly rewrite checksums in inodes specified by 'flags'
+ */
+static void rewrite_inodes(ext2_filsys fs, unsigned int flags)
+{
 	struct rewrite_context ctx = {
 		.fs = fs,
 		.inode_size = EXT2_INODE_SIZE(fs->super),
 	};
+	errcode_t retval;
 
 	if (fs->super->s_creator_os == EXT2_OS_HURD)
 		return;
 
-	retval = ext2fs_get_mem(ctx.inode_size, &inode);
-	if (retval)
-		fatal_err(retval, "while allocating memory");
-
 	retval = ext2fs_get_memzero(ctx.inode_size, &ctx.zero_inode);
 	if (retval)
 		fatal_err(retval, "while allocating memory");
@@ -862,39 +915,16 @@ static void rewrite_inodes(ext2_filsys fs)
 	 *
 	 * pass 2: go over other inodes to update their checksums.
 	 */
-	if (ext2fs_has_feature_ea_inode(fs->super))
-		pass = 1;
-	else
-		pass = 2;
-	for (;pass <= 2; pass++) {
-		retval = ext2fs_open_inode_scan(fs, 0, &scan);
-		if (retval)
-			fatal_err(retval, "while opening inode scan");
-
-		do {
-			retval = ext2fs_get_next_inode_full(scan, &ino, inode,
-							    ctx.inode_size);
-			if (retval)
-				fatal_err(retval, "while getting next inode");
-			if (!ino)
-				break;
-
-			if (((pass == 1) &&
-			     (inode->i_flags & EXT4_EA_INODE_FL)) ||
-			    ((pass == 2) &&
-			     !(inode->i_flags & EXT4_EA_INODE_FL)))
-				rewrite_one_inode(&ctx, ino, inode);
-		} while (ino);
-
-		ext2fs_close_inode_scan(scan);
-	}
+	if (ext2fs_has_feature_ea_inode(fs->super) && (flags & REWRITE_EA_FL))
+		rewrite_inodes_pass(&ctx, REWRITE_EA_FL);
+	flags &= ~REWRITE_EA_FL;
+	rewrite_inodes_pass(&ctx, flags);
 
 	ext2fs_free_mem(&ctx.zero_inode);
 	ext2fs_free_mem(&ctx.ea_buf);
-	ext2fs_free_mem(&inode);
 }
 
-static void rewrite_metadata_checksums(ext2_filsys fs)
+static void rewrite_metadata_checksums(ext2_filsys fs, unsigned int flags)
 {
 	errcode_t retval;
 	dgrp_t i;
@@ -906,7 +936,7 @@ static void rewrite_metadata_checksums(ext2_filsys fs)
 	retval = ext2fs_read_bitmaps(fs);
 	if (retval)
 		fatal_err(retval, "while reading bitmaps");
-	rewrite_inodes(fs);
+	rewrite_inodes(fs, flags);
 	ext2fs_mark_ib_dirty(fs);
 	ext2fs_mark_bb_dirty(fs);
 	ext2fs_mmp_update2(fs, 1);
@@ -1205,6 +1235,23 @@ mmp_error:
 			uuid_generate((unsigned char *) sb->s_hash_seed);
 	}
 
+	if (FEATURE_OFF(E2P_FEATURE_COMPAT, EXT2_FEATURE_COMPAT_DIR_INDEX) &&
+	    ext2fs_has_feature_metadata_csum(sb)) {
+		check_fsck_needed(fs,
+			_("Disabling directory index on filesystem with "
+			  "checksums could take some time."));
+		if (mount_flags & EXT2_MF_MOUNTED) {
+			fputs(_("Cannot disable dir_index on a mounted "
+				"filesystem!\n"), stderr);
+			exit(1);
+		}
+		/*
+		 * Clearing dir_index on checksummed filesystem requires
+		 * rewriting all directories to update checksums.
+		 */
+		rewrite_checksums |= REWRITE_DIR_FL;
+	}
+
 	if (FEATURE_OFF(E2P_FEATURE_INCOMPAT, EXT4_FEATURE_INCOMPAT_FLEX_BG)) {
 		if (ext2fs_check_desc(fs)) {
 			fputs(_("Clearing the flex_bg flag would "
@@ -1248,7 +1295,7 @@ mmp_error:
 				 "The larger fields afforded by this feature "
 				 "enable full-strength checksumming.  "
 				 "Run resize2fs -b to rectify.\n"));
-		rewrite_checksums = 1;
+		rewrite_checksums = REWRITE_ALL;
 		/* metadata_csum supersedes uninit_bg */
 		ext2fs_clear_feature_gdt_csum(fs->super);
 
@@ -1276,7 +1323,7 @@ mmp_error:
 				"filesystem!\n"), stderr);
 			exit(1);
 		}
-		rewrite_checksums = 1;
+		rewrite_checksums = REWRITE_ALL;
 
 		/* Enable uninit_bg unless the user expressly turned it off */
 		memcpy(test_features, old_features, sizeof(test_features));
@@ -1458,7 +1505,7 @@ mmp_error:
 			}
 			check_fsck_needed(fs, _("Recalculating checksums "
 						"could take some time."));
-			rewrite_checksums = 1;
+			rewrite_checksums = REWRITE_ALL;
 		}
 	}
 
@@ -3196,7 +3243,7 @@ _("Warning: The journal is dirty. You may wish to replay the journal like:\n\n"
 			check_fsck_needed(fs,
 				_("Setting the UUID on this "
 				  "filesystem could take some time."));
-			rewrite_checksums = 1;
+			rewrite_checksums = REWRITE_ALL;
 		}
 
 		if (ext2fs_has_group_desc_csum(fs)) {
@@ -3307,7 +3354,7 @@ _("Warning: The journal is dirty. You may wish to replay the journal like:\n\n"
 		if (retval == 0) {
 			printf(_("Setting inode size %lu\n"),
 							new_inode_size);
-			rewrite_checksums = 1;
+			rewrite_checksums = REWRITE_ALL;
 		} else {
 			printf("%s", _("Failed to change inode size\n"));
 			rc = 1;
@@ -3316,7 +3363,7 @@ _("Warning: The journal is dirty. You may wish to replay the journal like:\n\n"
 	}
 
 	if (rewrite_checksums)
-		rewrite_metadata_checksums(fs);
+		rewrite_metadata_checksums(fs, rewrite_checksums);
 
 	if (l_flag)
 		list_super(sb);
-- 
2.16.4


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* Re: [PATCH 1/7] e2fsck: Clarify overflow link count error message
  2020-02-13 10:15 ` [PATCH 1/7] e2fsck: Clarify overflow link count error message Jan Kara
@ 2020-02-14 19:27   ` Andreas Dilger
  2020-03-07 18:52   ` Theodore Y. Ts'o
  1 sibling, 0 replies; 28+ messages in thread
From: Andreas Dilger @ 2020-02-14 19:27 UTC (permalink / raw)
  To: Jan Kara; +Cc: Ted Tso, linux-ext4

[-- Attachment #1: Type: text/plain, Size: 3580 bytes --]

On Feb 13, 2020, at 3:15 AM, Jan Kara <jack@suse.cz> wrote:
> 
> When directory link count is set to overflow value (1) but during pass 4
> we find out the exact link count would fit, we either silently fix this
> (which is not great because e2fsck then reports the fs was modified but
> output doesn't indicate why in any way), or we report that link count is
> wrong and ask whether we should fix it (in case -n option was
> specified). The second case is even more misleading because it suggests
> non-trivial fs corruption which then gets silently fixed on the next
> run. Similarly to how we fix up other non-problems, just create a new
> error message for the case directory link count is not overflown anymore
> and always report it to clarify what is going on.
> 
> Signed-off-by: Jan Kara <jack@suse.cz>

Reviewed-by: Andreas Dilger <adilger@dilger.ca>

> ---
> e2fsck/pass4.c   | 20 ++++++++++++++++----
> e2fsck/problem.c |  5 +++++
> e2fsck/problem.h |  3 +++
> 3 files changed, 24 insertions(+), 4 deletions(-)
> 
> diff --git a/e2fsck/pass4.c b/e2fsck/pass4.c
> index 10be7f87180d..8c2d2f1fca12 100644
> --- a/e2fsck/pass4.c
> +++ b/e2fsck/pass4.c
> @@ -237,6 +237,8 @@ void e2fsck_pass4(e2fsck_t ctx)
> 			link_counted = 1;
> 		}
> 		if (link_counted != link_count) {
> +			int fix_nlink = 0;
> +
> 			e2fsck_read_inode_full(ctx, i, EXT2_INODE(inode),
> 					       inode_size, "pass4");
> 			pctx.ino = i;
> @@ -250,10 +252,20 @@ void e2fsck_pass4(e2fsck_t ctx)
> 			pctx.num = link_counted;
> 			/* i_link_count was previously exceeded, but no longer
> 			 * is, fix this but don't consider it an error */
> -			if ((isdir && link_counted > 1 &&
> -			     (inode->i_flags & EXT2_INDEX_FL) &&
> -			     link_count == 1 && !(ctx->options & E2F_OPT_NO)) ||
> -			    fix_problem(ctx, PR_4_BAD_REF_COUNT, &pctx)) {
> +			if (isdir && link_counted > 1 &&
> +			    (inode->i_flags & EXT2_INDEX_FL) &&
> +			    link_count == 1) {
> +				if ((ctx->options & E2F_OPT_READONLY) == 0) {
> +					fix_nlink =
> +						fix_problem(ctx,
> +							PR_4_DIR_OVERFLOW_REF_COUNT,
> +							&pctx);
> +				}
> +			} else {
> +				fix_nlink = fix_problem(ctx, PR_4_BAD_REF_COUNT,
> +						&pctx);
> +			}
> +			if (fix_nlink) {
> 				inode->i_links_count = link_counted;
> 				e2fsck_write_inode_full(ctx, i,
> 							EXT2_INODE(inode),
> diff --git a/e2fsck/problem.c b/e2fsck/problem.c
> index c7c0ba986006..e79c853b2096 100644
> --- a/e2fsck/problem.c
> +++ b/e2fsck/problem.c
> @@ -2035,6 +2035,11 @@ static struct e2fsck_problem problem_table[] = {
> 	  N_("@d exceeds max links, but no DIR_NLINK feature in @S.\n"),
> 	  PROMPT_FIX, 0, 0, 0, 0 },
> 
> +	/* Directory inode ref count set to overflow but could be exact value */
> +	{ PR_4_DIR_OVERFLOW_REF_COUNT,
> +	  N_("@d @i %i ref count set to overflow but could be exact value %N.  "),
> +	  PROMPT_FIX, PR_PREEN_OK, 0, 0, 0 },
> +
> 	/* Pass 5 errors */
> 
> 	/* Pass 5: Checking group summary information */
> diff --git a/e2fsck/problem.h b/e2fsck/problem.h
> index c7f65f6dee0f..4185e5175cab 100644
> --- a/e2fsck/problem.h
> +++ b/e2fsck/problem.h
> @@ -1164,6 +1164,9 @@ struct problem_context {
> /* directory exceeds max links, but no DIR_NLINK feature in superblock */
> #define PR_4_DIR_NLINK_FEATURE		0x040006
> 
> +/* Directory ref count set to overflow but it doesn't have to be */
> +#define PR_4_DIR_OVERFLOW_REF_COUNT	0x040007
> +
> /*
>  * Pass 5 errors
>  */
> --
> 2.16.4
> 


Cheers, Andreas






[-- Attachment #2: Message signed with OpenPGP --]
[-- Type: application/pgp-signature, Size: 873 bytes --]

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 2/7] e2fsck: Fix indexed dir rehash failure with metadata_csum enabled
  2020-02-13 10:15 ` [PATCH 2/7] e2fsck: Fix indexed dir rehash failure with metadata_csum enabled Jan Kara
@ 2020-02-14 19:28   ` Andreas Dilger
  2020-03-07 23:17   ` Theodore Y. Ts'o
  1 sibling, 0 replies; 28+ messages in thread
From: Andreas Dilger @ 2020-02-14 19:28 UTC (permalink / raw)
  To: Jan Kara; +Cc: Ted Tso, linux-ext4

[-- Attachment #1: Type: text/plain, Size: 1951 bytes --]

On Feb 13, 2020, at 3:15 AM, Jan Kara <jack@suse.cz> wrote:
> 
> E2fsck directory rehashing code can fail with ENOSPC due to a bug in
> ext2fs_htree_intnode_maxrecs() which fails to take metadata checksum
> into account and thus e.g. e2fsck can decide to create 1 indirect level
> of index tree when two are actually needed. Fix the logic to account for
> metadata checksum.
> 
> Signed-off-by: Jan Kara <jack@suse.cz>

Reviewed-by: Andreas Dilger <adilger@dilger.ca>

> ---
> lib/ext2fs/ext2fs.h | 10 +++++++---
> 1 file changed, 7 insertions(+), 3 deletions(-)
> 
> diff --git a/lib/ext2fs/ext2fs.h b/lib/ext2fs/ext2fs.h
> index 93ecf29c568d..5fde3343b1f1 100644
> --- a/lib/ext2fs/ext2fs.h
> +++ b/lib/ext2fs/ext2fs.h
> @@ -1783,7 +1783,6 @@ extern blk_t ext2fs_group_first_block(ext2_filsys fs, dgrp_t group);
> extern blk_t ext2fs_group_last_block(ext2_filsys fs, dgrp_t group);
> extern blk_t ext2fs_inode_data_blocks(ext2_filsys fs,
> 				      struct ext2_inode *inode);
> -extern int ext2fs_htree_intnode_maxrecs(ext2_filsys fs, int blocks);
> extern unsigned int ext2fs_div_ceil(unsigned int a, unsigned int b);
> extern __u64 ext2fs_div64_ceil(__u64 a, __u64 b);
> extern int ext2fs_dirent_name_len(const struct ext2_dir_entry *entry);
> @@ -2015,9 +2014,14 @@ _INLINE_ blk_t ext2fs_inode_data_blocks(ext2_filsys fs,
> 	return (blk_t) ext2fs_inode_data_blocks2(fs, inode);
> }
> 
> -_INLINE_ int ext2fs_htree_intnode_maxrecs(ext2_filsys fs, int blocks)
> +static inline int ext2fs_htree_intnode_maxrecs(ext2_filsys fs, int blocks)
> {
> -	return blocks * ((fs->blocksize - 8) / sizeof(struct ext2_dx_entry));
> +	int csum_size = 0;
> +
> +	if (ext2fs_has_feature_metadata_csum(fs->super))
> +		csum_size = sizeof(struct ext2_dx_tail);
> +	return blocks * ((fs->blocksize - (8 + csum_size)) /
> +						sizeof(struct ext2_dx_entry));
> }
> 
> /*
> --
> 2.16.4
> 


Cheers, Andreas






[-- Attachment #2: Message signed with OpenPGP --]
[-- Type: application/pgp-signature, Size: 873 bytes --]

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 3/7] ext2fs: Update allocation info earlier in ext2fs_mkdir() and ext2fs_symlink()
  2020-02-13 10:15 ` [PATCH 3/7] ext2fs: Update allocation info earlier in ext2fs_mkdir() and ext2fs_symlink() Jan Kara
@ 2020-02-14 19:37   ` Andreas Dilger
  2020-03-08  0:02   ` Theodore Y. Ts'o
  1 sibling, 0 replies; 28+ messages in thread
From: Andreas Dilger @ 2020-02-14 19:37 UTC (permalink / raw)
  To: Jan Kara; +Cc: Ted Tso, linux-ext4

[-- Attachment #1: Type: text/plain, Size: 2987 bytes --]

On Feb 13, 2020, at 3:15 AM, Jan Kara <jack@suse.cz> wrote:
> 
> Currently, ext2fs_mkdir() and ext2fs_symlink() update allocation bitmaps
> and other information only close to the end of the function, in
> particular after calling to ext2fs_link(). When ext2fs_link() will
> support indexed directories, it will also need to allocate blocks and
> that would cause filesystem corruption in case allocation info isn't
> properly updated. So make sure ext2fs_mkdir() and ext2fs_symlink()
> update allocation info before calling into ext2fs_link().
> 
> Signed-off-by: Jan Kara <jack@suse.cz>

I was wondering if this was done at the end of the function to avoid the
need to undo it if there was an error in the middle of the operation?
I suppose the worst that would happen in that case is an extra bit set
in the block bitmap until the next e2fsck, which is a relatively safe
side-effect...  I'm not sure whether e2fsck would abort anyway in the
case either of these functions return an error?

In any case, this is better than what is there currently.

Reviewed-by: Andreas Dilger <adilger@dilger.ca>


> ---
> lib/ext2fs/mkdir.c   | 14 +++++++-------
> lib/ext2fs/symlink.c | 14 +++++++-------
> 2 files changed, 14 insertions(+), 14 deletions(-)
> 
> diff --git a/lib/ext2fs/mkdir.c b/lib/ext2fs/mkdir.c
> index 2a63aad16715..947003ebf309 100644
> --- a/lib/ext2fs/mkdir.c
> +++ b/lib/ext2fs/mkdir.c
> @@ -143,6 +143,13 @@ errcode_t ext2fs_mkdir(ext2_filsys fs, ext2_ino_t parent, ext2_ino_t inum,
> 		}
> 	}
> 
> +	/*
> +	 * Update accounting....
> +	 */
> +	if (!inline_data)
> +		ext2fs_block_alloc_stats2(fs, blk, +1);
> +	ext2fs_inode_alloc_stats2(fs, ino, +1, 1);
> +
> 	/*
> 	 * Link the directory into the filesystem hierarchy
> 	 */
> @@ -175,13 +182,6 @@ errcode_t ext2fs_mkdir(ext2_filsys fs, ext2_ino_t parent, ext2_ino_t inum,
> 			goto cleanup;
> 	}
> 
> -	/*
> -	 * Update accounting....
> -	 */
> -	if (!inline_data)
> -		ext2fs_block_alloc_stats2(fs, blk, +1);
> -	ext2fs_inode_alloc_stats2(fs, ino, +1, 1);
> -
> cleanup:
> 	if (block)
> 		ext2fs_free_mem(&block);
> diff --git a/lib/ext2fs/symlink.c b/lib/ext2fs/symlink.c
> index 7f78c5f75549..3e07a539daf3 100644
> --- a/lib/ext2fs/symlink.c
> +++ b/lib/ext2fs/symlink.c
> @@ -162,6 +162,13 @@ need_block:
> 			goto cleanup;
> 	}
> 
> +	/*
> +	 * Update accounting....
> +	 */
> +	if (!fastlink && !inlinelink)
> +		ext2fs_block_alloc_stats2(fs, blk, +1);
> +	ext2fs_inode_alloc_stats2(fs, ino, +1, 0);
> +
> 	/*
> 	 * Link the symlink into the filesystem hierarchy
> 	 */
> @@ -179,13 +186,6 @@ need_block:
> 			goto cleanup;
> 	}
> 
> -	/*
> -	 * Update accounting....
> -	 */
> -	if (!fastlink && !inlinelink)
> -		ext2fs_block_alloc_stats2(fs, blk, +1);
> -	ext2fs_inode_alloc_stats2(fs, ino, +1, 0);
> -
> cleanup:
> 	if (block_buf)
> 		ext2fs_free_mem(&block_buf);
> --
> 2.16.4
> 


Cheers, Andreas






[-- Attachment #2: Message signed with OpenPGP --]
[-- Type: application/pgp-signature, Size: 873 bytes --]

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 5/7] tests: Modify f_large_dir test to excercise indexed dir handling
  2020-02-13 10:16 ` [PATCH 5/7] tests: Modify f_large_dir test to excercise indexed dir handling Jan Kara
@ 2020-02-18 20:29   ` Andreas Dilger
  2020-03-15 16:43   ` Theodore Y. Ts'o
  1 sibling, 0 replies; 28+ messages in thread
From: Andreas Dilger @ 2020-02-18 20:29 UTC (permalink / raw)
  To: Jan Kara; +Cc: Ted Tso, linux-ext4

[-- Attachment #1: Type: text/plain, Size: 3003 bytes --]

On Feb 13, 2020, at 3:16 AM, Jan Kara <jack@suse.cz> wrote:
> 
> Modify f_large_dir test to create indexed directory and create entries
> in it. That way the new code in ext2fs_link() for addition of entries
> into indexed directories gets executed including various special cases
> when growing htree.
> 
> Signed-off-by: Jan Kara <jack@suse.cz>

Reviewed-by: Andreas Dilger <adilger@dilger.ca>

> ---
> tests/f_large_dir/expect | 10 ++++++++++
> tests/f_large_dir/script | 27 ++++++++++++++++++++++-----
> 2 files changed, 32 insertions(+), 5 deletions(-)
> 
> diff --git a/tests/f_large_dir/expect b/tests/f_large_dir/expect
> index 8f7d99dc1ee7..028234cc6bb5 100644
> --- a/tests/f_large_dir/expect
> +++ b/tests/f_large_dir/expect
> @@ -6,6 +6,16 @@ Allocating group tables:      \b\b\b\b\bdone
> Writing inode tables:      \b\b\b\b\bdone
> Writing superblocks and filesystem accounting information:      \b\b\b\b\bdone
> 
> +Pass 1: Checking inodes, blocks, and sizes
> +Pass 2: Checking directory structure
> +Pass 3: Checking directory connectivity
> +Pass 3A: Optimizing directories
> +Pass 4: Checking reference counts
> +Pass 5: Checking group summary information
> +
> +test.img: ***** FILE SYSTEM WAS MODIFIED *****
> +test.img: 17/65072 files (5.9% non-contiguous), 9732/108341 blocks
> +Exit status is 0
> Pass 1: Checking inodes, blocks, and sizes
> Pass 2: Checking directory structure
> Pass 3: Checking directory connectivity
> diff --git a/tests/f_large_dir/script b/tests/f_large_dir/script
> index 9af042ca6ca8..e3235836f997 100644
> --- a/tests/f_large_dir/script
> +++ b/tests/f_large_dir/script
> @@ -26,17 +26,33 @@ $MKE2FS -b 1024 -O large_dir,uninit_bg -N $((ENTRIES + 50)) \
> RC=$?
> if [ $RC -eq 0 ]; then
> {
> -	START=$SECONDS
> +	# First some initial fs setup to create indexed dir
> 	echo "mkdir /foo"
> 	echo "cd /foo"
> 	touch $TMPFILE.tmp
> 	echo "write $TMPFILE.tmp foofile"
> 	i=0
> -	last=0
> +	while test $i -lt $DIRENT_PER_LEAF ; do
> +		printf "mkdir d%0254u\n" $i
> +		i=$((i + 1));
> +	done
> +	echo "expand ./"
> +	printf "mkdir d%0254u\n" $i
> +} | $DEBUGFS -w $TMPFILE > /dev/null 2>> $OUT.new
> +	RC=$?
> +	# e2fsck should optimize the dir to become indexed
> +	$E2FSCK -yfD $TMPFILE >> $OUT.new 2>&1
> +	status=$?
> +	echo Exit status is $status >> $OUT.new
> +fi
> +
> +if [ $RC -eq 0 ]; then
> +{
> +	START=$SECONDS
> +	i=$(($DIRENT_PER_LEAF+1))
> +	last=$i
> +	echo "cd /foo"
> 	while test $i -lt $ENTRIES ; do
> -	    if test $((i % DIRENT_PER_LEAF)) -eq 0; then
> -	    	echo "expand ./"
> -	    fi
> 	    ELAPSED=$((SECONDS - START))
> 	    if test $((i % 5000)) -eq 0 -a $ELAPSED -gt 10; then
> 		RATE=$(((i - last) / ELAPSED))
> @@ -54,6 +70,7 @@ if [ $RC -eq 0 ]; then
> } | $DEBUGFS -w $TMPFILE > /dev/null 2>> $OUT.new
> 	RC=$?
> fi
> +
> if [ $RC -eq 0 ]; then
> 	$E2FSCK -yfD $TMPFILE >> $OUT.new 2>&1
> 	status=$?
> --
> 2.16.4
> 


Cheers, Andreas






[-- Attachment #2: Message signed with OpenPGP --]
[-- Type: application/pgp-signature, Size: 873 bytes --]

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 6/7] tests: Add test to excercise indexed directories with metadata_csum
  2020-02-13 10:16 ` [PATCH 6/7] tests: Add test to excercise indexed directories with metadata_csum Jan Kara
@ 2020-02-18 20:34   ` Andreas Dilger
  0 siblings, 0 replies; 28+ messages in thread
From: Andreas Dilger @ 2020-02-18 20:34 UTC (permalink / raw)
  To: Jan Kara; +Cc: Ted Tso, linux-ext4

[-- Attachment #1: Type: text/plain, Size: 5363 bytes --]

On Feb 13, 2020, at 3:16 AM, Jan Kara <jack@suse.cz> wrote:
> 
> Indexed directories have somewhat different format when metadata_csum is
> enabled. Add test to excercise linking in indexed directories and e2fsck
> rehash code in this case.
> 
> Signed-off-by: Jan Kara <jack@suse.cz>

Reviewed-by: Andreas Dilger <adilger@dilger.ca>

> ---
> tests/f_large_dir_csum/expect       | 32 ++++++++++++++
> tests/f_large_dir_csum/is_slow_test |  0
> tests/f_large_dir_csum/name         |  1 +
> tests/f_large_dir_csum/script       | 84 +++++++++++++++++++++++++++++++++++++
> 4 files changed, 117 insertions(+)
> create mode 100644 tests/f_large_dir_csum/expect
> create mode 100644 tests/f_large_dir_csum/is_slow_test
> create mode 100644 tests/f_large_dir_csum/name
> create mode 100644 tests/f_large_dir_csum/script
> 
> diff --git a/tests/f_large_dir_csum/expect b/tests/f_large_dir_csum/expect
> new file mode 100644
> index 000000000000..aa9f33f1d25d
> --- /dev/null
> +++ b/tests/f_large_dir_csum/expect
> @@ -0,0 +1,32 @@
> +Creating filesystem with 31002 1k blocks and 64 inodes
> +Superblock backups stored on blocks:
> +	8193, 24577
> +
> +Allocating group tables:    \b\b\bdone
> +Writing inode tables:    \b\b\bdone
> +Writing superblocks and filesystem accounting information:    \b\b\bdone
> +
> +Pass 1: Checking inodes, blocks, and sizes
> +Pass 2: Checking directory structure
> +Pass 3: Checking directory connectivity
> +Pass 3A: Optimizing directories
> +Pass 4: Checking reference counts
> +Inode 13 ref count is 1, should be 5.  Fix? yes
> +
> +Pass 5: Checking group summary information
> +
> +test.img: ***** FILE SYSTEM WAS MODIFIED *****
> +test.img: 13/64 files (0.0% non-contiguous), 766/31002 blocks
> +Exit status is 1
> +Pass 1: Checking inodes, blocks, and sizes
> +Pass 2: Checking directory structure
> +Pass 3: Checking directory connectivity
> +Pass 3A: Optimizing directories
> +Pass 4: Checking reference counts
> +Inode 13 ref count is 5, should be 46504.  Fix? yes
> +
> +Pass 5: Checking group summary information
> +
> +test.img: ***** FILE SYSTEM WAS MODIFIED *****
> +test.img: 13/64 files (0.0% non-contiguous), 16390/31002 blocks
> +Exit status is 1
> diff --git a/tests/f_large_dir_csum/is_slow_test b/tests/f_large_dir_csum/is_slow_test
> new file mode 100644
> index 000000000000..e69de29bb2d1
> diff --git a/tests/f_large_dir_csum/name b/tests/f_large_dir_csum/name
> new file mode 100644
> index 000000000000..2b37c8c21f79
> --- /dev/null
> +++ b/tests/f_large_dir_csum/name
> @@ -0,0 +1 @@
> +optimize 3 level htree directories with metadata checksums
> diff --git a/tests/f_large_dir_csum/script b/tests/f_large_dir_csum/script
> new file mode 100644
> index 000000000000..286a965d5e6a
> --- /dev/null
> +++ b/tests/f_large_dir_csum/script
> @@ -0,0 +1,84 @@
> +OUT=$test_name.log
> +EXP=$test_dir/expect
> +E2FSCK=../e2fsck/e2fsck
> +
> +NAMELEN=255
> +DIRENT_SZ=8
> +BLOCKSZ=1024
> +INODESZ=128
> +CSUM_SZ=8
> +CSUM_TAIL_SZ=12
> +DIRENT_PER_LEAF=$(((BLOCKSZ - CSUM_TAIL_SZ) / (NAMELEN + DIRENT_SZ)))
> +HEADER=32
> +INDEX_SZ=8
> +INDEX_L1=$(((BLOCKSZ - HEADER - CSUM_SZ) / INDEX_SZ))
> +INDEX_L2=$(((BLOCKSZ - DIRENT_SZ - CSUM_SZ) / INDEX_SZ))
> +DIRBLK=$((3 + INDEX_L1 * INDEX_L2))
> +ENTRIES=$((DIRBLK * DIRENT_PER_LEAF))
> +# directory leaf blocks - get twice as much because the leaves won't be full
> +# and there are also other filesystem blocks.
> +FSIZE=$((DIRBLK * 2))
> +
> +$MKE2FS -b 1024 -O extents,64bit,large_dir,uninit_bg,metadata_csum -N 50 \
> +	-I $INODESZ -F $TMPFILE $FSIZE > $OUT.new 2>&1
> +RC=$?
> +if [ $RC -eq 0 ]; then
> +{
> +	# First some initial fs setup to create indexed dir
> +	echo "mkdir /foo"
> +	echo "cd /foo"
> +	touch $TMPFILE.tmp
> +	echo "write $TMPFILE.tmp foofile"
> +	i=0
> +	while test $i -lt $DIRENT_PER_LEAF ; do
> +		printf "ln foofile f%0254u\n" $i
> +		i=$((i + 1));
> +	done
> +	echo "expand ./"
> +	printf "ln foofile f%0254u\n" $i
> +} | $DEBUGFS -w $TMPFILE > /dev/null 2>> $OUT.new
> +	RC=$?
> +	# e2fsck should optimize the dir to become indexed
> +	$E2FSCK -yfD $TMPFILE >> $OUT.new 2>&1
> +	status=$?
> +	echo Exit status is $status >> $OUT.new
> +fi
> +
> +if [ $RC -eq 0 ]; then
> +{
> +	START=$SECONDS
> +	i=$(($DIRENT_PER_LEAF+1))
> +	last=$i
> +	echo "cd /foo"
> +	while test $i -lt $ENTRIES ; do
> +	    ELAPSED=$((SECONDS - START))
> +	    if test $((i % 5000)) -eq 0 -a $ELAPSED -gt 10; then
> +		RATE=$(((i - last) / ELAPSED))
> +		echo "$test_name: $i/$ENTRIES links, ${ELAPSED}s @ $RATE/s" >&2
> +		START=$SECONDS
> +		last=$i
> +	    fi
> +	    printf "ln foofile f%0254u\n" $i
> +	    i=$((i + 1))
> +	done
> +} | $DEBUGFS -w $TMPFILE > /dev/null 2>> $OUT.new
> +	RC=$?
> +fi
> +
> +if [ $RC -eq 0 ]; then
> +	$E2FSCK -yfD $TMPFILE >> $OUT.new 2>&1
> +	status=$?
> +	echo Exit status is $status >> $OUT.new
> +	sed -f $cmd_dir/filter.sed -e "s;$TMPFILE;test.img;" $OUT.new > $OUT
> +	rm -f $OUT.new
> +
> +	cmp -s $OUT $EXP
> +	RC=$?
> +fi
> +if [ $RC -eq 0 ]; then
> +	echo "$test_name: $test_description: ok"
> +	touch $test_name.ok
> +else
> +	echo "$test_name: $test_description: failed"
> +	diff -u $EXP $OUT > $test_name.failed
> +fi
> --
> 2.16.4
> 


Cheers, Andreas






[-- Attachment #2: Message signed with OpenPGP --]
[-- Type: application/pgp-signature, Size: 873 bytes --]

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 7/7] tune2fs: Update dir checksums when clearing dir_index feature
  2020-02-13 10:16 ` [PATCH 7/7] tune2fs: Update dir checksums when clearing dir_index feature Jan Kara
@ 2020-02-18 20:50   ` Andreas Dilger
  2020-02-19 10:23     ` Jan Kara
  2020-03-15 17:15   ` Theodore Y. Ts'o
  1 sibling, 1 reply; 28+ messages in thread
From: Andreas Dilger @ 2020-02-18 20:50 UTC (permalink / raw)
  To: Jan Kara; +Cc: Ted Tso, linux-ext4

[-- Attachment #1: Type: text/plain, Size: 1992 bytes --]

On Feb 13, 2020, at 3:16 AM, Jan Kara <jack@suse.cz> wrote:
> 
> When clearing dir_index feature while metadata_csum is enabled, we have
> to rewrite checksums of all indexed directories to update checksums of
> internal tree nodes.
> 
> Signed-off-by: Jan Kara <jack@suse.cz>
> ---
> 
> +#define REWRITE_EA_FL		0x01	/* Rewrite EA inodes */
> +#define REWRITE_DIR_FL		0x02	/* Rewrite directories */
> +#define REWRITE_NONDIR_FL	0x04	/* Rewrite other inodes */
> +#define REWRITE_ALL (REWRITE_EA_FL | REWRITE_DIR_FL | REWRITE_NONDIR_FL)
> +
> +static void rewrite_inodes_pass(struct rewrite_context *ctx, unsigned int flags)
> {

My preference these days is to put constants like the above into a named
enum, and then use the enum as the argument to the function rather than
a very generic "int flags" argument.  That makes it clear to the reader
what values the flags may hold, and can immediately tag to the enum, like:

enum rewrite_inodes_flags {
	REWRITE_EA_FL	  = 0x01	/* Rewrite EA inodes */
	REWRITE_DIR_FL	  = 0x02	/* Rewrite directories */
	REWRITE_NONDIR_FL = 0x04	/* Rewrite other inodes */
	REWRITE_ALL	  = REWRITE_EA_FL | REWRITE_DIR_FL | REWRITE_NONDIR_FL
};

static void rewrite_inodes_pass(struct rewrite_context *ctx,
				enum rewrite_inodes_flags rif_flags)
static void rewrite_inodes(ext2_filsys fs, enum rewrite_inodes_flags rif_flags)
static void rewrite_metadata_checksums(ext2_filsys fs,
				       enum rewrite_inodes_flags rif_flags)

Otherwise, when looking at a function that takes "int flags" as an argument,
you have to dig through the code to see what kind of flags these are, and
what possible values they might have.  This is often even more confusing when
there are multiple different kinds of flags accessed within a single function
(not the case here, but happens often enough).

I'm not _against_ the patch, just thought I'd suggest an improvement and see
what people think about it.

Cheers, Andreas






[-- Attachment #2: Message signed with OpenPGP --]
[-- Type: application/pgp-signature, Size: 873 bytes --]

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 7/7] tune2fs: Update dir checksums when clearing dir_index feature
  2020-02-18 20:50   ` Andreas Dilger
@ 2020-02-19 10:23     ` Jan Kara
  0 siblings, 0 replies; 28+ messages in thread
From: Jan Kara @ 2020-02-19 10:23 UTC (permalink / raw)
  To: Andreas Dilger; +Cc: Jan Kara, Ted Tso, linux-ext4

On Tue 18-02-20 13:50:33, Andreas Dilger wrote:
> On Feb 13, 2020, at 3:16 AM, Jan Kara <jack@suse.cz> wrote:
> > 
> > When clearing dir_index feature while metadata_csum is enabled, we have
> > to rewrite checksums of all indexed directories to update checksums of
> > internal tree nodes.
> > 
> > Signed-off-by: Jan Kara <jack@suse.cz>
> > ---
> > 
> > +#define REWRITE_EA_FL		0x01	/* Rewrite EA inodes */
> > +#define REWRITE_DIR_FL		0x02	/* Rewrite directories */
> > +#define REWRITE_NONDIR_FL	0x04	/* Rewrite other inodes */
> > +#define REWRITE_ALL (REWRITE_EA_FL | REWRITE_DIR_FL | REWRITE_NONDIR_FL)
> > +
> > +static void rewrite_inodes_pass(struct rewrite_context *ctx, unsigned int flags)
> > {
> 
> My preference these days is to put constants like the above into a named
> enum, and then use the enum as the argument to the function rather than
> a very generic "int flags" argument.  That makes it clear to the reader
> what values the flags may hold, and can immediately tag to the enum, like:
> 
> enum rewrite_inodes_flags {
> 	REWRITE_EA_FL	  = 0x01	/* Rewrite EA inodes */
> 	REWRITE_DIR_FL	  = 0x02	/* Rewrite directories */
> 	REWRITE_NONDIR_FL = 0x04	/* Rewrite other inodes */
> 	REWRITE_ALL	  = REWRITE_EA_FL | REWRITE_DIR_FL | REWRITE_NONDIR_FL
> };
> 
> static void rewrite_inodes_pass(struct rewrite_context *ctx,
> 				enum rewrite_inodes_flags rif_flags)
> static void rewrite_inodes(ext2_filsys fs, enum rewrite_inodes_flags rif_flags)
> static void rewrite_metadata_checksums(ext2_filsys fs,
> 				       enum rewrite_inodes_flags rif_flags)
> 
> Otherwise, when looking at a function that takes "int flags" as an argument,
> you have to dig through the code to see what kind of flags these are, and
> what possible values they might have.  This is often even more confusing when
> there are multiple different kinds of flags accessed within a single function
> (not the case here, but happens often enough).
> 
> I'm not _against_ the patch, just thought I'd suggest an improvement and see
> what people think about it.

Yeah, the documentation with enum type is nice. What I somewhat dislike is
that enum suggests 'enumeration' but we actually use values (like say
REWRITE_EA_FL | REWRITE_DIR_FL) which are not really enumerated in the type
definitition. So your scheme works only because enum in C is just an int
with a lipstick on it. So I'm somewhat undecided. Ted, what's your opinion
on this?

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 1/7] e2fsck: Clarify overflow link count error message
  2020-02-13 10:15 ` [PATCH 1/7] e2fsck: Clarify overflow link count error message Jan Kara
  2020-02-14 19:27   ` Andreas Dilger
@ 2020-03-07 18:52   ` Theodore Y. Ts'o
  1 sibling, 0 replies; 28+ messages in thread
From: Theodore Y. Ts'o @ 2020-03-07 18:52 UTC (permalink / raw)
  To: Jan Kara; +Cc: linux-ext4

On Thu, Feb 13, 2020 at 11:15:56AM +0100, Jan Kara wrote:
> When directory link count is set to overflow value (1) but during pass 4
> we find out the exact link count would fit, we either silently fix this
> (which is not great because e2fsck then reports the fs was modified but
> output doesn't indicate why in any way), or we report that link count is
> wrong and ask whether we should fix it (in case -n option was
> specified). The second case is even more misleading because it suggests
> non-trivial fs corruption which then gets silently fixed on the next
> run. Similarly to how we fix up other non-problems, just create a new
> error message for the case directory link count is not overflown anymore
> and always report it to clarify what is going on.
> 
> Signed-off-by: Jan Kara <jack@suse.cz>

Applied with a fixup to to tests/f_many_subdirs/expect.1, thanks.

(Please remember run "make check" before commiting a change.)

		     	   	  - Ted

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 2/7] e2fsck: Fix indexed dir rehash failure with metadata_csum enabled
  2020-02-13 10:15 ` [PATCH 2/7] e2fsck: Fix indexed dir rehash failure with metadata_csum enabled Jan Kara
  2020-02-14 19:28   ` Andreas Dilger
@ 2020-03-07 23:17   ` Theodore Y. Ts'o
  2020-03-16  9:30     ` Jan Kara
  1 sibling, 1 reply; 28+ messages in thread
From: Theodore Y. Ts'o @ 2020-03-07 23:17 UTC (permalink / raw)
  To: Jan Kara; +Cc: linux-ext4

On Thu, Feb 13, 2020 at 11:15:57AM +0100, Jan Kara wrote:
> E2fsck directory rehashing code can fail with ENOSPC due to a bug in
> ext2fs_htree_intnode_maxrecs() which fails to take metadata checksum
> into account and thus e.g. e2fsck can decide to create 1 indirect level
> of index tree when two are actually needed. Fix the logic to account for
> metadata checksum.
> 
> Signed-off-by: Jan Kara <jack@suse.cz>

Applied with a minor change; I didn't want to make this change:

> -_INLINE_ int ext2fs_htree_intnode_maxrecs(ext2_filsys fs, int blocks)
> +static inline int ext2fs_htree_intnode_maxrecs(ext2_filsys fs, int blocks)

... because it would make ext2fs_htree_intmode_maxrecs disappear from libext2fs.so.

So I changed this:

> +	if (ext2fs_has_feature_metadata_csum(fs->super))

to this:

+       if ((EXT2_SB(fs->super)->s_feature_ro_compat &
+            EXT4_FEATURE_RO_COMPAT_METADATA_CSUM) != 0)

to fix the inline related compilation errors.

					- Ted

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 3/7] ext2fs: Update allocation info earlier in ext2fs_mkdir() and ext2fs_symlink()
  2020-02-13 10:15 ` [PATCH 3/7] ext2fs: Update allocation info earlier in ext2fs_mkdir() and ext2fs_symlink() Jan Kara
  2020-02-14 19:37   ` Andreas Dilger
@ 2020-03-08  0:02   ` Theodore Y. Ts'o
  2020-03-08  2:20     ` Theodore Y. Ts'o
  1 sibling, 1 reply; 28+ messages in thread
From: Theodore Y. Ts'o @ 2020-03-08  0:02 UTC (permalink / raw)
  To: Jan Kara; +Cc: linux-ext4

On Thu, Feb 13, 2020 at 11:15:58AM +0100, Jan Kara wrote:
> Currently, ext2fs_mkdir() and ext2fs_symlink() update allocation bitmaps
> and other information only close to the end of the function, in
> particular after calling to ext2fs_link(). When ext2fs_link() will
> support indexed directories, it will also need to allocate blocks and
> that would cause filesystem corruption in case allocation info isn't
> properly updated. So make sure ext2fs_mkdir() and ext2fs_symlink()
> update allocation info before calling into ext2fs_link().

I'm not sure why there would be file system corruption if the
allocation information isn't updated until later?  Can you explain more?

One problem with moving the alliocation updates earlier is that in
case of errors, they need to be undone.  To be fair, there are other
missing error handling if there are failures, but this change would
make things worse.

					- Ted

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 3/7] ext2fs: Update allocation info earlier in ext2fs_mkdir() and ext2fs_symlink()
  2020-03-08  0:02   ` Theodore Y. Ts'o
@ 2020-03-08  2:20     ` Theodore Y. Ts'o
  2020-03-15 16:15       ` Theodore Y. Ts'o
  0 siblings, 1 reply; 28+ messages in thread
From: Theodore Y. Ts'o @ 2020-03-08  2:20 UTC (permalink / raw)
  To: Jan Kara; +Cc: linux-ext4

On Sat, Mar 07, 2020 at 07:02:20PM -0500, Theodore Y. Ts'o wrote:
> On Thu, Feb 13, 2020 at 11:15:58AM +0100, Jan Kara wrote:
> > Currently, ext2fs_mkdir() and ext2fs_symlink() update allocation bitmaps
> > and other information only close to the end of the function, in
> > particular after calling to ext2fs_link(). When ext2fs_link() will
> > support indexed directories, it will also need to allocate blocks and
> > that would cause filesystem corruption in case allocation info isn't
> > properly updated. So make sure ext2fs_mkdir() and ext2fs_symlink()
> > update allocation info before calling into ext2fs_link().
> 
> I'm not sure why there would be file system corruption if the
> allocation information isn't updated until later?  Can you explain more?

So I tried dropping this patch (and applying patches 1, 2, 4, 5, 6, 7)
and I'm not noticing any test failures....

					- Ted

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 3/7] ext2fs: Update allocation info earlier in ext2fs_mkdir() and ext2fs_symlink()
  2020-03-08  2:20     ` Theodore Y. Ts'o
@ 2020-03-15 16:15       ` Theodore Y. Ts'o
  2020-03-16  9:32         ` Jan Kara
  0 siblings, 1 reply; 28+ messages in thread
From: Theodore Y. Ts'o @ 2020-03-15 16:15 UTC (permalink / raw)
  To: Jan Kara; +Cc: linux-ext4

Ah, I see now why this patch is needed.  If we don't update the block
allocation bitmap to indicate block has been taken, and there is a
need to allocate an htree index block, we will end up allocating the
same block twice.

Thanks, I've applied this patch with the following added.

     	     	  	     	 	   - Ted

diff --git a/lib/ext2fs/mkdir.c b/lib/ext2fs/mkdir.c
index 947003eb..437c8ffc 100644
--- a/lib/ext2fs/mkdir.c
+++ b/lib/ext2fs/mkdir.c
@@ -43,6 +43,7 @@ errcode_t ext2fs_mkdir(ext2_filsys fs, ext2_ino_t parent, ext2_ino_t inum,
 	blk64_t			blk;
 	char			*block = 0;
 	int			inline_data = 0;
+	int			drop_refcount = 0;
 
 	EXT2_CHECK_MAGIC(fs, EXT2_ET_MAGIC_EXT2FS_FILSYS);
 
@@ -149,6 +150,7 @@ errcode_t ext2fs_mkdir(ext2_filsys fs, ext2_ino_t parent, ext2_ino_t inum,
 	if (!inline_data)
 		ext2fs_block_alloc_stats2(fs, blk, +1);
 	ext2fs_inode_alloc_stats2(fs, ino, +1, 1);
+	drop_refcount = 1;
 
 	/*
 	 * Link the directory into the filesystem hierarchy
@@ -181,10 +183,16 @@ errcode_t ext2fs_mkdir(ext2_filsys fs, ext2_ino_t parent, ext2_ino_t inum,
 		if (retval)
 			goto cleanup;
 	}
+	drop_refcount = 0;
 
 cleanup:
 	if (block)
 		ext2fs_free_mem(&block);
+	if (drop_refcount) {
+		if (!inline_data)
+			ext2fs_block_alloc_stats2(fs, blk, -1);
+		ext2fs_inode_alloc_stats2(fs, ino, -1, 1);
+	}
 	return retval;
 
 }
diff --git a/lib/ext2fs/symlink.c b/lib/ext2fs/symlink.c
index 3e07a539..a66fb7ec 100644
--- a/lib/ext2fs/symlink.c
+++ b/lib/ext2fs/symlink.c
@@ -54,6 +54,7 @@ errcode_t ext2fs_symlink(ext2_filsys fs, ext2_ino_t parent, ext2_ino_t ino,
 	int			fastlink, inlinelink;
 	unsigned int		target_len;
 	char			*block_buf = 0;
+	int			drop_refcount = 0;
 
 	EXT2_CHECK_MAGIC(fs, EXT2_ET_MAGIC_EXT2FS_FILSYS);
 
@@ -168,6 +169,7 @@ need_block:
 	if (!fastlink && !inlinelink)
 		ext2fs_block_alloc_stats2(fs, blk, +1);
 	ext2fs_inode_alloc_stats2(fs, ino, +1, 0);
+	drop_refcount = 1;
 
 	/*
 	 * Link the symlink into the filesystem hierarchy
@@ -185,10 +187,16 @@ need_block:
 		if (retval)
 			goto cleanup;
 	}
+	drop_refcount = 0;
 
 cleanup:
 	if (block_buf)
 		ext2fs_free_mem(&block_buf);
+	if (drop_refcount) {
+		if (!fastlink && !inlinelink)
+			ext2fs_block_alloc_stats2(fs, blk, -1);
+		ext2fs_inode_alloc_stats2(fs, ino, -1, 0);
+	}
 	return retval;
 }
 

^ permalink raw reply related	[flat|nested] 28+ messages in thread

* Re: [PATCH 4/7] ext2fs: Implement dir entry creation in htree directories
  2020-02-13 10:15 ` [PATCH 4/7] ext2fs: Implement dir entry creation in htree directories Jan Kara
@ 2020-03-15 16:43   ` Theodore Y. Ts'o
  0 siblings, 0 replies; 28+ messages in thread
From: Theodore Y. Ts'o @ 2020-03-15 16:43 UTC (permalink / raw)
  To: Jan Kara; +Cc: linux-ext4

On Thu, Feb 13, 2020 at 11:15:59AM +0100, Jan Kara wrote:
> Implement proper creation of new directory entries in htree directories
> in ext2fs_link(). So far we just cleared EXT2_INDEX_FL and treated
> directory as unindexed however this results in mismatched checksums if
> metadata checksums are in use because checksums are placed in different
> places depending on htree node type.
> 
> Signed-off-by: Jan Kara <jack@suse.cz>

Applied, thanks.

					- Ted

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 5/7] tests: Modify f_large_dir test to excercise indexed dir handling
  2020-02-13 10:16 ` [PATCH 5/7] tests: Modify f_large_dir test to excercise indexed dir handling Jan Kara
  2020-02-18 20:29   ` Andreas Dilger
@ 2020-03-15 16:43   ` Theodore Y. Ts'o
  1 sibling, 0 replies; 28+ messages in thread
From: Theodore Y. Ts'o @ 2020-03-15 16:43 UTC (permalink / raw)
  To: Jan Kara; +Cc: linux-ext4

On Thu, Feb 13, 2020 at 11:16:00AM +0100, Jan Kara wrote:
> Modify f_large_dir test to create indexed directory and create entries
> in it. That way the new code in ext2fs_link() for addition of entries
> into indexed directories gets executed including various special cases
> when growing htree.
> 
> Signed-off-by: Jan Kara <jack@suse.cz>

Applied, thanks.

					- Ted

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 7/7] tune2fs: Update dir checksums when clearing dir_index feature
  2020-02-13 10:16 ` [PATCH 7/7] tune2fs: Update dir checksums when clearing dir_index feature Jan Kara
  2020-02-18 20:50   ` Andreas Dilger
@ 2020-03-15 17:15   ` Theodore Y. Ts'o
  2020-03-16  0:11     ` Andreas Dilger
                       ` (2 more replies)
  1 sibling, 3 replies; 28+ messages in thread
From: Theodore Y. Ts'o @ 2020-03-15 17:15 UTC (permalink / raw)
  To: Jan Kara; +Cc: linux-ext4

On Thu, Feb 13, 2020 at 11:16:02AM +0100, Jan Kara wrote:
> When clearing dir_index feature while metadata_csum is enabled, we have
> to rewrite checksums of all indexed directories to update checksums of
> internal tree nodes.
> 
> Signed-off-by: Jan Kara <jack@suse.cz>

Thanks, applied.

With regards to the enum, I agree with Jan that using an enum for
bitfields isn't a great fit.  Also, in this case, where it's for a
static function and the definitions don't go beyond a single file, the
advantages of using an enum so we can have strong typing is much less
useful.

One thing which I did notice when trying to test this patch is that

mke2fs -t ext4 -d /usr/projects/e2fsprogs /tmp/foo.img 1G

...does not create any indexed directories.  That's because the
changes to ext2fs_link() only teach e2fsprogs how to add a link to a
directory which is already indexing.  We don't have code which takes a
directory with a single directory block and which doesn't have
directory indexing flag enabled, and converts to a indexed directory.

That might be a nice thing to add at some point.

     	      	     	      	     - Ted

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 7/7] tune2fs: Update dir checksums when clearing dir_index feature
  2020-03-15 17:15   ` Theodore Y. Ts'o
@ 2020-03-16  0:11     ` Andreas Dilger
  2020-03-16  9:27     ` Jan Kara
  2020-03-26 14:27     ` Jan Kara
  2 siblings, 0 replies; 28+ messages in thread
From: Andreas Dilger @ 2020-03-16  0:11 UTC (permalink / raw)
  To: Theodore Y. Ts'o; +Cc: Jan Kara, linux-ext4

[-- Attachment #1: Type: text/plain, Size: 999 bytes --]

On Mar 15, 2020, at 11:15 AM, Theodore Y. Ts'o <tytso@mit.edu> wrote:
> 
> With regards to the enum, I agree with Jan that using an enum for
> bitfields isn't a great fit.  Also, in this case, where it's for a
> static function and the definitions don't go beyond a single file, the
> advantages of using an enum so we can have strong typing is much less
> useful.

I don't think that "enum" has to mean "sequential integers", but rather
"an listing of related constants" as defined by Wikipedia:

    An enumeration is a complete, ordered listing of all
    the items in a collection.

Giving a list of related constants a name makes the code easier to
understand, especially when the variables have totally generic names
like "flags".  I'm not saying it isn't possible to figure out what
the possible values of that variable are, by hunting around the code
to see what is assigned to it, but making the code easier to understand
at first reading should have value by itself?

Cheers, Andreas






[-- Attachment #2: Message signed with OpenPGP --]
[-- Type: application/pgp-signature, Size: 873 bytes --]

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 7/7] tune2fs: Update dir checksums when clearing dir_index feature
  2020-03-15 17:15   ` Theodore Y. Ts'o
  2020-03-16  0:11     ` Andreas Dilger
@ 2020-03-16  9:27     ` Jan Kara
  2020-03-26 14:27     ` Jan Kara
  2 siblings, 0 replies; 28+ messages in thread
From: Jan Kara @ 2020-03-16  9:27 UTC (permalink / raw)
  To: Theodore Y. Ts'o; +Cc: Jan Kara, linux-ext4

On Sun 15-03-20 13:15:20, Theodore Y. Ts'o wrote:
> On Thu, Feb 13, 2020 at 11:16:02AM +0100, Jan Kara wrote:
> > When clearing dir_index feature while metadata_csum is enabled, we have
> > to rewrite checksums of all indexed directories to update checksums of
> > internal tree nodes.
> > 
> > Signed-off-by: Jan Kara <jack@suse.cz>
> 
> Thanks, applied.
> 
> With regards to the enum, I agree with Jan that using an enum for
> bitfields isn't a great fit.  Also, in this case, where it's for a
> static function and the definitions don't go beyond a single file, the
> advantages of using an enum so we can have strong typing is much less
> useful.
> 
> One thing which I did notice when trying to test this patch is that
> 
> mke2fs -t ext4 -d /usr/projects/e2fsprogs /tmp/foo.img 1G
> 
> ...does not create any indexed directories.  That's because the
> changes to ext2fs_link() only teach e2fsprogs how to add a link to a
> directory which is already indexing.  We don't have code which takes a
> directory with a single directory block and which doesn't have
> directory indexing flag enabled, and converts to a indexed directory.

Yes, I know and it's kind of deliberate because not all users of
ext2fs_link() want that to happen (e.g. linking into lost+found). So I
wasn't sure what a proper API for this functionality would be and decided
to leave that decision for later. Maybe we could export the "optimize dir"
functionality from e2fsck to be generally available in libext2fs?

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 2/7] e2fsck: Fix indexed dir rehash failure with metadata_csum enabled
  2020-03-07 23:17   ` Theodore Y. Ts'o
@ 2020-03-16  9:30     ` Jan Kara
  0 siblings, 0 replies; 28+ messages in thread
From: Jan Kara @ 2020-03-16  9:30 UTC (permalink / raw)
  To: Theodore Y. Ts'o; +Cc: Jan Kara, linux-ext4

On Sat 07-03-20 18:17:19, Theodore Y. Ts'o wrote:
> On Thu, Feb 13, 2020 at 11:15:57AM +0100, Jan Kara wrote:
> > E2fsck directory rehashing code can fail with ENOSPC due to a bug in
> > ext2fs_htree_intnode_maxrecs() which fails to take metadata checksum
> > into account and thus e.g. e2fsck can decide to create 1 indirect level
> > of index tree when two are actually needed. Fix the logic to account for
> > metadata checksum.
> > 
> > Signed-off-by: Jan Kara <jack@suse.cz>
> 
> Applied with a minor change; I didn't want to make this change:
> 
> > -_INLINE_ int ext2fs_htree_intnode_maxrecs(ext2_filsys fs, int blocks)
> > +static inline int ext2fs_htree_intnode_maxrecs(ext2_filsys fs, int blocks)
> 
> ... because it would make ext2fs_htree_intmode_maxrecs disappear from
> libext2fs.so.

Aha! I was wondering what's going on with those strange inline
statements...

> So I changed this:
> 
> > +	if (ext2fs_has_feature_metadata_csum(fs->super))
> 
> to this:
> 
> +       if ((EXT2_SB(fs->super)->s_feature_ro_compat &
> +            EXT4_FEATURE_RO_COMPAT_METADATA_CSUM) != 0)
> 
> to fix the inline related compilation errors.

Thanks for fixing this!

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 3/7] ext2fs: Update allocation info earlier in ext2fs_mkdir() and ext2fs_symlink()
  2020-03-15 16:15       ` Theodore Y. Ts'o
@ 2020-03-16  9:32         ` Jan Kara
  0 siblings, 0 replies; 28+ messages in thread
From: Jan Kara @ 2020-03-16  9:32 UTC (permalink / raw)
  To: Theodore Y. Ts'o; +Cc: Jan Kara, linux-ext4

On Sun 15-03-20 12:15:09, Theodore Y. Ts'o wrote:
> Ah, I see now why this patch is needed.  If we don't update the block
> allocation bitmap to indicate block has been taken, and there is a
> need to allocate an htree index block, we will end up allocating the
> same block twice.

Exactly.

> Thanks, I've applied this patch with the following added.

Thanks, that's even better!

								Honza

> 
>      	     	  	     	 	   - Ted
> 
> diff --git a/lib/ext2fs/mkdir.c b/lib/ext2fs/mkdir.c
> index 947003eb..437c8ffc 100644
> --- a/lib/ext2fs/mkdir.c
> +++ b/lib/ext2fs/mkdir.c
> @@ -43,6 +43,7 @@ errcode_t ext2fs_mkdir(ext2_filsys fs, ext2_ino_t parent, ext2_ino_t inum,
>  	blk64_t			blk;
>  	char			*block = 0;
>  	int			inline_data = 0;
> +	int			drop_refcount = 0;
>  
>  	EXT2_CHECK_MAGIC(fs, EXT2_ET_MAGIC_EXT2FS_FILSYS);
>  
> @@ -149,6 +150,7 @@ errcode_t ext2fs_mkdir(ext2_filsys fs, ext2_ino_t parent, ext2_ino_t inum,
>  	if (!inline_data)
>  		ext2fs_block_alloc_stats2(fs, blk, +1);
>  	ext2fs_inode_alloc_stats2(fs, ino, +1, 1);
> +	drop_refcount = 1;
>  
>  	/*
>  	 * Link the directory into the filesystem hierarchy
> @@ -181,10 +183,16 @@ errcode_t ext2fs_mkdir(ext2_filsys fs, ext2_ino_t parent, ext2_ino_t inum,
>  		if (retval)
>  			goto cleanup;
>  	}
> +	drop_refcount = 0;
>  
>  cleanup:
>  	if (block)
>  		ext2fs_free_mem(&block);
> +	if (drop_refcount) {
> +		if (!inline_data)
> +			ext2fs_block_alloc_stats2(fs, blk, -1);
> +		ext2fs_inode_alloc_stats2(fs, ino, -1, 1);
> +	}
>  	return retval;
>  
>  }
> diff --git a/lib/ext2fs/symlink.c b/lib/ext2fs/symlink.c
> index 3e07a539..a66fb7ec 100644
> --- a/lib/ext2fs/symlink.c
> +++ b/lib/ext2fs/symlink.c
> @@ -54,6 +54,7 @@ errcode_t ext2fs_symlink(ext2_filsys fs, ext2_ino_t parent, ext2_ino_t ino,
>  	int			fastlink, inlinelink;
>  	unsigned int		target_len;
>  	char			*block_buf = 0;
> +	int			drop_refcount = 0;
>  
>  	EXT2_CHECK_MAGIC(fs, EXT2_ET_MAGIC_EXT2FS_FILSYS);
>  
> @@ -168,6 +169,7 @@ need_block:
>  	if (!fastlink && !inlinelink)
>  		ext2fs_block_alloc_stats2(fs, blk, +1);
>  	ext2fs_inode_alloc_stats2(fs, ino, +1, 0);
> +	drop_refcount = 1;
>  
>  	/*
>  	 * Link the symlink into the filesystem hierarchy
> @@ -185,10 +187,16 @@ need_block:
>  		if (retval)
>  			goto cleanup;
>  	}
> +	drop_refcount = 0;
>  
>  cleanup:
>  	if (block_buf)
>  		ext2fs_free_mem(&block_buf);
> +	if (drop_refcount) {
> +		if (!fastlink && !inlinelink)
> +			ext2fs_block_alloc_stats2(fs, blk, -1);
> +		ext2fs_inode_alloc_stats2(fs, ino, -1, 0);
> +	}
>  	return retval;
>  }
>  
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 7/7] tune2fs: Update dir checksums when clearing dir_index feature
  2020-03-15 17:15   ` Theodore Y. Ts'o
  2020-03-16  0:11     ` Andreas Dilger
  2020-03-16  9:27     ` Jan Kara
@ 2020-03-26 14:27     ` Jan Kara
  2 siblings, 0 replies; 28+ messages in thread
From: Jan Kara @ 2020-03-26 14:27 UTC (permalink / raw)
  To: Theodore Y. Ts'o; +Cc: Jan Kara, linux-ext4

On Sun 15-03-20 13:15:20, Theodore Y. Ts'o wrote:
> On Thu, Feb 13, 2020 at 11:16:02AM +0100, Jan Kara wrote:
> > When clearing dir_index feature while metadata_csum is enabled, we have
> > to rewrite checksums of all indexed directories to update checksums of
> > internal tree nodes.
> > 
> > Signed-off-by: Jan Kara <jack@suse.cz>
> 
> Thanks, applied.

Hum, I'm still not seeing this series in e2fsprogs git. Did it get stuck
somewhere?

								Honza

-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 28+ messages in thread

end of thread, other threads:[~2020-03-26 14:27 UTC | newest]

Thread overview: 28+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-02-13 10:15 [PATCH 0/7 v2] e2fsprogs: Better handling of indexed directories Jan Kara
2020-02-13 10:15 ` [PATCH 1/7] e2fsck: Clarify overflow link count error message Jan Kara
2020-02-14 19:27   ` Andreas Dilger
2020-03-07 18:52   ` Theodore Y. Ts'o
2020-02-13 10:15 ` [PATCH 2/7] e2fsck: Fix indexed dir rehash failure with metadata_csum enabled Jan Kara
2020-02-14 19:28   ` Andreas Dilger
2020-03-07 23:17   ` Theodore Y. Ts'o
2020-03-16  9:30     ` Jan Kara
2020-02-13 10:15 ` [PATCH 3/7] ext2fs: Update allocation info earlier in ext2fs_mkdir() and ext2fs_symlink() Jan Kara
2020-02-14 19:37   ` Andreas Dilger
2020-03-08  0:02   ` Theodore Y. Ts'o
2020-03-08  2:20     ` Theodore Y. Ts'o
2020-03-15 16:15       ` Theodore Y. Ts'o
2020-03-16  9:32         ` Jan Kara
2020-02-13 10:15 ` [PATCH 4/7] ext2fs: Implement dir entry creation in htree directories Jan Kara
2020-03-15 16:43   ` Theodore Y. Ts'o
2020-02-13 10:16 ` [PATCH 5/7] tests: Modify f_large_dir test to excercise indexed dir handling Jan Kara
2020-02-18 20:29   ` Andreas Dilger
2020-03-15 16:43   ` Theodore Y. Ts'o
2020-02-13 10:16 ` [PATCH 6/7] tests: Add test to excercise indexed directories with metadata_csum Jan Kara
2020-02-18 20:34   ` Andreas Dilger
2020-02-13 10:16 ` [PATCH 7/7] tune2fs: Update dir checksums when clearing dir_index feature Jan Kara
2020-02-18 20:50   ` Andreas Dilger
2020-02-19 10:23     ` Jan Kara
2020-03-15 17:15   ` Theodore Y. Ts'o
2020-03-16  0:11     ` Andreas Dilger
2020-03-16  9:27     ` Jan Kara
2020-03-26 14:27     ` Jan Kara

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).