* [PATCH 00/34] e2fsprogs Summer 2014 patchbomb, part 6
@ 2014-09-13 22:11 Darrick J. Wong
  2014-09-13 22:11 ` [PATCH 01/34] e2fsck: offer to clear overlapping extents Darrick J. Wong
                   ` (33 more replies)
  0 siblings, 34 replies; 67+ messages in thread
From: Darrick J. Wong @ 2014-09-13 22:11 UTC
  To: tytso, darrick.wong; +Cc: linux-ext4

Hi all,

This is part 6 of the Summer 2014 e2fsprogs patchset.  Hooray, the end
is in sight!  There are a few bugfixes at the start, but these are
mostly new features for 1.43.

The first six patches fix bugs.  Patch 1 implements detection and
correction of overlapping extents; patch 2 fixes collapsing directory
holes on bigalloc FSes; and patch 3 zaps s_jnl_blocks some more.
Patch 4 fixes ext2fs_new_block2() to call the get_alloc_block hook;
this is the successor to an earlier patch that open-coded the hook
call.  Patch 5 fixes needs_recovery flag handling when modifying the
journal with debugfs, and patch 6 fixes the build system on OSX.

Patches 7-9 implement v2 of the e2fsck readahead functionality,
which promises to reduce fsck runtime by 10-30%.  You might want to
read the report: http://marc.info/?l=linux-ext4&m=140755433701165&w=2
("e2fsck readahead speedup performance report") for all the juicy
details!  The only change since last time was to plumb cache_readahead
calls into test_io.c, per tytso's request.

Patch 10 teaches dumpe2fs to emit group descriptor data in a machine
readable format for ease of automated testing.  Patch 11 reorganizes
the human-readable group descriptor output of dumpe2fs per this week's
discussion.

Patches 12-14 enhance tune2fs, debugfs, dumpe2fs, e2image, and e2fsck to
tell the user what kind of data might be living at the path spec
they're passing into those tools if an ext4 superblock cannot be
found.

Patches 15-16 hook up ext2fs_zero_blocks2 to the BLKZEROOUT blockdev
ioctl or the FALLOC_FL_ZERO_RANGE feature of fallocate() to zero out
data blocks if possible.  This will be useful for zeroing inode
tables, clearing the journal, and the future ext2fs_fallocate API.
There's also a cleanup patch that ensures that the zero_blocks2 static
buffer gets cleaned up when the FS exits and converts each area that
was writing zero blocks to use the zero_blocks2 call instead.

Patches 17-18 enhance ext2fs_bmap2() to allow the creation of
uninitialized extents.  The functionality is already there; this
simply adds a flag for clients to create uninitialized mappings.
There's also a patch to the fileio routines to handle uninitialized
extents.  These patches are unchanged from December, aside from having
grown some more test cases.

Patches 19-21 add to resize2fs the ability to convert a filesystem to
and from 64bit mode.  These patches are unchanged from December, aside
from having grown some more test cases.

Patches 22-28 implement fallocate for e2fsprogs, and modify Ted's
mk_hugefiles functionality to use it.  The general fallocate API call
is (regrettably) much more complex than what hugefiles did, since it
must grapple with the possibility that the file already has mapped
blocks.  There were also a lot of bigalloc related subtleties; at some
point it might behoove someone to write an extent tree compressor.
The API call has been plumbed into debugfs, with accompanying tests of
both the fallocate and punch calls.

Patches 29-32 implement fuse2fs, a FUSE server based on libext2fs.
Primarily I've been using it to shake out bugs in the library via
xfstests and the metadata checksumming test program.  It can also be
used to mount ext4 on any OS supporting FUSE, and it can also mount
64k-block filesystems on x86, though I'd be wary of using rw mode.
fuse2fs depends on these new APIs: xattr editing, uninit extent
handling, and the new fallocate call.

Patches 33-34 provide the metadata checksumming test script.  Its
primary advantage over 'make check' is that it allows one to specify a 
variety of different mkfs and mount options.  It's also growing more
tests as a result of fuse2fs exercise.

I've tested these e2fsprogs changes against the -next branch as of
9/11.  The patches have been tested against the 'make check' suite and
some amount of e2fuzz testing on x86_64, i686, ppc64, and aarch64.

Comments and questions are, as always, welcome.

--D


* [PATCH 01/34] e2fsck: offer to clear overlapping extents
  2014-09-13 22:11 [PATCH 00/34] e2fsprogs Summer 2014 patchbomb, part 6 Darrick J. Wong
@ 2014-09-13 22:11 ` Darrick J. Wong
  2014-09-19  1:45   ` Theodore Ts'o
  2014-09-13 22:11 ` [PATCH 02/34] e2fsck: fix sliding the directory block down on bigalloc Darrick J. Wong
                   ` (32 subsequent siblings)
  33 siblings, 1 reply; 67+ messages in thread
From: Darrick J. Wong @ 2014-09-13 22:11 UTC
  To: tytso, darrick.wong; +Cc: linux-ext4

If in the course of iterating extents we find that an otherwise
valid-seeming second extent maps the same logical blocks as a
previously examined first extent, offer to clear the duplicate
mapping.

The test for this is already in f_extents.
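
As a rough illustration of the idea (not the e2fsck region code
itself), overlap detection can be sketched as a sorted list of claimed
logical ranges, where a claim fails if it intersects an earlier one.
The names span, span_set, span_claim and span_free_all are
hypothetical, and the sketch neither merges adjacent ranges nor guards
against start + len overflowing.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical, simplified range tracker for illustration only. */
struct span {
	uint64_t start;		/* first logical block covered */
	uint64_t end;		/* one past the last logical block */
	struct span *next;
};

struct span_set {
	struct span *head;	/* sorted by start, never overlapping */
};

/*
 * Try to claim [start, start + len).  Returns 0 if the range was free
 * and is now recorded, 1 if it collides with an earlier claim, and -1
 * if memory could not be allocated.
 */
static int span_claim(struct span_set *set, uint64_t start, uint64_t len)
{
	struct span **pp, *s, *n;
	uint64_t end = start + len;

	assert(len > 0);
	for (pp = &set->head; (s = *pp) != NULL; pp = &s->next) {
		if (end <= s->start)
			break;		/* new range lies entirely before s */
		if (start < s->end)
			return 1;	/* ranges intersect */
		/* otherwise the new range lies after s; keep walking */
	}
	n = malloc(sizeof(*n));
	if (!n)
		return -1;
	n->start = start;
	n->end = end;
	n->next = *pp;
	*pp = n;
	return 0;
}

/* Release every recorded range. */
static void span_free_all(struct span_set *set)
{
	while (set->head) {
		struct span *s = set->head;

		set->head = s->next;
		free(s);
	}
}
```

Claiming logical block 3, len 2 twice returns 0 the first time and 1
the second; a nonzero return is the condition that the patch turns
into PR_1_EXTENT_COLLISION.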

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
 e2fsck/e2fsck.h          |    2 +-
 e2fsck/pass1.c           |   15 +++++++++++++++
 e2fsck/problem.c         |   12 +++++++++++-
 e2fsck/problem.h         |    6 ++++++
 tests/f_extents/expect.1 |   14 ++++++++++----
 tests/f_extents/expect.2 |    2 +-
 6 files changed, 44 insertions(+), 7 deletions(-)


diff --git a/e2fsck/e2fsck.h b/e2fsck/e2fsck.h
index 8f16218..6ca3a6a 100644
--- a/e2fsck/e2fsck.h
+++ b/e2fsck/e2fsck.h
@@ -381,7 +381,7 @@ struct e2fsck_struct {
 };
 
 /* Used by the region allocation code */
-typedef __u32 region_addr_t;
+typedef __u64 region_addr_t;
 typedef struct region_struct *region_t;
 
 #ifndef HAVE_STRNLEN
diff --git a/e2fsck/pass1.c b/e2fsck/pass1.c
index 7175c77..aaeb70a 100644
--- a/e2fsck/pass1.c
+++ b/e2fsck/pass1.c
@@ -94,6 +94,7 @@ struct process_block_struct {
 	ext2fs_block_bitmap fs_meta_blocks;
 	e2fsck_t	ctx;
 	blk64_t		bad_ref;
+	region_t	region;
 };
 
 struct process_inode_block {
@@ -2395,6 +2396,10 @@ static void scan_extent_node(e2fsck_t ctx, struct problem_context *pctx,
 			  (1 << (21 - ctx->fs->super->s_log_block_size))))
 			problem = PR_1_TOOBIG_DIR;
 
+		if (is_leaf && problem == 0 && extent.e_len > 0 &&
+		    region_allocate(pb->region, extent.e_lblk, extent.e_len))
+			problem = PR_1_EXTENT_COLLISION;
+
 		/*
 		 * Uninitialized blocks in a directory?  Clear the flag and
 		 * we'll interpret the blocks later.
@@ -2695,6 +2700,14 @@ static void check_blocks_extents(e2fsck_t ctx, struct problem_context *pctx,
 		ctx->extent_depth_count[info.max_depth]++;
 	}
 
+	pb->region = region_create(0, info.max_lblk);
+	if (!pb->region) {
+		ext2fs_extent_free(ehandle);
+		fix_problem(ctx, PR_1_EXTENT_ALLOC_REGION_ABORT, pctx);
+		ctx->flags |= E2F_FLAG_ABORT;
+		return;
+	}
+
 	eof_lblk = ((EXT2_I_SIZE(inode) + fs->blocksize - 1) >>
 		EXT2_BLOCK_SIZE_BITS(fs->super)) - 1;
 	scan_extent_node(ctx, pctx, pb, 0, 0, eof_lblk, ehandle, 1);
@@ -2706,6 +2719,8 @@ static void check_blocks_extents(e2fsck_t ctx, struct problem_context *pctx,
 				   "check_blocks_extents");
 		pctx->errcode = 0;
 	}
+	region_free(pb->region);
+	pb->region = NULL;
 	ext2fs_extent_free(ehandle);
 }
 
diff --git a/e2fsck/problem.c b/e2fsck/problem.c
index 9818539..174f45a 100644
--- a/e2fsck/problem.c
+++ b/e2fsck/problem.c
@@ -784,7 +784,7 @@ static struct e2fsck_problem problem_table[] = {
 
 	/* Error allocating EA region allocation structure */
 	{ PR_1_EA_ALLOC_REGION_ABORT,
-	  N_("@A @a @b %b.  "),
+	  N_("@A @a region allocation structure.  "),
 	  PROMPT_NONE, PR_FATAL},
 
 	/* Error EA allocation collision */
@@ -1091,6 +1091,16 @@ static struct e2fsck_problem problem_table[] = {
 	  N_("Bad block list says the bad block list @i is bad.  "),
 	  PROMPT_CLEAR_INODE, 0 },
 
+	/* Error allocating extent region allocation structure */
+	{ PR_1_EXTENT_ALLOC_REGION_ABORT,
+	  N_("@A @x region allocation structure.  "),
+	  PROMPT_NONE, PR_FATAL},
+
+	/* Inode has a duplicate extent mapping */
+	{ PR_1_EXTENT_COLLISION,
+	  N_("@i %i has a duplicate @x mapping\n\t(logical @b %c, @n physical @b %b, len %N)\n"),
+	  PROMPT_CLEAR, 0 },
+
 	/* Pass 1b errors */
 
 	/* Pass 1B: Rescan for duplicate/bad blocks */
diff --git a/e2fsck/problem.h b/e2fsck/problem.h
index 5b32aeb..3c28166 100644
--- a/e2fsck/problem.h
+++ b/e2fsck/problem.h
@@ -635,6 +635,12 @@ struct problem_context {
 /* badblocks is in badblocks */
 #define PR_1_BADBLOCKS_IN_BADBLOCKS		0x01007B
 
+/* can't allocate extent region */
+#define PR_1_EXTENT_ALLOC_REGION_ABORT		0x01007C
+
+/* leaf extent collision */
+#define PR_1_EXTENT_COLLISION			0x01007D
+
 /*
  * Pass 1b errors
  */
diff --git a/tests/f_extents/expect.1 b/tests/f_extents/expect.1
index 953162c..aeebc7b 100644
--- a/tests/f_extents/expect.1
+++ b/tests/f_extents/expect.1
@@ -11,6 +11,12 @@ Inode 12, i_blocks is 34, should be 0.  Fix? yes
 Inode 13 missing EXTENT_FL, but is in extents format
 Fix? yes
 
+Inode 16 has a duplicate extent mapping
+	(logical block 3, invalid physical block 4613, len 2)
+Clear? yes
+
+Inode 16, i_blocks is 16, should be 12.  Fix? yes
+
 Inode 17 has an invalid extent
 	(logical block 0, invalid physical block 22011707397135, len 15)
 Clear? yes
@@ -31,13 +37,13 @@ Entry 'fbad-flag' in / (2) has deleted/unused inode 18.  Clear? yes
 Pass 3: Checking directory connectivity
 Pass 4: Checking reference counts
 Pass 5: Checking group summary information
-Block bitmap differences:  -1081 +4611 -(5121--5142)
+Block bitmap differences:  -1081 +4611 -(4613--4614) -(5121--5142)
 Fix? yes
 
-Free blocks count wrong for group #0 (7081, counted=7098).
+Free blocks count wrong for group #0 (7081, counted=7100).
 Fix? yes
 
-Free blocks count wrong (7081, counted=7098).
+Free blocks count wrong (7081, counted=7100).
 Fix? yes
 
 Inode bitmap differences:  -18
@@ -51,5 +57,5 @@ Fix? yes
 
 
 test_filesys: ***** FILE SYSTEM WAS MODIFIED *****
-test_filesys: 18/256 files (0.0% non-contiguous), 1094/8192 blocks
+test_filesys: 18/256 files (5.6% non-contiguous), 1092/8192 blocks
 Exit status is 1
diff --git a/tests/f_extents/expect.2 b/tests/f_extents/expect.2
index 6162cdf..5c9d6a6 100644
--- a/tests/f_extents/expect.2
+++ b/tests/f_extents/expect.2
@@ -3,5 +3,5 @@ Pass 2: Checking directory structure
 Pass 3: Checking directory connectivity
 Pass 4: Checking reference counts
 Pass 5: Checking group summary information
-test_filesys: 18/256 files (0.0% non-contiguous), 1094/8192 blocks
+test_filesys: 18/256 files (5.6% non-contiguous), 1092/8192 blocks
 Exit status is 0



* [PATCH 02/34] e2fsck: fix sliding the directory block down on bigalloc
  2014-09-13 22:11 [PATCH 00/34] e2fsprogs Summer 2014 patchbomb, part 6 Darrick J. Wong
  2014-09-13 22:11 ` [PATCH 01/34] e2fsck: offer to clear overlapping extents Darrick J. Wong
@ 2014-09-13 22:11 ` Darrick J. Wong
  2014-09-19  1:45   ` Theodore Ts'o
  2014-09-13 22:11 ` [PATCH 03/34] misc: zero s_jnl_blocks when adding journal online or removing external journal Darrick J. Wong
                   ` (31 subsequent siblings)
  33 siblings, 1 reply; 67+ messages in thread
From: Darrick J. Wong @ 2014-09-13 22:11 UTC
  To: tytso, darrick.wong; +Cc: linux-ext4

If we find a hole in a directory on a bigalloc filesystem, we need to
obey the cluster alignment rules when collapsing the gap to avoid
later complaints.

Specifically, the calculation of the new logical cluster number was
incorrect, and we need to ensure that the logical cluster alignment
respects the physical cluster alignment, since we've concluded that
the extent's logical block number is wrong.
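
The corrected arithmetic can be sketched in isolation as follows.
This is only an illustration: align_lblk_to_pblk is a hypothetical
helper, and it assumes that the cluster ratio is a power of two and
that the cluster mask is ratio - 1 (the low bits that give a block's
offset within its cluster).

```c
#include <assert.h>
#include <stdint.h>

/*
 * Round next_free_lblk up to a cluster boundary, then give it the same
 * offset within the cluster as the extent's physical block, so that
 * logical and physical cluster alignment agree.
 */
static uint64_t align_lblk_to_pblk(uint64_t next_free_lblk, uint64_t pblk,
				   uint64_t cluster_ratio)
{
	uint64_t mask = cluster_ratio - 1;

	/* The mask trick only works for power-of-two ratios. */
	assert(cluster_ratio != 0 && (cluster_ratio & mask) == 0);

	return ((next_free_lblk + mask) & ~mask) | (pblk & mask);
}
```

With a ratio of 16, a next free logical block of 17 and a physical
block of 1027 (offset 3 within its cluster), the result is 35: the
next cluster boundary, 32, plus the physical offset.  The removed
lines masked with the cluster mask itself rather than its complement,
which keeps only the in-cluster offset and discards the cluster
number.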

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
 e2fsck/pass1.c            |    6 ++--
 tests/f_holedir4/expect.1 |   68 +++++++++++++++++++++++++++++++++++++++++++++
 tests/f_holedir4/expect.2 |   11 +++++++
 tests/f_holedir4/image.gz |  Bin
 tests/f_holedir4/name     |    1 +
 5 files changed, 83 insertions(+), 3 deletions(-)
 create mode 100644 tests/f_holedir4/expect.1
 create mode 100644 tests/f_holedir4/expect.2
 create mode 100644 tests/f_holedir4/image.gz
 create mode 100644 tests/f_holedir4/name


diff --git a/e2fsck/pass1.c b/e2fsck/pass1.c
index aaeb70a..db5273e 100644
--- a/e2fsck/pass1.c
+++ b/e2fsck/pass1.c
@@ -2568,9 +2568,9 @@ report_problem:
 			new_lblk = pb->last_block + 1;
 			if (EXT2FS_CLUSTER_RATIO(ctx->fs) > 1)
 				new_lblk = ((new_lblk +
-					     EXT2FS_CLUSTER_RATIO(ctx->fs)) &
-					    EXT2FS_CLUSTER_MASK(ctx->fs)) |
-					   (extent.e_lblk &
+					     EXT2FS_CLUSTER_RATIO(ctx->fs) - 1) &
+					    ~EXT2FS_CLUSTER_MASK(ctx->fs)) |
+					   (extent.e_pblk &
 					    EXT2FS_CLUSTER_MASK(ctx->fs));
 			pctx->blk = extent.e_lblk;
 			pctx->blk2 = new_lblk;
diff --git a/tests/f_holedir4/expect.1 b/tests/f_holedir4/expect.1
new file mode 100644
index 0000000..1e66fb6
--- /dev/null
+++ b/tests/f_holedir4/expect.1
@@ -0,0 +1,68 @@
+Pass 1: Checking inodes, blocks, and sizes
+Directory inode 12 block 211 should be at block 25.  Fix? yes
+
+Inode 12, i_size is 4096, should be 110592.  Fix? yes
+
+Inode 12, i_blocks is 128, should be 256.  Fix? yes
+
+Pass 2: Checking directory structure
+Directory inode 12 has an unallocated block #2.  Allocate? yes
+
+Directory inode 12 has an unallocated block #3.  Allocate? yes
+
+Directory inode 12 has an unallocated block #4.  Allocate? yes
+
+Directory inode 12 has an unallocated block #5.  Allocate? yes
+
+Directory inode 12 has an unallocated block #6.  Allocate? yes
+
+Directory inode 12 has an unallocated block #7.  Allocate? yes
+
+Directory inode 12 has an unallocated block #8.  Allocate? yes
+
+Directory inode 12 has an unallocated block #9.  Allocate? yes
+
+Directory inode 12 has an unallocated block #10.  Allocate? yes
+
+Directory inode 12 has an unallocated block #11.  Allocate? yes
+
+Directory inode 12 has an unallocated block #12.  Allocate? yes
+
+Directory inode 12 has an unallocated block #13.  Allocate? yes
+
+Directory inode 12 has an unallocated block #14.  Allocate? yes
+
+Directory inode 12 has an unallocated block #15.  Allocate? yes
+
+Directory inode 12 has an unallocated block #16.  Allocate? yes
+
+Directory inode 12 has an unallocated block #17.  Allocate? yes
+
+Directory inode 12 has an unallocated block #18.  Allocate? yes
+
+Directory inode 12 has an unallocated block #19.  Allocate? yes
+
+Directory inode 12 has an unallocated block #20.  Allocate? yes
+
+Directory inode 12 has an unallocated block #21.  Allocate? yes
+
+Directory inode 12 has an unallocated block #22.  Allocate? yes
+
+Directory inode 12 has an unallocated block #23.  Allocate? yes
+
+Directory inode 12 has an unallocated block #24.  Allocate? yes
+
+Pass 3: Checking directory connectivity
+Pass 3A: Optimizing directories
+Pass 4: Checking reference counts
+Pass 5: Checking group summary information
+Free blocks count wrong for group #0 (26, counted=25).
+Fix? yes
+
+Free blocks count wrong (416, counted=400).
+Fix? yes
+
+
+test_filesys: ***** FILE SYSTEM WAS MODIFIED *****
+test_filesys: 13/32 files (7.7% non-contiguous), 112/512 blocks
+Exit status is 1
diff --git a/tests/f_holedir4/expect.2 b/tests/f_holedir4/expect.2
new file mode 100644
index 0000000..1f0e351
--- /dev/null
+++ b/tests/f_holedir4/expect.2
@@ -0,0 +1,11 @@
+Pass 1: Checking inodes, blocks, and sizes
+Inode 12, i_blocks is 3072, should be 128.  Fix? yes
+
+Pass 2: Checking directory structure
+Pass 3: Checking directory connectivity
+Pass 4: Checking reference counts
+Pass 5: Checking group summary information
+
+test_filesys: ***** FILE SYSTEM WAS MODIFIED *****
+test_filesys: 13/32 files (0.0% non-contiguous), 112/512 blocks
+Exit status is 1
diff --git a/tests/f_holedir4/image.gz b/tests/f_holedir4/image.gz
new file mode 100644
index 0000000000000000000000000000000000000000..8ab12454dc873c181981df34621d9b6c39551ee1
GIT binary patch
literal 2535
zcmb2|=3vmV6AocwetXk5Ti8*C{lUzQ9+|;GJqliqAGO%@n*z85_ij-?SjV7yDR`q(
z=o-14{B7#LRbP27QHW#FoLBW-!}bTG%aJ>;Rc_u8Z_01kx8%HVowfCB-)a9<bc{VY
z4iuIy+~}z|=j_?j5>87k=JB^qKD+myfLPb(O)EE^O5uo(-oF3n1b_b3i+k(5b#5J8
zRIF<*|5@7hbfxv)%Dl?jKVMHxH|C$Vzy6*@N0fQI&fPhc_JO}Yq`iA}>S^asiL8ZF
zqRzivvpCxAlX98Gz3Im9R<ko4V9fBT_kSA0(`)LWIw#|tU$oZ4kJ`#C3=9nJ_dnnG
zVa^X^y-2NNWCYS=f(Q9IbI*hHSIvrdAkz$xwjbiF7f(G|))#y?e*N>a1<QZ^KIF9d
zZGIkm-I1D<=rcPnwZ85z`5{vy-S~IIh8gCsxAJ}8A93gZzx!)#|GxOg-hQeo`CssB
zlmDLk=l=h)HLmQ*bNjbH??>&+v!46ffA=eY%km;S(cj*&tLk&-zpmeYBP+irRPFZ7
ze=X(L<jbz!*R6hif1e#W?%%8Mbc&PD)%f3<p~dM2@8|O|XQq|Cd${!FP51BbP3L9L
ztv288Y$^RzVmXf@Ip!nuIQUDP(QKWuYNxN|*CMa0s|ta^-Z0NNhXbhOz%O|qx#1r>
zLOH!yKQ;#$8-^CoE_(HJ%Bs-EOh9}0rmWcduiw9`Zf}kcy_`XwJr?`gio6t$EqC1?
vXQl=;;==0J|0`eDlV{kduF((}4S~@R7!83z6ao#-FVz?F-&)SVpuhkC?Cc78

literal 0
HcmV?d00001

diff --git a/tests/f_holedir4/name b/tests/f_holedir4/name
new file mode 100644
index 0000000..5eb55c1
--- /dev/null
+++ b/tests/f_holedir4/name
@@ -0,0 +1 @@
+bigalloc directory with hole and misaligned extent after hole



* [PATCH 03/34] misc: zero s_jnl_blocks when adding journal online or removing external journal
  2014-09-13 22:11 [PATCH 00/34] e2fsprogs Summer 2014 patchbomb, part 6 Darrick J. Wong
  2014-09-13 22:11 ` [PATCH 01/34] e2fsck: offer to clear overlapping extents Darrick J. Wong
  2014-09-13 22:11 ` [PATCH 02/34] e2fsck: fix sliding the directory block down on bigalloc Darrick J. Wong
@ 2014-09-13 22:11 ` Darrick J. Wong
  2014-09-19  1:45   ` Theodore Ts'o
  2014-09-13 22:11 ` [PATCH 04/34] libext2fs: ext2fs_new_block2() should call alloc_block hook Darrick J. Wong
                   ` (30 subsequent siblings)
  33 siblings, 1 reply; 67+ messages in thread
From: Darrick J. Wong @ 2014-09-13 22:11 UTC
  To: tytso, darrick.wong; +Cc: linux-ext4, thomas_reardon

Erase s_jnl_blocks when removing an external journal, or adding an
internal journal online.  We can't add the backup for the internal
journal because we have no good way to get the indirect block or ETB
addresses, so the best we can do is hope that the user runs e2fsck,
which will correct that.  We are motivated to erase during external
journal removal to state emphatically that there's no journal.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reported-by: thomas_reardon@hotmail.com
---
 lib/ext2fs/mkjournal.c |    2 ++
 misc/tune2fs.c         |    1 +
 2 files changed, 3 insertions(+)


diff --git a/lib/ext2fs/mkjournal.c b/lib/ext2fs/mkjournal.c
index 0a7cd18..6f3a862 100644
--- a/lib/ext2fs/mkjournal.c
+++ b/lib/ext2fs/mkjournal.c
@@ -585,6 +585,8 @@ errcode_t ext2fs_add_journal_inode2(ext2_filsys fs, blk_t num_blocks,
 			goto errout;
 		}
 		journal_ino = st.st_ino;
+		memset(fs->super->s_jnl_blocks, 0,
+		       sizeof(fs->super->s_jnl_blocks));
 	} else {
 		if ((mount_flags & EXT2_MF_BUSY) &&
 		    !(fs->flags & EXT2_FLAG_EXCLUSIVE)) {
diff --git a/misc/tune2fs.c b/misc/tune2fs.c
index 80debe7..510e936 100644
--- a/misc/tune2fs.c
+++ b/misc/tune2fs.c
@@ -308,6 +308,7 @@ no_valid_journal:
 		return 1;
 	}
 	fs->super->s_journal_dev = 0;
+	memset(fs->super->s_jnl_blocks, 0, sizeof(fs->super->s_jnl_blocks));
 	uuid_clear(fs->super->s_journal_uuid);
 	ext2fs_mark_super_dirty(fs);
 	fputs(_("Journal removed\n"), stdout);



* [PATCH 04/34] libext2fs: ext2fs_new_block2() should call alloc_block hook
  2014-09-13 22:11 [PATCH 00/34] e2fsprogs Summer 2014 patchbomb, part 6 Darrick J. Wong
                   ` (2 preceding siblings ...)
  2014-09-13 22:11 ` [PATCH 03/34] misc: zero s_jnl_blocks when adding journal online or removing external journal Darrick J. Wong
@ 2014-09-13 22:11 ` Darrick J. Wong
  2014-09-13 22:11 ` [PATCH 05/34] debugfs: manage needs_recover feature when messing with the journal Darrick J. Wong
                   ` (29 subsequent siblings)
  33 siblings, 0 replies; 67+ messages in thread
From: Darrick J. Wong @ 2014-09-13 22:11 UTC
  To: tytso, darrick.wong; +Cc: linux-ext4

If ext2fs_new_block2() is called without a specific block map, we
should call the alloc_block hook before checking fs->block_map.  This
helps us to avoid a bug in e2fsck where we need to allocate a block
but instead of consulting block_found_map, we use the FS bitmaps,
which (prior to pass 5) could be wrong.
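
The pointer swap that keeps a hook from recursing back into itself can
be sketched with a toy allocator.  Everything here (toy_fs,
toy_new_block, toy_hook, and the trivial fallback allocator) is
hypothetical and stands in for the real libext2fs structures; only the
save, clear, call, restore sequence mirrors the patch.

```c
#include <assert.h>
#include <stddef.h>

struct toy_fs;
typedef int (*alloc_hook_t)(struct toy_fs *fs, unsigned long goal,
			    unsigned long *ret);

/* Hypothetical stand-in for a filesystem handle with an alloc hook. */
struct toy_fs {
	alloc_hook_t get_alloc_block;	/* optional client hook */
	unsigned long next_free;	/* toy fallback allocator state */
	int hook_calls;			/* how often the hook has run */
};

static int toy_new_block(struct toy_fs *fs, unsigned long goal,
			 unsigned long *ret)
{
	assert(fs != NULL && ret != NULL);

	if (fs->get_alloc_block) {
		alloc_hook_t hook = fs->get_alloc_block;
		int err;

		/*
		 * Clear the pointer while the hook runs so that a hook
		 * which calls back into toy_new_block() takes the
		 * fallback path instead of recursing forever.
		 */
		fs->get_alloc_block = NULL;
		err = hook(fs, goal, ret);
		fs->get_alloc_block = hook;	/* always restore */
		return err;
	}

	/* Fallback: hand out the goal if it is still ahead of us. */
	*ret = goal > fs->next_free ? goal : fs->next_free;
	fs->next_free = *ret + 1;
	return 0;
}

/* A hook that (deliberately) calls back into the allocator. */
static int toy_hook(struct toy_fs *fs, unsigned long goal,
		    unsigned long *ret)
{
	fs->hook_calls++;
	return toy_new_block(fs, goal, ret);
}
```

Each outer call runs the hook exactly once, and the hook pointer is
back in place afterwards, so later allocations still go through it.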

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
 e2fsck/pass1.c     |    2 +-
 lib/ext2fs/alloc.c |   15 +++++++++++++++
 2 files changed, 16 insertions(+), 1 deletion(-)


diff --git a/e2fsck/pass1.c b/e2fsck/pass1.c
index db5273e..04bb465 100644
--- a/e2fsck/pass1.c
+++ b/e2fsck/pass1.c
@@ -3725,7 +3725,7 @@ static errcode_t e2fsck_get_alloc_block(ext2_filsys fs, blk64_t goal,
 				return retval;
 		}
 
-		retval = ext2fs_new_block2(fs, goal, 0, &new_block);
+		retval = ext2fs_new_block2(fs, goal, fs->block_map, &new_block);
 		if (retval)
 			return retval;
 	}
diff --git a/lib/ext2fs/alloc.c b/lib/ext2fs/alloc.c
index 578fd7f..d1c1a84 100644
--- a/lib/ext2fs/alloc.c
+++ b/lib/ext2fs/alloc.c
@@ -137,9 +137,23 @@ errcode_t ext2fs_new_block2(ext2_filsys fs, blk64_t goal,
 {
 	errcode_t retval;
 	blk64_t	b = 0;
+	errcode_t (*gab)(ext2_filsys fs, blk64_t goal, blk64_t *ret);
 
 	EXT2_CHECK_MAGIC(fs, EXT2_ET_MAGIC_EXT2FS_FILSYS);
 
+	if (!map && fs->get_alloc_block) {
+		/*
+		 * In case there are clients out there whose get_alloc_block
+		 * handlers call ext2fs_new_block2 with a NULL block map,
+		 * temporarily swap out the function pointer so that we don't
+		 * end up in an infinite loop.
+		 */
+		gab = fs->get_alloc_block;
+		fs->get_alloc_block = NULL;
+		retval = gab(fs, goal, &b);
+		fs->get_alloc_block = gab;
+		goto allocated;
+	}
 	if (!map)
 		map = fs->block_map;
 	if (!map)
@@ -153,6 +167,7 @@ errcode_t ext2fs_new_block2(ext2_filsys fs, blk64_t goal,
 	if ((retval == ENOENT) && (goal != fs->super->s_first_data_block))
 		retval = ext2fs_find_first_zero_block_bitmap2(map,
 			fs->super->s_first_data_block, goal - 1, &b);
+allocated:
 	if (retval == ENOENT)
 		return EXT2_ET_BLOCK_ALLOC_FAIL;
 	if (retval)



* [PATCH 05/34] debugfs: manage needs_recover feature when messing with the journal
  2014-09-13 22:11 [PATCH 00/34] e2fsprogs Summer 2014 patchbomb, part 6 Darrick J. Wong
                   ` (3 preceding siblings ...)
  2014-09-13 22:11 ` [PATCH 04/34] libext2fs: ext2fs_new_block2() should call alloc_block hook Darrick J. Wong
@ 2014-09-13 22:11 ` Darrick J. Wong
  2014-09-19  6:01   ` Theodore Ts'o
  2014-09-13 22:11 ` [PATCH 06/34] debugfs: add LIBINTL to debugfs link command Darrick J. Wong
                   ` (28 subsequent siblings)
  33 siblings, 1 reply; 67+ messages in thread
From: Darrick J. Wong @ 2014-09-13 22:11 UTC
  To: tytso, darrick.wong; +Cc: linux-ext4

Set the needs_recover incompat feature when debugfs writes journal
transactions so that we actually replay the journal contents at the
next mount.

Likewise, clear it if we successfully recover the journal.
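
The flag lifecycle amounts to the following sketch.  The struct, the
function names and TOY_INCOMPAT_RECOVER are hypothetical stand-ins for
the superblock, the debugfs journal helpers and
EXT3_FEATURE_INCOMPAT_RECOVER; the replay result is passed in as a
plain error code to keep the example self-contained.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_INCOMPAT_RECOVER	0x0004u	/* hypothetical flag bit */

/* Hypothetical stand-in for the in-memory superblock. */
struct toy_super {
	uint32_t s_feature_incompat;
	int dirty;			/* needs writing back to disk */
};

/* After writing a transaction, the journal must be replayed later. */
static void toy_commit_transaction(struct toy_super *sb)
{
	assert(sb != NULL);
	sb->s_feature_incompat |= TOY_INCOMPAT_RECOVER;
	sb->dirty = 1;
}

/* Only a successful replay may clear the flag. */
static void toy_replay_journal(struct toy_super *sb, int replay_err)
{
	assert(sb != NULL);
	if (replay_err)
		return;		/* leave the flag set; journal still pending */
	sb->s_feature_incompat &= ~TOY_INCOMPAT_RECOVER;
	sb->dirty = 1;
}
```

A failed replay leaves the flag alone, so the next mount or fsck still
knows there is journal content to recover.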

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
 debugfs/do_journal.c |    7 +++++++
 1 file changed, 7 insertions(+)


diff --git a/debugfs/do_journal.c b/debugfs/do_journal.c
index 711ed27..a17af6e 100644
--- a/debugfs/do_journal.c
+++ b/debugfs/do_journal.c
@@ -158,6 +158,8 @@ static errcode_t journal_commit_trans(journal_transaction_t *trans)
 	trans->flags &= ~J_TRANS_OPEN;
 	trans->block++;
 
+	trans->fs->super->s_feature_incompat |= EXT3_FEATURE_INCOMPAT_RECOVER;
+	ext2fs_mark_super_dirty(trans->fs);
 error:
 	if (cbh)
 		brelse(cbh);
@@ -979,4 +981,9 @@ void do_journal_run(int argc, char *argv[])
 	err = ext2fs_run_ext3_journal(&current_fs);
 	if (err)
 		com_err("journal_run", err, "while recovering journal");
+	else {
+		current_fs->super->s_feature_incompat &=
+				~EXT3_FEATURE_INCOMPAT_RECOVER;
+		ext2fs_mark_super_dirty(current_fs);
+	}
 }



* [PATCH 06/34] debugfs: add LIBINTL to debugfs link command
  2014-09-13 22:11 [PATCH 00/34] e2fsprogs Summer 2014 patchbomb, part 6 Darrick J. Wong
                   ` (4 preceding siblings ...)
  2014-09-13 22:11 ` [PATCH 05/34] debugfs: manage needs_recover feature when messing with the journal Darrick J. Wong
@ 2014-09-13 22:11 ` Darrick J. Wong
  2014-09-19  4:46   ` Theodore Ts'o
  2014-09-13 22:11 ` [PATCH 07/34] ext2fs: add readahead method to improve scanning Darrick J. Wong
                   ` (27 subsequent siblings)
  33 siblings, 1 reply; 67+ messages in thread
From: Darrick J. Wong @ 2014-09-13 22:11 UTC
  To: tytso, darrick.wong; +Cc: linux-ext4

Since debugfs now links in the journal code (which in turn depends on
internationalization libraries) we must add a linker option to pull
that in on Mac OSX.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
 debugfs/Makefile.in |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)


diff --git a/debugfs/Makefile.in b/debugfs/Makefile.in
index 0837151..6220943 100644
--- a/debugfs/Makefile.in
+++ b/debugfs/Makefile.in
@@ -35,13 +35,13 @@ SRCS= debug_cmds.c $(srcdir)/debugfs.c $(srcdir)/util.c $(srcdir)/ls.c \
 	$(srcdir)/../e2fsck/recovery.c $(srcdir)/do_journal.c
 
 LIBS= $(LIBQUOTA) $(LIBEXT2FS) $(LIBE2P) $(LIBSS) $(LIBCOM_ERR) $(LIBBLKID) \
-	$(LIBUUID) $(SYSLIBS)
+	$(LIBUUID) $(SYSLIBS) $(LIBINTL)
 DEPLIBS= $(DEPLIBQUOTA) $(LIBEXT2FS) $(LIBE2P) $(DEPLIBSS) $(DEPLIBCOM_ERR) \
 	$(DEPLIBBLKID) $(DEPLIBUUID)
 
 STATIC_LIBS= $(STATIC_LIBQUOTA) $(STATIC_LIBEXT2FS) $(STATIC_LIBSS) \
 	$(STATIC_LIBCOM_ERR) $(STATIC_LIBBLKID) $(STATIC_LIBUUID) \
-	$(STATIC_LIBE2P) $(SYSLIBS)
+	$(STATIC_LIBE2P) $(SYSLIBS) $(LIBINTL)
 STATIC_DEPLIBS= $(STATIC_LIBEXT2FS) $(DEPSTATIC_LIBSS) \
 		$(DEPSTATIC_LIBCOM_ERR) $(DEPSTATIC_LIBUUID) \
 		$(DEPSTATIC_LIBE2P)



* [PATCH 07/34] ext2fs: add readahead method to improve scanning
  2014-09-13 22:11 [PATCH 00/34] e2fsprogs Summer 2014 patchbomb, part 6 Darrick J. Wong
                   ` (5 preceding siblings ...)
  2014-09-13 22:11 ` [PATCH 06/34] debugfs: add LIBINTL to debugfs link command Darrick J. Wong
@ 2014-09-13 22:11 ` Darrick J. Wong
  2014-09-19 16:15   ` Theodore Ts'o
  2014-09-13 22:12 ` [PATCH 08/34] libext2fs/e2fsck: provide routines to read-ahead metadata Darrick J. Wong
                   ` (26 subsequent siblings)
  33 siblings, 1 reply; 67+ messages in thread
From: Darrick J. Wong @ 2014-09-13 22:11 UTC
  To: tytso, darrick.wong; +Cc: linux-ext4, Andreas Dilger

From: Andreas Dilger <adilger@whamcloud.com>

Add a readahead method for prefetching ranges of disk blocks.  This is
useful for inode table scanning, and other large contiguous ranges of
blocks, and may also prove useful for random block prefetch, since it
will allow reordering of the IO without waiting synchronously for the
reads to complete.

It is currently using the posix_fadvise(POSIX_FADV_WILLNEED)
interface, as this proved most efficient during our testing.

[darrick.wong@oracle.com]
Make the arguments to the readahead function take the same ULL values
as the other IO functions, and return an appropriate error code when
fadvise isn't available.

v2: Plumb in test_io.c for cache readahead.
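
Outside of the io_manager plumbing, the underlying call can be
sketched as below.  hint_readahead is a hypothetical standalone
helper, not part of libext2fs; it takes a raw file descriptor and
block size instead of an io_channel, and returns ENOTSUP where the
patch returns EXT2_ET_OP_NOT_SUPPORTED.

```c
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 600	/* request the posix_fadvise() declaration */
#endif
#ifndef _FILE_OFFSET_BITS
#define _FILE_OFFSET_BITS 64	/* 64-bit off_t on 32-bit hosts */
#endif
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>

/*
 * Ask the kernel to start reading `count` blocks beginning at `block`
 * into the page cache.  Returns 0 on success or an error number.
 */
static int hint_readahead(int fd, unsigned long long block,
			  unsigned long long count,
			  unsigned int block_size)
{
	assert(block_size > 0);
#ifdef POSIX_FADV_WILLNEED
	/* posix_fadvise() returns the error number directly. */
	return posix_fadvise(fd, (off_t)block * block_size,
			     (off_t)count * block_size,
			     POSIX_FADV_WILLNEED);
#else
	(void)fd;
	(void)block;
	(void)count;
	return ENOTSUP;		/* no fadvise on this platform */
#endif
}
```

The call is advisory and does not wait for the reads to finish, so a
caller would typically issue the hint for an upcoming range and then
carry on with its ordinary reads.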

Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
 lib/ext2fs/ext2_io.h    |    8 +++++++-
 lib/ext2fs/io_manager.c |    9 +++++++++
 lib/ext2fs/test_io.c    |   22 ++++++++++++++++++++++
 lib/ext2fs/unix_io.c    |   27 ++++++++++++++++++++++++---
 4 files changed, 62 insertions(+), 4 deletions(-)


diff --git a/lib/ext2fs/ext2_io.h b/lib/ext2fs/ext2_io.h
index 1894fb8..4c5a5c5 100644
--- a/lib/ext2fs/ext2_io.h
+++ b/lib/ext2fs/ext2_io.h
@@ -90,7 +90,10 @@ struct struct_io_manager {
 					int count, const void *data);
 	errcode_t (*discard)(io_channel channel, unsigned long long block,
 			     unsigned long long count);
-	long	reserved[16];
+	errcode_t (*cache_readahead)(io_channel channel,
+				     unsigned long long block,
+				     unsigned long long count);
+	long	reserved[15];
 };
 
 #define IO_FLAG_RW		0x0001
@@ -124,6 +127,9 @@ extern errcode_t io_channel_discard(io_channel channel,
 				    unsigned long long count);
 extern errcode_t io_channel_alloc_buf(io_channel channel,
 				      int count, void *ptr);
+extern errcode_t io_channel_cache_readahead(io_channel io,
+					    unsigned long long block,
+					    unsigned long long count);
 
 /* unix_io.c */
 extern io_manager unix_io_manager;
diff --git a/lib/ext2fs/io_manager.c b/lib/ext2fs/io_manager.c
index 34e4859..dc5888d 100644
--- a/lib/ext2fs/io_manager.c
+++ b/lib/ext2fs/io_manager.c
@@ -128,3 +128,12 @@ errcode_t io_channel_alloc_buf(io_channel io, int count, void *ptr)
 	else
 		return ext2fs_get_mem(size, ptr);
 }
+
+errcode_t io_channel_cache_readahead(io_channel io, unsigned long long block,
+				     unsigned long long count)
+{
+	if (!io->manager->cache_readahead)
+		return EXT2_ET_OP_NOT_SUPPORTED;
+
+	return io->manager->cache_readahead(io, block, count);
+}
diff --git a/lib/ext2fs/test_io.c b/lib/ext2fs/test_io.c
index 6f0d035..b03a939 100644
--- a/lib/ext2fs/test_io.c
+++ b/lib/ext2fs/test_io.c
@@ -85,6 +85,7 @@ void (*test_io_cb_write_byte)
 #define TEST_FLAG_DUMP			0x10
 #define TEST_FLAG_SET_OPTION		0x20
 #define TEST_FLAG_DISCARD		0x40
+#define TEST_FLAG_READAHEAD		0x80
 
 static void test_dump_block(io_channel channel,
 			    struct test_private_data *data,
@@ -486,6 +487,26 @@ static errcode_t test_discard(io_channel channel, unsigned long long block,
 	return retval;
 }
 
+static errcode_t test_cache_readahead(io_channel channel,
+				      unsigned long long block,
+				      unsigned long long count)
+{
+	struct test_private_data *data;
+	errcode_t	retval = 0;
+
+	EXT2_CHECK_MAGIC(channel, EXT2_ET_MAGIC_IO_CHANNEL);
+	data = (struct test_private_data *) channel->private_data;
+	EXT2_CHECK_MAGIC(data, EXT2_ET_MAGIC_TEST_IO_CHANNEL);
+
+	if (data->real)
+		retval = io_channel_cache_readahead(data->real, block, count);
+	if (data->flags & TEST_FLAG_READAHEAD)
+		fprintf(data->outfile,
+			"Test_io: readahead(%llu, %llu) returned %s\n",
+			block, count, retval ? error_message(retval) : "OK");
+	return retval;
+}
+
 static struct struct_io_manager struct_test_manager = {
 	.magic		= EXT2_ET_MAGIC_IO_MANAGER,
 	.name		= "Test I/O Manager",
@@ -501,6 +522,7 @@ static struct struct_io_manager struct_test_manager = {
 	.read_blk64	= test_read_blk64,
 	.write_blk64	= test_write_blk64,
 	.discard	= test_discard,
+	.cache_readahead	= test_cache_readahead,
 };
 
 io_manager test_io_manager = &struct_test_manager;
diff --git a/lib/ext2fs/unix_io.c b/lib/ext2fs/unix_io.c
index eb39b28..189adce 100644
--- a/lib/ext2fs/unix_io.c
+++ b/lib/ext2fs/unix_io.c
@@ -15,6 +15,9 @@
  * %End-Header%
  */
 
+#define _XOPEN_SOURCE 600
+#define _DARWIN_C_SOURCE
+#define _FILE_OFFSET_BITS 64
 #define _LARGEFILE_SOURCE
 #define _LARGEFILE64_SOURCE
 #ifndef _GNU_SOURCE
@@ -35,6 +38,9 @@
 #ifdef __linux__
 #include <sys/utsname.h>
 #endif
+#if HAVE_SYS_TYPES_H
+#include <sys/types.h>
+#endif
 #ifdef HAVE_SYS_IOCTL_H
 #include <sys/ioctl.h>
 #endif
@@ -44,9 +50,6 @@
 #if HAVE_SYS_STAT_H
 #include <sys/stat.h>
 #endif
-#if HAVE_SYS_TYPES_H
-#include <sys/types.h>
-#endif
 #if HAVE_SYS_RESOURCE_H
 #include <sys/resource.h>
 #endif
@@ -830,6 +833,23 @@ static errcode_t unix_write_blk64(io_channel channel, unsigned long long block,
 #endif /* NO_IO_CACHE */
 }
 
+static errcode_t unix_cache_readahead(io_channel channel,
+				      unsigned long long block,
+				      unsigned long long count)
+{
+#ifdef POSIX_FADV_WILLNEED
+	struct unix_private_data *data;
+
+	data = (struct unix_private_data *)channel->private_data;
+	return posix_fadvise(data->dev,
+			     (ext2_loff_t)block * channel->block_size,
+			     (ext2_loff_t)count * channel->block_size,
+			     POSIX_FADV_WILLNEED);
+#else
+	return EXT2_ET_OP_NOT_SUPPORTED;
+#endif
+}
+
 static errcode_t unix_write_blk(io_channel channel, unsigned long block,
 				int count, const void *buf)
 {
@@ -981,6 +1001,7 @@ static struct struct_io_manager struct_unix_manager = {
 	.read_blk64	= unix_read_blk64,
 	.write_blk64	= unix_write_blk64,
 	.discard	= unix_discard,
+	.cache_readahead	= unix_cache_readahead,
 };
 
 io_manager unix_io_manager = &struct_unix_manager;



* [PATCH 08/34] libext2fs/e2fsck: provide routines to read-ahead metadata
  2014-09-13 22:11 [PATCH 00/34] e2fsprogs Summer 2014 patchbomb, part 6 Darrick J. Wong
                   ` (6 preceding siblings ...)
  2014-09-13 22:11 ` [PATCH 07/34] ext2fs: add readahead method to improve scanning Darrick J. Wong
@ 2014-09-13 22:12 ` Darrick J. Wong
  2014-09-13 22:12 ` [PATCH 09/34] e2fsck: read-ahead metadata during passes 1, 2, and 4 Darrick J. Wong
                   ` (25 subsequent siblings)
  33 siblings, 0 replies; 67+ messages in thread
From: Darrick J. Wong @ 2014-09-13 22:12 UTC (permalink / raw)
  To: tytso, darrick.wong; +Cc: linux-ext4

This patch adds to e2fsck the ability to pre-fetch metadata into the
page cache in the hopes of speeding up fsck runs.  There are two new
functions -- the first allows a caller to readahead a list of blocks,
and the second is a helper function that uses that first mechanism to
load group data (bitmaps, inode tables).

These new e2fsck routines require the addition of a dblist API to
allow us to iterate a subset of a dblist.  This will enable
incremental directory block readahead in e2fsck pass 2.
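
As an illustration only (this is not the code in the patch), the
combination of "iterate a window of a sorted list" and "merge adjacent
blocks into one readahead request" can be sketched as below.  The
names readahead_runs, run_fn, run_log and log_run are hypothetical,
and the issue() callback stands in for io_channel_cache_readahead().

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical callback type: receives one contiguous run of blocks. */
typedef void (*run_fn)(unsigned long long start, unsigned long long len,
		       void *priv);

/* Example callback state: total blocks seen and the last run issued. */
struct run_log {
	unsigned long long total, last_start, last_len;
};

static void log_run(unsigned long long start, unsigned long long len,
		    void *priv)
{
	struct run_log *log = priv;

	log->total += len;
	log->last_start = start;
	log->last_len = len;
}

/*
 * Walk entries [start, start + count) of a sorted block list, merge
 * adjacent block numbers into runs, and hand each run to issue().
 * Returns the number of runs issued.
 */
static size_t readahead_runs(const unsigned long long *blocks, size_t nr,
			     size_t start, size_t count,
			     run_fn issue, void *priv)
{
	unsigned long long run_start = 0, run_len = 0;
	size_t i, end, runs = 0;

	assert(issue != NULL);

	/* Clamp the window to the list, as a subset iterator would. */
	if (start > nr)
		start = nr;
	end = (count > nr - start) ? nr : start + count;

	for (i = start; i < end; i++) {
		if (run_len && blocks[i] == run_start + run_len) {
			run_len++;	/* extends the current run */
			continue;
		}
		if (run_len) {		/* flush the finished run */
			issue(run_start, run_len, priv);
			runs++;
		}
		run_start = blocks[i];
		run_len = 1;
	}
	if (run_len) {			/* flush the trailing run */
		issue(run_start, run_len, priv);
		runs++;
	}
	return runs;
}
```

For the list {10, 11, 12, 20, 21, 40}, walking the whole list issues
three requests, while walking the window (start=1, count=3) issues
two.  Because the list is never copied, a caller can slide the window
forward as it consumes entries.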

There's also a function to estimate the readahead buffer size for a
given FS.
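
A rough sketch of that estimate, written as a standalone function so
it can be tried outside e2fsck: the name guess_readahead_kb and its
parameters are hypothetical, and the memory size is passed in by the
caller rather than being probed from the system.

```c
#include <assert.h>

/*
 * Return a readahead buffer size in KiB, or 0 to disable readahead.
 * mem_bytes is total physical memory; 0 means "unknown".
 */
static unsigned long long guess_readahead_kb(unsigned int blocksize,
					     unsigned int itable_blocks_per_group,
					     unsigned long long mem_bytes)
{
	unsigned long long guess;

	assert(blocksize > 0);

	/* One block group's worth of inode table, in bytes. */
	guess = (unsigned long long)blocksize * itable_blocks_per_group;

	/* Only enable readahead if the buffer is under 1/100th of RAM. */
	if (mem_bytes > guess * 100)
		return guess / 1024;
	return 0;
}
```

With 4KiB blocks and 512 inode table blocks per group the buffer is
2MiB, so readahead stays enabled on a machine with 8GiB of RAM and is
disabled on one with 100MiB.  An unknown memory size (0) also disables
it.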

v2: Add an API to create a dblist with a given number of list elements
pre-allocated.  This enables us to save ~2ms per call to
e2fsck_readahead() (assuming a 2MB RA buffer) by not having to
repeatedly call ext2fs_resize_mem as we add blocks to the list.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
 configure           |    2 
 configure.in        |    1 
 e2fsck/Makefile.in  |    8 +-
 e2fsck/e2fsck.h     |   18 ++++
 e2fsck/readahead.c  |  252 +++++++++++++++++++++++++++++++++++++++++++++++++++
 e2fsck/util.c       |   51 ++++++++++
 lib/config.h.in     |    3 +
 lib/ext2fs/dblist.c |   21 ++++
 lib/ext2fs/ext2fs.h |   10 ++
 9 files changed, 358 insertions(+), 8 deletions(-)
 create mode 100644 e2fsck/readahead.c


diff --git a/configure b/configure
index 65449c9..0ea5fc5 100755
--- a/configure
+++ b/configure
@@ -12404,7 +12404,7 @@ fi
 done
 
 fi
-for ac_header in  	dirent.h 	errno.h 	execinfo.h 	getopt.h 	malloc.h 	mntent.h 	paths.h 	semaphore.h 	setjmp.h 	signal.h 	stdarg.h 	stdint.h 	stdlib.h 	termios.h 	termio.h 	unistd.h 	utime.h 	attr/xattr.h 	linux/falloc.h 	linux/fd.h 	linux/major.h 	linux/loop.h 	net/if_dl.h 	netinet/in.h 	sys/disklabel.h 	sys/disk.h 	sys/file.h 	sys/ioctl.h 	sys/mkdev.h 	sys/mman.h 	sys/mount.h 	sys/prctl.h 	sys/resource.h 	sys/select.h 	sys/socket.h 	sys/sockio.h 	sys/stat.h 	sys/syscall.h 	sys/sysmacros.h 	sys/time.h 	sys/types.h 	sys/un.h 	sys/wait.h
+for ac_header in  	dirent.h 	errno.h 	execinfo.h 	getopt.h 	malloc.h 	mntent.h 	paths.h 	semaphore.h 	setjmp.h 	signal.h 	stdarg.h 	stdint.h 	stdlib.h 	termios.h 	termio.h 	unistd.h 	utime.h 	attr/xattr.h 	linux/falloc.h 	linux/fd.h 	linux/major.h 	linux/loop.h 	net/if_dl.h 	netinet/in.h 	sys/disklabel.h 	sys/disk.h 	sys/file.h 	sys/ioctl.h 	sys/mkdev.h 	sys/mman.h 	sys/mount.h 	sys/prctl.h 	sys/resource.h 	sys/select.h 	sys/socket.h 	sys/sockio.h 	sys/stat.h 	sys/syscall.h 	sys/sysctl.h 	sys/sysmacros.h 	sys/time.h 	sys/types.h 	sys/un.h 	sys/wait.h
 do :
   as_ac_Header=`$as_echo "ac_cv_header_$ac_header" | $as_tr_sh`
 ac_fn_c_check_header_mongrel "$LINENO" "$ac_header" "$as_ac_Header" "$ac_includes_default"
diff --git a/configure.in b/configure.in
index 97a58c5..5106f96 100644
--- a/configure.in
+++ b/configure.in
@@ -941,6 +941,7 @@ AC_CHECK_HEADERS(m4_flatten([
 	sys/sockio.h
 	sys/stat.h
 	sys/syscall.h
+	sys/sysctl.h
 	sys/sysmacros.h
 	sys/time.h
 	sys/types.h
diff --git a/e2fsck/Makefile.in b/e2fsck/Makefile.in
index a9c377d..8d7e769 100644
--- a/e2fsck/Makefile.in
+++ b/e2fsck/Makefile.in
@@ -62,7 +62,7 @@ OBJS= dict.o unix.o e2fsck.o super.o pass1.o pass1b.o pass2.o \
 	pass3.o pass4.o pass5.o journal.o badblocks.o util.o dirinfo.o \
 	dx_dirinfo.o ehandler.o problem.o message.o quota.o recovery.o \
 	region.o revoke.o ea_refcount.o rehash.o profile.o prof_err.o \
-	logfile.o sigcatcher.o $(MTRACE_OBJ)
+	logfile.o sigcatcher.o readahead.o $(MTRACE_OBJ)
 
 PROFILED_OBJS= profiled/dict.o profiled/unix.o profiled/e2fsck.o \
 	profiled/super.o profiled/pass1.o profiled/pass1b.o \
@@ -73,7 +73,7 @@ PROFILED_OBJS= profiled/dict.o profiled/unix.o profiled/e2fsck.o \
 	profiled/recovery.o profiled/region.o profiled/revoke.o \
 	profiled/ea_refcount.o profiled/rehash.o profiled/profile.o \
 	profiled/prof_err.o profiled/logfile.o \
-	profiled/sigcatcher.o
+	profiled/sigcatcher.o profiled/readahead.o
 
 SRCS= $(srcdir)/e2fsck.c \
 	$(srcdir)/dict.c \
@@ -97,6 +97,7 @@ SRCS= $(srcdir)/e2fsck.c \
 	$(srcdir)/message.c \
 	$(srcdir)/ea_refcount.c \
 	$(srcdir)/rehash.c \
+	$(srcdir)/readahead.c \
 	$(srcdir)/region.c \
 	$(srcdir)/profile.c \
 	$(srcdir)/sigcatcher.c \
@@ -525,3 +526,6 @@ quota.o: $(srcdir)/quota.c $(top_builddir)/lib/config.h \
  $(srcdir)/profile.h prof_err.h $(top_srcdir)/lib/quota/quotaio.h \
  $(top_srcdir)/lib/quota/dqblk_v2.h $(top_srcdir)/lib/quota/quotaio_tree.h \
  $(top_srcdir)/lib/../e2fsck/dict.h $(srcdir)/problem.h
+readahead.o: $(srcdir)/readahead.c $(top_builddir)/lib/config.h \
+ $(top_srcdir)/lib/ext2fs/ext2fs.h $(top_srcdir)/lib/ext2fs/ext2_fs.h \
+ $(top_builddir)/lib/ext2fs/ext2_err.h $(srcdir)/e2fsck.h prof_err.h
diff --git a/e2fsck/e2fsck.h b/e2fsck/e2fsck.h
index 6ca3a6a..8837af9 100644
--- a/e2fsck/e2fsck.h
+++ b/e2fsck/e2fsck.h
@@ -490,6 +490,23 @@ extern ext2_ino_t e2fsck_get_lost_and_found(e2fsck_t ctx, int fix);
 extern errcode_t e2fsck_adjust_inode_count(e2fsck_t ctx, ext2_ino_t ino,
 					   int adj);
 
+/* readahead.c */
+#define E2FSCK_READA_SUPER	(0x01)
+#define E2FSCK_READA_GDT	(0x02)
+#define E2FSCK_READA_BBITMAP	(0x04)
+#define E2FSCK_READA_IBITMAP	(0x08)
+#define E2FSCK_READA_ITABLE	(0x10)
+#define E2FSCK_READA_ALL_FLAGS	(0x1F)
+errcode_t e2fsck_readahead(ext2_filsys fs, int flags, dgrp_t start,
+			   dgrp_t ngroups);
+#define E2FSCK_RA_DBLIST_IGNORE_BLOCKCNT	(0x01)
+#define E2FSCK_RA_DBLIST_ALL_FLAGS		(0x01)
+errcode_t e2fsck_readahead_dblist(ext2_filsys fs, int flags,
+				  ext2_dblist dblist,
+				  unsigned long long start,
+				  unsigned long long count);
+int e2fsck_can_readahead(ext2_filsys fs);
+unsigned long long e2fsck_guess_readahead(ext2_filsys fs);
 
 /* region.c */
 extern region_t region_create(region_addr_t min, region_addr_t max);
@@ -579,6 +596,7 @@ extern errcode_t e2fsck_allocate_subcluster_bitmap(ext2_filsys fs,
 						   int default_type,
 						   const char *profile_name,
 						   ext2fs_block_bitmap *ret);
+unsigned long long get_memory_size(void);
 
 /* unix.c */
 extern void e2fsck_clear_progbar(e2fsck_t ctx);
diff --git a/e2fsck/readahead.c b/e2fsck/readahead.c
new file mode 100644
index 0000000..a35f9f8
--- /dev/null
+++ b/e2fsck/readahead.c
@@ -0,0 +1,252 @@
+/*
+ * readahead.c -- Prefetch filesystem metadata to speed up fsck.
+ *
+ * Copyright (C) 2014 Oracle.
+ *
+ * %Begin-Header%
+ * This file may be redistributed under the terms of the GNU Library
+ * General Public License, version 2.
+ * %End-Header%
+ */
+
+#include "config.h"
+#include <string.h>
+
+#include "e2fsck.h"
+
+#undef DEBUG
+
+#ifdef DEBUG
+# define dbg_printf(f, a...)  do {printf(f, ## a); fflush(stdout); } while (0)
+#else
+# define dbg_printf(f, a...)
+#endif
+
+struct read_dblist {
+	errcode_t err;
+	blk64_t run_start;
+	blk64_t run_len;
+	int flags;
+};
+
+static int readahead_dir_block(ext2_filsys fs, struct ext2_db_entry2 *db,
+			       void *priv_data)
+{
+	struct read_dblist *pr = priv_data;
+	e2_blkcnt_t count = (pr->flags & E2FSCK_RA_DBLIST_IGNORE_BLOCKCNT ?
+			     1 : db->blockcnt);
+
+	if (!pr->run_len || db->blk != pr->run_start + pr->run_len) {
+		if (pr->run_len) {
+			pr->err = io_channel_cache_readahead(fs->io,
+							     pr->run_start,
+							     pr->run_len);
+			dbg_printf("readahead start=%llu len=%llu err=%d\n",
+				   pr->run_start, pr->run_len,
+				   (int)pr->err);
+		}
+		pr->run_start = db->blk;
+		pr->run_len = 0;
+	}
+	pr->run_len += count;
+
+	return pr->err ? DBLIST_ABORT : 0;
+}
+
+errcode_t e2fsck_readahead_dblist(ext2_filsys fs, int flags,
+				  ext2_dblist dblist,
+				  unsigned long long start,
+				  unsigned long long count)
+{
+	errcode_t err;
+	struct read_dblist pr;
+
+	dbg_printf("%s: flags=0x%x\n", __func__, flags);
+	if (flags & ~E2FSCK_RA_DBLIST_ALL_FLAGS)
+		return EXT2_ET_INVALID_ARGUMENT;
+
+	memset(&pr, 0, sizeof(pr));
+	pr.flags = flags;
+	err = ext2fs_dblist_iterate3(dblist, readahead_dir_block, start,
+				     count, &pr);
+	if (pr.err)
+		return pr.err;
+	if (err)
+		return err;
+
+	if (pr.run_len)
+		err = io_channel_cache_readahead(fs->io, pr.run_start,
+						 pr.run_len);
+
+	return err;
+}
+
+static errcode_t e2fsck_readahead_bitmap(ext2_filsys fs,
+					 ext2fs_block_bitmap ra_map)
+{
+	blk64_t start, end, out;
+	errcode_t err;
+
+	start = 1;
+	end = ext2fs_blocks_count(fs->super) - 1;
+
+	err = ext2fs_find_first_set_block_bitmap2(ra_map, start, end, &out);
+	while (err == 0) {
+		start = out;
+		err = ext2fs_find_first_zero_block_bitmap2(ra_map, start, end,
+							   &out);
+		if (err == ENOENT) {
+			out = end;
+			err = 0;
+		} else if (err)
+			break;
+
+		err = io_channel_cache_readahead(fs->io, start, out - start);
+		if (err)
+			break;
+		start = out;
+		err = ext2fs_find_first_set_block_bitmap2(ra_map, start, end,
+							  &out);
+	}
+
+	if (err == ENOENT)
+		err = 0;
+
+	return err;
+}
+
+/* Try not to spew bitmap range errors for readahead */
+static errcode_t mark_bmap_range(ext2_filsys fs, ext2fs_block_bitmap map,
+				 blk64_t blk, unsigned int num)
+{
+	if (blk >= ext2fs_get_generic_bmap_start(map) &&
+	    blk + num <= ext2fs_get_generic_bmap_end(map))
+		ext2fs_mark_block_bitmap_range2(map, blk, num);
+	else
+		return EXT2_ET_INVALID_ARGUMENT;
+	return 0;
+}
+
+static errcode_t mark_bmap(ext2_filsys fs, ext2fs_block_bitmap map, blk64_t blk)
+{
+	if (blk >= ext2fs_get_generic_bmap_start(map) &&
+	    blk <= ext2fs_get_generic_bmap_end(map))
+		ext2fs_mark_block_bitmap2(map, blk);
+	else
+		return EXT2_ET_INVALID_ARGUMENT;
+	return 0;
+}
+
+errcode_t e2fsck_readahead(ext2_filsys fs, int flags, dgrp_t start,
+			   dgrp_t ngroups)
+{
+	blk64_t		super, old_gdt, new_gdt;
+	blk_t		blocks;
+	dgrp_t		i;
+	ext2fs_block_bitmap		ra_map = NULL;
+	dgrp_t		end = start + ngroups;
+	errcode_t	err = 0;
+
+	dbg_printf("%s: flags=0x%x start=%d groups=%d\n", __func__, flags,
+		   start, ngroups);
+	if (flags & ~E2FSCK_READA_ALL_FLAGS)
+		return EXT2_ET_INVALID_ARGUMENT;
+
+	if (end > fs->group_desc_count)
+		end = fs->group_desc_count;
+
+	if (flags == 0)
+		return 0;
+
+	err = ext2fs_allocate_block_bitmap(fs, "readahead bitmap",
+					   &ra_map);
+	if (err)
+		return err;
+
+	for (i = start; i < end; i++) {
+		err = ext2fs_super_and_bgd_loc2(fs, i, &super, &old_gdt,
+						&new_gdt, &blocks);
+		if (err)
+			break;
+
+		if (flags & E2FSCK_READA_SUPER) {
+			err = mark_bmap(fs, ra_map, super);
+			if (err)
+				break;
+		}
+
+		if (flags & E2FSCK_READA_GDT) {
+			err = mark_bmap_range(fs, ra_map,
+					      old_gdt ? old_gdt : new_gdt,
+					      blocks);
+			if (err)
+				break;
+		}
+
+		if ((flags & E2FSCK_READA_BBITMAP) &&
+		    !ext2fs_bg_flags_test(fs, i, EXT2_BG_BLOCK_UNINIT) &&
+		    ext2fs_bg_free_blocks_count(fs, i) <
+				fs->super->s_blocks_per_group) {
+			super = ext2fs_block_bitmap_loc(fs, i);
+			err = mark_bmap(fs, ra_map, super);
+			if (err)
+				break;
+		}
+
+		if ((flags & E2FSCK_READA_IBITMAP) &&
+		    !ext2fs_bg_flags_test(fs, i, EXT2_BG_INODE_UNINIT) &&
+		    ext2fs_bg_free_inodes_count(fs, i) <
+				fs->super->s_inodes_per_group) {
+			super = ext2fs_inode_bitmap_loc(fs, i);
+			err = mark_bmap(fs, ra_map, super);
+			if (err)
+				break;
+		}
+
+		if ((flags & E2FSCK_READA_ITABLE) &&
+		    ext2fs_bg_free_inodes_count(fs, i) <
+				fs->super->s_inodes_per_group) {
+			super = ext2fs_inode_table_loc(fs, i);
+			blocks = fs->inode_blocks_per_group -
+				 (ext2fs_bg_itable_unused(fs, i) *
+				  EXT2_INODE_SIZE(fs->super) / fs->blocksize);
+			err = mark_bmap_range(fs, ra_map, super, blocks);
+			if (err)
+				break;
+		}
+	}
+
+	if (!err)
+		err = e2fsck_readahead_bitmap(fs, ra_map);
+
+	ext2fs_free_block_bitmap(ra_map);
+	return err;
+}
+
+int e2fsck_can_readahead(ext2_filsys fs)
+{
+	errcode_t err;
+
+	err = io_channel_cache_readahead(fs->io, 0, 1);
+	dbg_printf("%s: supp=%d\n", __func__, err != EXT2_ET_OP_NOT_SUPPORTED);
+	return err != EXT2_ET_OP_NOT_SUPPORTED;
+}
+
+unsigned long long e2fsck_guess_readahead(ext2_filsys fs)
+{
+	unsigned long long guess;
+
+	/*
+	 * The optimal readahead sizes were experimentally determined by
+	 * djwong in August 2014.  Setting the RA size to one block group's
+	 * worth of inode table blocks seems to yield the largest reductions
+	 * in e2fsck runtime.
+	 */
+	guess = fs->blocksize * fs->inode_blocks_per_group;
+
+	/* Disable RA if it'd use more than 1/100th of RAM. */
+	if (get_memory_size() > (guess * 100))
+		return guess / 1024;
+
+	return 0;
+}
diff --git a/e2fsck/util.c b/e2fsck/util.c
index 8237328..74f20062 100644
--- a/e2fsck/util.c
+++ b/e2fsck/util.c
@@ -37,6 +37,10 @@
 #include <errno.h>
 #endif
 
+#ifdef HAVE_SYS_SYSCTL_H
+#include <sys/sysctl.h>
+#endif
+
 #include "e2fsck.h"
 
 extern e2fsck_t e2fsck_global_ctx;   /* Try your very best not to use this! */
@@ -848,3 +852,50 @@ errcode_t e2fsck_allocate_subcluster_bitmap(ext2_filsys fs, const char *descr,
 	fs->default_bitmap_type = save_type;
 	return retval;
 }
+
+/* Return memory size in bytes */
+unsigned long long get_memory_size(void)
+{
+#if defined(_SC_PHYS_PAGES)
+# if defined(_SC_PAGESIZE)
+	return (unsigned long long)sysconf(_SC_PHYS_PAGES) *
+	       (unsigned long long)sysconf(_SC_PAGESIZE);
+# elif defined(_SC_PAGE_SIZE)
+	return (unsigned long long)sysconf(_SC_PHYS_PAGES) *
+	       (unsigned long long)sysconf(_SC_PAGE_SIZE);
+# endif
+#elif defined(CTL_HW)
+# if (defined(HW_MEMSIZE) || defined(HW_PHYSMEM64))
+#  define CTL_HW_INT64
+# elif (defined(HW_PHYSMEM) || defined(HW_REALMEM))
+#  define CTL_HW_UINT
+# endif
+	int mib[2];
+
+	mib[0] = CTL_HW;
+# if defined(HW_MEMSIZE)
+	mib[1] = HW_MEMSIZE;
+# elif defined(HW_PHYSMEM64)
+	mib[1] = HW_PHYSMEM64;
+# elif defined(HW_REALMEM)
+	mib[1] = HW_REALMEM;
+# elif defined(HW_PHYSMEM)
+	mib[1] = HW_PHYSMEM;
+# endif
+# if defined(CTL_HW_INT64)
+	unsigned long long size = 0;
+# elif defined(CTL_HW_UINT)
+	unsigned int size = 0;
+# endif
+# if defined(CTL_HW_INT64) || defined(CTL_HW_UINT)
+	size_t len = sizeof(size);
+
+	if (sysctl(mib, 2, &size, &len, NULL, 0) == 0)
+		return (unsigned long long)size;
+# endif
+	return 0;
+#else
+# warning "Don't know how to detect memory on your platform?"
+	return 0;
+#endif
+}
diff --git a/lib/config.h.in b/lib/config.h.in
index 4dcc966..be8f976 100644
--- a/lib/config.h.in
+++ b/lib/config.h.in
@@ -506,6 +506,9 @@
 /* Define to 1 if you have the <sys/syscall.h> header file. */
 #undef HAVE_SYS_SYSCALL_H
 
+/* Define to 1 if you have the <sys/sysctl.h> header file. */
+#undef HAVE_SYS_SYSCTL_H
+
 /* Define to 1 if you have the <sys/sysmacros.h> header file. */
 #undef HAVE_SYS_SYSMACROS_H
 
diff --git a/lib/ext2fs/dblist.c b/lib/ext2fs/dblist.c
index 942c4f0..bbdb221 100644
--- a/lib/ext2fs/dblist.c
+++ b/lib/ext2fs/dblist.c
@@ -194,20 +194,25 @@ void ext2fs_dblist_sort2(ext2_dblist dblist,
 /*
  * This function iterates over the directory block list
  */
-errcode_t ext2fs_dblist_iterate2(ext2_dblist dblist,
+errcode_t ext2fs_dblist_iterate3(ext2_dblist dblist,
 				 int (*func)(ext2_filsys fs,
 					     struct ext2_db_entry2 *db_info,
 					     void	*priv_data),
+				 unsigned long long start,
+				 unsigned long long count,
 				 void *priv_data)
 {
-	unsigned long long	i;
+	unsigned long long	i, end;
 	int		ret;
 
 	EXT2_CHECK_MAGIC(dblist, EXT2_ET_MAGIC_DBLIST);
 
+	end = start + count;
 	if (!dblist->sorted)
 		ext2fs_dblist_sort2(dblist, 0);
-	for (i=0; i < dblist->count; i++) {
+	if (end > dblist->count)
+		end = dblist->count;
+	for (i = start; i < end; i++) {
 		ret = (*func)(dblist->fs, &dblist->list[i], priv_data);
 		if (ret & DBLIST_ABORT)
 			return 0;
@@ -215,6 +220,16 @@ errcode_t ext2fs_dblist_iterate2(ext2_dblist dblist,
 	return 0;
 }
 
+errcode_t ext2fs_dblist_iterate2(ext2_dblist dblist,
+				 int (*func)(ext2_filsys fs,
+					     struct ext2_db_entry2 *db_info,
+					     void	*priv_data),
+				 void *priv_data)
+{
+	return ext2fs_dblist_iterate3(dblist, func, 0, dblist->count,
+				      priv_data);
+}
+
 static EXT2_QSORT_TYPE dir_block_cmp2(const void *a, const void *b)
 {
 	const struct ext2_db_entry2 *db_a =
diff --git a/lib/ext2fs/ext2fs.h b/lib/ext2fs/ext2fs.h
index d931fff..bba40ac 100644
--- a/lib/ext2fs/ext2fs.h
+++ b/lib/ext2fs/ext2fs.h
@@ -1055,11 +1055,17 @@ extern void ext2fs_dblist_sort2(ext2_dblist dblist,
 extern errcode_t ext2fs_dblist_iterate(ext2_dblist dblist,
 	int (*func)(ext2_filsys fs, struct ext2_db_entry *db_info,
 		    void	*priv_data),
-       void *priv_data);
+	void *priv_data);
 extern errcode_t ext2fs_dblist_iterate2(ext2_dblist dblist,
 	int (*func)(ext2_filsys fs, struct ext2_db_entry2 *db_info,
 		    void	*priv_data),
-       void *priv_data);
+	void *priv_data);
+extern errcode_t ext2fs_dblist_iterate3(ext2_dblist dblist,
+	int (*func)(ext2_filsys fs, struct ext2_db_entry2 *db_info,
+		    void	*priv_data),
+	unsigned long long start,
+	unsigned long long count,
+	void *priv_data);
 extern errcode_t ext2fs_set_dir_block(ext2_dblist dblist, ext2_ino_t ino,
 				      blk_t blk, int blockcnt);
 extern errcode_t ext2fs_set_dir_block2(ext2_dblist dblist, ext2_ino_t ino,



* [PATCH 09/34] e2fsck: read-ahead metadata during passes 1, 2, and 4
  2014-09-13 22:11 [PATCH 00/34] e2fsprogs Summer 2014 patchbomb, part 6 Darrick J. Wong
                   ` (7 preceding siblings ...)
  2014-09-13 22:12 ` [PATCH 08/34] libext2fs/e2fsck: provide routines to read-ahead metadata Darrick J. Wong
@ 2014-09-13 22:12 ` Darrick J. Wong
  2014-09-13 22:12 ` [PATCH 10/34] dumpe2fs: provide a machine-readable group-only mode Darrick J. Wong
                   ` (24 subsequent siblings)
  33 siblings, 0 replies; 67+ messages in thread
From: Darrick J. Wong @ 2014-09-13 22:12 UTC (permalink / raw)
  To: tytso, darrick.wong; +Cc: linux-ext4

e2fsck pass1 is modified to use the block group data prefetch function
to try to fetch the inode tables into the pagecache before they are
needed.  We iterate through the blockgroups until we have enough inode
tables that need reading such that we can issue readahead; then we
wait until the inode scan reaches the last inode buffer of the last
group in that batch before fetching the next bunch.
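
The group-batching step can be sketched on its own as follows.  This
is a simplified illustration, not the pass1 code: the function name
last_group_in_batch is hypothetical, and the per-group inode counts
are passed in as a plain array instead of being read from the group
descriptors.

```c
#include <assert.h>
#include <stddef.h>

/*
 * used_inodes[g] is the number of in-use inode slots in group g
 * (0 for a group whose inode table is uninitialized).  Returns the
 * index of the last group to include in the readahead batch that
 * begins at group `start`.
 */
static size_t last_group_in_batch(const unsigned int *used_inodes,
				  size_t ngroups, size_t start,
				  unsigned int inodes_per_block,
				  unsigned int blocksize,
				  unsigned long long readahead_kb)
{
	unsigned long long blocks = 0;
	size_t grp;

	assert(inodes_per_block > 0 && start < ngroups);

	for (grp = start; grp < ngroups; grp++) {
		if (used_inodes[grp] == 0)
			continue;	/* nothing to read for this group */
		/* Round up to whole inode table blocks. */
		blocks += (used_inodes[grp] + inodes_per_block - 1) /
			  inodes_per_block;
		if (blocks * blocksize > readahead_kb * 1024)
			break;		/* buffer is full; stop here */
	}
	return grp < ngroups ? grp : ngroups - 1;
}
```

Note that the group which overflows the buffer is still part of the
batch, so a batch may exceed the budget by up to one group's inode
table.  With 2MiB of inode table per group and a 2MiB budget, the
batch therefore covers two groups.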

pass2 is modified to use the dirblock prefetching function to prefetch
the list of directory blocks that are assembled in pass1.  We use the
"iterate a subset of a dblist" API to avoid copying the dblist.  Directory
blocks are fetched incrementally as we walk through the directory
block list.  In previous iterations of this patch we would free the
directory blocks after processing, but the performance hit to e2fsck
itself wasn't worth it.  Furthermore, it is anticipated that most
users will then mount the FS and start using the directories, so they
may as well remain in the page cache.
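
The incremental fetch is a sliding window over the dblist.  A minimal
sketch of just the bookkeeping, with no I/O, is below.  The struct and
function names are hypothetical, and the 1/8 and 7/8 fractions are
taken from check_dir_block2() in this patch.

```c
#include <assert.h>
#include <stddef.h>

struct ra_window {
	unsigned long long ra_entries;	/* list entries per readahead */
	unsigned long long next_ra_off;	/* offset that triggers the next one */
	unsigned long long issued;	/* number of readahead calls so far */
	unsigned long long last_start;	/* start offset of the last call */
};

/*
 * Call once per directory block, before checking the block at
 * list_offset.  Returns 1 if a readahead was issued for this step.
 */
static int ra_window_step(struct ra_window *w, unsigned long long list_offset)
{
	assert(w != NULL);

	if (!w->ra_entries || list_offset < w->next_ra_off)
		return 0;

	/* Prefetch ra_entries entries, starting 1/8 of a window ahead. */
	w->last_start = list_offset + w->ra_entries / 8;
	w->issued++;

	/* Re-arm once 7/8 of the window has been consumed. */
	w->next_ra_off = list_offset + w->ra_entries * 7 / 8;
	return 1;
}
```

With a 512-entry window, the first call at offset 0 requests entries
starting at 64 and re-arms at 448.  The next request fires at offset
448 and starts at entry 512, so the following window is already being
read while the tail of the current one is checked.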

pass4 is modified to prefetch the block and inode bitmaps in
anticipation of pass 5, because pass4 is entirely CPU bound.

In general, these mechanisms can decrease fsck time by 10-40%, if the
host system has sufficient memory and the storage system can provide a
lot of IOPs.  Pretty much any storage system capable of handling
multiple IOs in-flight at any time will see a fairly large performance
boost.  (Single-issue USB mass storage disks seem to suffer badly.)

By default, the readahead buffer size will be set to the size of a block
group's inode table (which is 2MiB for a regular ext4 FS).  The -E
readahead_kb= option can be given to specify the amount of memory to
use for readahead or zero to disable it entirely; or an option can be
given in e2fsck.conf.
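
The precedence between the command line and e2fsck.conf, as
implemented in PRS() in this patch, can be summarized by the sketch
below.  It is a standalone restatement for illustration: the function
name resolve_readahead_kb and the RA_UNSET macro are hypothetical, and
-1 is used to mean "not present in e2fsck.conf".

```c
#include <assert.h>

#define RA_UNSET (~0ULL)	/* sentinel: no value chosen yet */

/*
 * cmdline_kb: value of -E readahead_kb=, or RA_UNSET if not given.
 * conf_pct / conf_kb: e2fsck.conf values, or -1 if absent.
 * Returns the buffer size in KiB, or RA_UNSET to request the default
 * guess later on.
 */
static unsigned long long resolve_readahead_kb(unsigned long long cmdline_kb,
					       int conf_pct, int conf_kb,
					       unsigned long long phys_mem_kb)
{
	unsigned long long kb = cmdline_kb;

	if (kb != RA_UNSET)
		return kb;		/* command line wins outright */

	if (conf_pct >= 0 && conf_pct <= 100)
		kb = phys_mem_kb * conf_pct / 100;
	if (conf_kb >= 0)
		kb = conf_kb;		/* readahead_kb overrides the percentage */

	/* Never ask for more than physical memory. */
	if (kb != RA_UNSET && kb > phys_mem_kb)
		kb = phys_mem_kb;

	assert(kb == RA_UNSET || kb <= phys_mem_kb);
	return kb;
}
```

If neither source sets a value, the sentinel is returned and pass1
later replaces it with the size of one block group's inode table, or
with zero if readahead is unsupported.  Note that only values from
e2fsck.conf are clamped to physical memory; a command-line value is
taken as given.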

v2: Fix an off-by-one error in the pass1 readahead which made the
readahead trigger one inode too late if the block groups are full.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
 e2fsck/e2fsck.8.in      |    7 +++++
 e2fsck/e2fsck.conf.5.in |   15 +++++++++++
 e2fsck/e2fsck.h         |    3 ++
 e2fsck/pass1.c          |   65 +++++++++++++++++++++++++++++++++++++++++++++++
 e2fsck/pass2.c          |   38 +++++++++++++++++++++++++++
 e2fsck/pass4.c          |    9 +++++++
 e2fsck/unix.c           |   28 ++++++++++++++++++++
 lib/ext2fs/ext2fs.h     |    1 +
 lib/ext2fs/inode.c      |    3 +-
 9 files changed, 167 insertions(+), 2 deletions(-)


diff --git a/e2fsck/e2fsck.8.in b/e2fsck/e2fsck.8.in
index f5ed758..84ae50f 100644
--- a/e2fsck/e2fsck.8.in
+++ b/e2fsck/e2fsck.8.in
@@ -207,6 +207,13 @@ option may prevent you from further manual data recovery.
 .BI nodiscard
 Do not attempt to discard free blocks and unused inode blocks. This option is
 exactly the opposite of discard option. This is set as default.
+.TP
+.BI readahead_kb
+Use this many KiB of memory to pre-fetch metadata in the hopes of reducing
+e2fsck runtime.  By default, this is set to the size of a block group's inode
+table (typically 2MiB on a regular ext4 filesystem); if this amount is more
+than 1/100 of total physical memory, readahead is disabled.  Set this to zero
+to disable readahead entirely.
 .RE
 .TP
 .B \-f
diff --git a/e2fsck/e2fsck.conf.5.in b/e2fsck/e2fsck.conf.5.in
index 9ebfbbf..e1d0518 100644
--- a/e2fsck/e2fsck.conf.5.in
+++ b/e2fsck/e2fsck.conf.5.in
@@ -205,6 +205,21 @@ of that type are squelched.  This can be useful if the console is slow
 (i.e., connected to a serial port) and so a large amount of output could
 end up delaying the boot process for a long time (potentially hours).
 .TP
+.I readahead_mem_pct
+Use this percentage of memory to try to read in metadata blocks ahead of the
+main e2fsck thread.  This should reduce run times, depending on the speed of
+the underlying storage and the amount of free memory.  There is no default, but
+see
+.B readahead_kb
+for more details.
+.TP
+.I readahead_kb
+Use this amount of memory to read in metadata blocks ahead of the main checking
+thread.  Setting this value to zero disables readahead entirely.  By default,
+this is set to the size of one block group's inode table (typically 2MiB on a
+regular ext4 filesystem); if this amount is more than 1/100th of total physical
+memory, readahead is disabled.
+.TP
 .I report_features
 If this boolean relation is true, e2fsck will print the file system
 features as part of its verbose reporting (i.e., if the
diff --git a/e2fsck/e2fsck.h b/e2fsck/e2fsck.h
index 8837af9..b2654ef 100644
--- a/e2fsck/e2fsck.h
+++ b/e2fsck/e2fsck.h
@@ -378,6 +378,9 @@ struct e2fsck_struct {
 	 */
 	void *priv_data;
 	ext2fs_block_bitmap block_metadata_map; /* Metadata blocks */
+
+	/* How much are we allowed to readahead? */
+	unsigned long long readahead_kb;
 };
 
 /* Used by the region allocation code */
diff --git a/e2fsck/pass1.c b/e2fsck/pass1.c
index 04bb465..d4760ef 100644
--- a/e2fsck/pass1.c
+++ b/e2fsck/pass1.c
@@ -868,6 +868,60 @@ out:
 	return 0;
 }
 
+static void pass1_readahead(e2fsck_t ctx, dgrp_t *group, ext2_ino_t *next_ino)
+{
+	ext2_ino_t inodes_in_group = 0, inodes_per_block, inodes_per_buffer;
+	dgrp_t start = *group, grp;
+	blk64_t blocks_to_read = 0;
+	errcode_t err = EXT2_ET_INVALID_ARGUMENT;
+
+	if (ctx->readahead_kb == 0)
+		goto out;
+
+	/* Keep iterating groups until we have enough to readahead */
+	inodes_per_block = EXT2_INODES_PER_BLOCK(ctx->fs->super);
+	for (grp = start; grp < ctx->fs->group_desc_count; grp++) {
+		if (ext2fs_bg_flags_test(ctx->fs, grp, EXT2_BG_INODE_UNINIT))
+			continue;
+		inodes_in_group = ctx->fs->super->s_inodes_per_group -
+					ext2fs_bg_itable_unused(ctx->fs, grp);
+		blocks_to_read += (inodes_in_group + inodes_per_block - 1) /
+					inodes_per_block;
+		if (blocks_to_read * ctx->fs->blocksize >
+		    ctx->readahead_kb * 1024)
+			break;
+	}
+
+	err = e2fsck_readahead(ctx->fs, E2FSCK_READA_ITABLE, start,
+			       grp - start + 1);
+	if (err == EAGAIN) {
+		ctx->readahead_kb /= 2;
+		err = 0;
+	}
+
+out:
+	if (err) {
+		/* Error; disable itable readahead */
+		*group = ctx->fs->group_desc_count;
+		*next_ino = ctx->fs->super->s_inodes_count;
+	} else {
+		/*
+		 * Don't do more readahead until we've reached the first inode
+		 * of the last inode scan buffer block for the last group.
+		 */
+		*group = grp + 1;
+		inodes_per_buffer = (ctx->inode_buffer_blocks ?
+				     ctx->inode_buffer_blocks :
+				     EXT2_INODE_SCAN_DEFAULT_BUFFER_BLOCKS) *
+				    ctx->fs->blocksize /
+				    EXT2_INODE_SIZE(ctx->fs->super);
+		inodes_in_group--;
+		*next_ino = inodes_in_group -
+			    (inodes_in_group % inodes_per_buffer) + 1 +
+			    (grp * ctx->fs->super->s_inodes_per_group);
+	}
+}
+
 void e2fsck_pass1(e2fsck_t ctx)
 {
 	int	i;
@@ -890,10 +944,19 @@ void e2fsck_pass1(e2fsck_t ctx)
 	int		low_dtime_check = 1;
 	int		inode_size;
 	int		failed_csum = 0;
+	ext2_ino_t	ino_threshold = 0;
+	dgrp_t		ra_group = 0;
 
 	init_resource_track(&rtrack, ctx->fs->io);
 	clear_problem_context(&pctx);
 
+	/* If we can do readahead, figure out how many groups to pull in. */
+	if (!e2fsck_can_readahead(ctx->fs))
+		ctx->readahead_kb = 0;
+	else if (ctx->readahead_kb == ~0ULL)
+		ctx->readahead_kb = e2fsck_guess_readahead(ctx->fs);
+	pass1_readahead(ctx, &ra_group, &ino_threshold);
+
 	if (!(ctx->options & E2F_OPT_PREEN))
 		fix_problem(ctx, PR_1_PASS_HEADER, &pctx);
 
@@ -1073,6 +1136,8 @@ void e2fsck_pass1(e2fsck_t ctx)
 		old_op = ehandler_operation(_("getting next inode from scan"));
 		pctx.errcode = ext2fs_get_next_inode_full(scan, &ino,
 							  inode, inode_size);
+		if (ino > ino_threshold)
+			pass1_readahead(ctx, &ra_group, &ino_threshold);
 		ehandler_operation(old_op);
 		if (ctx->flags & E2F_FLAG_SIGNAL_MASK)
 			return;
diff --git a/e2fsck/pass2.c b/e2fsck/pass2.c
index 0b9c5c5..2060ed2 100644
--- a/e2fsck/pass2.c
+++ b/e2fsck/pass2.c
@@ -61,6 +61,9 @@
  * Keeps track of how many times an inode is referenced.
  */
 static void deallocate_inode(e2fsck_t ctx, ext2_ino_t ino, char* block_buf);
+static int check_dir_block2(ext2_filsys fs,
+			   struct ext2_db_entry2 *dir_blocks_info,
+			   void *priv_data);
 static int check_dir_block(ext2_filsys fs,
 			   struct ext2_db_entry2 *dir_blocks_info,
 			   void *priv_data);
@@ -77,6 +80,9 @@ struct check_dir_struct {
 	struct problem_context	pctx;
 	int	count, max;
 	e2fsck_t ctx;
+	unsigned long long list_offset;
+	unsigned long long ra_entries;
+	unsigned long long next_ra_off;
 };
 
 void e2fsck_pass2(e2fsck_t ctx)
@@ -96,6 +102,9 @@ void e2fsck_pass2(e2fsck_t ctx)
 	int			i, depth;
 	problem_t		code;
 	int			bad_dir;
+	int (*check_dir_func)(ext2_filsys fs,
+			      struct ext2_db_entry2 *dir_blocks_info,
+			      void *priv_data);
 
 	init_resource_track(&rtrack, ctx->fs->io);
 	clear_problem_context(&cd.pctx);
@@ -139,6 +148,9 @@ void e2fsck_pass2(e2fsck_t ctx)
 	cd.ctx = ctx;
 	cd.count = 1;
 	cd.max = ext2fs_dblist_count2(fs->dblist);
+	cd.list_offset = 0;
+	cd.ra_entries = ctx->readahead_kb * 1024 / ctx->fs->blocksize;
+	cd.next_ra_off = 0;
 
 	if (ctx->progress)
 		(void) (ctx->progress)(ctx, 2, 0, cd.max);
@@ -146,7 +158,8 @@ void e2fsck_pass2(e2fsck_t ctx)
 	if (fs->super->s_feature_compat & EXT2_FEATURE_COMPAT_DIR_INDEX)
 		ext2fs_dblist_sort2(fs->dblist, special_dir_block_cmp);
 
-	cd.pctx.errcode = ext2fs_dblist_iterate2(fs->dblist, check_dir_block,
+	check_dir_func = cd.ra_entries ? check_dir_block2 : check_dir_block;
+	cd.pctx.errcode = ext2fs_dblist_iterate2(fs->dblist, check_dir_func,
 						 &cd);
 	if (ctx->flags & E2F_FLAG_SIGNAL_MASK || ctx->flags & E2F_FLAG_RESTART)
 		return;
@@ -824,6 +837,29 @@ err:
 	return retval;
 }
 
+static int check_dir_block2(ext2_filsys fs,
+			   struct ext2_db_entry2 *db,
+			   void *priv_data)
+{
+	int err;
+	struct check_dir_struct *cd = priv_data;
+
+	if (cd->ra_entries && cd->list_offset >= cd->next_ra_off) {
+		err = e2fsck_readahead_dblist(fs,
+					E2FSCK_RA_DBLIST_IGNORE_BLOCKCNT,
+					fs->dblist,
+					cd->list_offset + cd->ra_entries / 8,
+					cd->ra_entries);
+		if (err)
+			cd->ra_entries = 0;
+		cd->next_ra_off = cd->list_offset + (cd->ra_entries * 7 / 8);
+	}
+
+	err = check_dir_block(fs, db, priv_data);
+	cd->list_offset++;
+	return err;
+}
+
 static int check_dir_block(ext2_filsys fs,
 			   struct ext2_db_entry2 *db,
 			   void *priv_data)
diff --git a/e2fsck/pass4.c b/e2fsck/pass4.c
index 21d93f0..bc9a2c4 100644
--- a/e2fsck/pass4.c
+++ b/e2fsck/pass4.c
@@ -106,6 +106,15 @@ void e2fsck_pass4(e2fsck_t ctx)
 #ifdef MTRACE
 	mtrace_print("Pass 4");
 #endif
+	/*
+	 * Since pass4 is mostly CPU bound, start readahead of bitmaps
+	 * ahead of pass 5 if we haven't already loaded them.
+	 */
+	if (ctx->readahead_kb &&
+	    (fs->block_map == NULL || fs->inode_map == NULL))
+		e2fsck_readahead(fs, E2FSCK_READA_BBITMAP |
+				     E2FSCK_READA_IBITMAP,
+				 0, fs->group_desc_count);
 
 	clear_problem_context(&pctx);
 
diff --git a/e2fsck/unix.c b/e2fsck/unix.c
index 1a089a9..6b0ca96 100644
--- a/e2fsck/unix.c
+++ b/e2fsck/unix.c
@@ -649,6 +649,7 @@ static void parse_extended_opts(e2fsck_t ctx, const char *opts)
 	char	*buf, *token, *next, *p, *arg;
 	int	ea_ver;
 	int	extended_usage = 0;
+	unsigned long long reada_kb;
 
 	buf = string_copy(ctx, opts, 0);
 	for (token = buf; token && *token; token = next) {
@@ -677,6 +678,15 @@ static void parse_extended_opts(e2fsck_t ctx, const char *opts)
 				continue;
 			}
 			ctx->ext_attr_ver = ea_ver;
+		} else if (strcmp(token, "readahead_kb") == 0) {
+			reada_kb = strtoull(arg, &p, 0);
+			if (*p) {
+				fprintf(stderr, "%s",
+					_("Invalid readahead buffer size.\n"));
+				extended_usage++;
+				continue;
+			}
+			ctx->readahead_kb = reada_kb;
 		} else if (strcmp(token, "fragcheck") == 0) {
 			ctx->options |= E2F_OPT_FRAGCHECK;
 			continue;
@@ -716,6 +726,7 @@ static void parse_extended_opts(e2fsck_t ctx, const char *opts)
 		fputs(("\tjournal_only\n"), stderr);
 		fputs(("\tdiscard\n"), stderr);
 		fputs(("\tnodiscard\n"), stderr);
+		fputs(("\treadahead_kb=<buffer size>\n"), stderr);
 		fputc('\n', stderr);
 		exit(1);
 	}
@@ -749,6 +760,7 @@ static errcode_t PRS(int argc, char *argv[], e2fsck_t *ret_ctx)
 #ifdef CONFIG_JBD_DEBUG
 	char 		*jbd_debug;
 #endif
+	unsigned long long phys_mem_kb;
 
 	retval = e2fsck_allocate_context(&ctx);
 	if (retval)
@@ -776,6 +788,8 @@ static errcode_t PRS(int argc, char *argv[], e2fsck_t *ret_ctx)
 	else
 		ctx->program_name = "e2fsck";
 
+	phys_mem_kb = get_memory_size() / 1024;
+	ctx->readahead_kb = ~0ULL;
 	while ((c = getopt (argc, argv, "panyrcC:B:dE:fvtFVM:b:I:j:P:l:L:N:SsDk")) != EOF)
 		switch (c) {
 		case 'C':
@@ -960,6 +974,20 @@ static errcode_t PRS(int argc, char *argv[], e2fsck_t *ret_ctx)
 	if (c)
 		verbose = 1;
 
+	if (ctx->readahead_kb == ~0ULL) {
+		profile_get_integer(ctx->profile, "options",
+				    "readahead_mem_pct", 0, -1, &c);
+		if (c >= 0 && c <= 100)
+			ctx->readahead_kb = phys_mem_kb * c / 100;
+		profile_get_integer(ctx->profile, "options",
+				    "readahead_kb", 0, -1, &c);
+		if (c >= 0)
+			ctx->readahead_kb = c;
+		if (ctx->readahead_kb != ~0ULL &&
+		    ctx->readahead_kb > phys_mem_kb)
+			ctx->readahead_kb = phys_mem_kb;
+	}
+
 	/* Turn off discard in read-only mode */
 	if ((ctx->options & E2F_OPT_NO) &&
 	    (ctx->options & E2F_OPT_DISCARD))
diff --git a/lib/ext2fs/ext2fs.h b/lib/ext2fs/ext2fs.h
index bba40ac..fe82a32 100644
--- a/lib/ext2fs/ext2fs.h
+++ b/lib/ext2fs/ext2fs.h
@@ -1421,6 +1421,7 @@ extern errcode_t ext2fs_get_next_inode_full(ext2_inode_scan scan,
 					    ext2_ino_t *ino,
 					    struct ext2_inode *inode,
 					    int bufsize);
+#define EXT2_INODE_SCAN_DEFAULT_BUFFER_BLOCKS	8
 extern errcode_t ext2fs_open_inode_scan(ext2_filsys fs, int buffer_blocks,
 				  ext2_inode_scan *ret_scan);
 extern void ext2fs_close_inode_scan(ext2_inode_scan scan);
diff --git a/lib/ext2fs/inode.c b/lib/ext2fs/inode.c
index 4310b82..4b3e14e 100644
--- a/lib/ext2fs/inode.c
+++ b/lib/ext2fs/inode.c
@@ -175,7 +175,8 @@ errcode_t ext2fs_open_inode_scan(ext2_filsys fs, int buffer_blocks,
 	scan->bytes_left = 0;
 	scan->current_group = 0;
 	scan->groups_left = fs->group_desc_count - 1;
-	scan->inode_buffer_blocks = buffer_blocks ? buffer_blocks : 8;
+	scan->inode_buffer_blocks = buffer_blocks ? buffer_blocks :
+				    EXT2_INODE_SCAN_DEFAULT_BUFFER_BLOCKS;
 	scan->current_block = ext2fs_inode_table_loc(scan->fs,
 						     scan->current_group);
 	scan->inodes_left = EXT2_INODES_PER_GROUP(scan->fs->super);
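
Editor's note: the order in which the unix.c hunk above settles on a readahead buffer size is easy to misread in diff form. The following is a minimal sketch of that precedence, not code from the patch. The helper name resolve_readahead_kb, its parameters, and the READA_UNSET macro are hypothetical; only the ~0ULL sentinel, the profile key names, and the order of the checks come from the hunk itself.

```c
#include <assert.h>
#include <stdio.h>

#define READA_UNSET (~0ULL)	/* sentinel the patch stores in ctx->readahead_kb */

/*
 * Hypothetical helper mirroring the order of checks added to PRS():
 *   cli_kb       value from -E readahead_kb=..., or READA_UNSET
 *   mem_pct      profile "readahead_mem_pct", or -1 if absent
 *   profile_kb   profile "readahead_kb", or -1 if absent
 *   phys_mem_kb  physical memory size in KB
 */
unsigned long long resolve_readahead_kb(unsigned long long cli_kb,
					int mem_pct, int profile_kb,
					unsigned long long phys_mem_kb)
{
	unsigned long long kb = cli_kb;

	if (kb != READA_UNSET)
		return kb;		/* command line wins and is not clamped */
	if (mem_pct >= 0 && mem_pct <= 100)
		kb = phys_mem_kb * mem_pct / 100;
	if (profile_kb >= 0)
		kb = profile_kb;	/* absolute size overrides the percentage */
	if (kb != READA_UNSET && kb > phys_mem_kb)
		kb = phys_mem_kb;	/* profile values never exceed physical RAM */
	return kb;
}

/* Usage sketch: a 50% profile setting on a machine with 8 GiB of RAM. */
void readahead_example(void)
{
	unsigned long long kb;

	kb = resolve_readahead_kb(READA_UNSET, 50, -1, 8388608ULL);
	assert(kb == 4194304ULL);
	printf("readahead buffer: %llu KB\n", kb);
}
```

If neither the command line nor the profile supplies a value, the sentinel is returned unchanged, which matches the hunk leaving ctx->readahead_kb at ~0ULL for later code to interpret.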



* [PATCH 10/34] dumpe2fs: provide a machine-readable group-only mode
  2014-09-13 22:11 [PATCH 00/34] e2fsprogs Summer 2014 patchbomb, part 6 Darrick J. Wong
                   ` (8 preceding siblings ...)
  2014-09-13 22:12 ` [PATCH 09/34] e2fsck: read-ahead metadata during passes 1, 2, and 4 Darrick J. Wong
@ 2014-09-13 22:12 ` Darrick J. Wong
  2014-09-19 16:17   ` Theodore Ts'o
  2014-09-13 22:12 ` [PATCH 11/34] dumpe2fs: output cleanup Darrick J. Wong
                   ` (23 subsequent siblings)
  33 siblings, 1 reply; 67+ messages in thread
From: Darrick J. Wong @ 2014-09-13 22:12 UTC (permalink / raw)
  To: tytso, darrick.wong; +Cc: linux-ext4

Add a -g option that prints only the group descriptor data, one
colon-separated line per group, in a machine-readable format.
This is most useful for testing and scripting purposes.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
 misc/dumpe2fs.8.in                 |   10 ++++++-
 misc/dumpe2fs.c                    |   40 +++++++++++++++++++++++++---
 tests/d_dumpe2fs_group_only/expect |   51 ++++++++++++++++++++++++++++++++++++
 tests/d_dumpe2fs_group_only/name   |    1 +
 tests/d_dumpe2fs_group_only/script |   43 ++++++++++++++++++++++++++++++
 5 files changed, 139 insertions(+), 6 deletions(-)
 create mode 100644 tests/d_dumpe2fs_group_only/expect
 create mode 100644 tests/d_dumpe2fs_group_only/name
 create mode 100644 tests/d_dumpe2fs_group_only/script


diff --git a/misc/dumpe2fs.8.in b/misc/dumpe2fs.8.in
index befaf94..8d9a559 100644
--- a/misc/dumpe2fs.8.in
+++ b/misc/dumpe2fs.8.in
@@ -8,7 +8,7 @@ dumpe2fs \- dump ext2/ext3/ext4 filesystem information
 .SH SYNOPSIS
 .B dumpe2fs
 [
-.B \-bfhixV
+.B \-bfghixV
 ]
 [
 .B \-o superblock=\fIsuperblock
@@ -49,6 +49,14 @@ is examining the remains of a very badly corrupted filesystem.
 force dumpe2fs to display a filesystem even though it may have some 
 filesystem feature flags which dumpe2fs may not understand (and which
 can cause some of dumpe2fs's display to be suspect).
+.TP
+.B \-g
+display the group descriptor information in a machine readable colon-separated
+value format.  The fields displayed are the group number; the number of the
+first block in the group; the superblock location (or -1 if not present); the
+range of blocks used by the group descriptors (or -1 if not present); the block
+bitmap location; the inode bitmap location; and the location of the first
+inode table block.
 .TP 
 .B \-h
 only display the superblock information and not any of the block
diff --git a/misc/dumpe2fs.c b/misc/dumpe2fs.c
index 4c7bf46..05dc3c5 100644
--- a/misc/dumpe2fs.c
+++ b/misc/dumpe2fs.c
@@ -52,9 +52,9 @@ static int blocks64 = 0;
 
 static void usage(void)
 {
-	fprintf (stderr, _("Usage: %s [-bfhixV] [-o superblock=<num>] "
+	fprintf(stderr, _("Usage: %s [-bfghixV] [-o superblock=<num>] "
 		 "[-o blocksize=<num>] device\n"), program_name);
-	exit (1);
+	exit(1);
 }
 
 static void print_number(unsigned long long num)
@@ -150,7 +150,7 @@ static void print_bg_rel_offset(ext2_filsys fs, blk64_t block, int itable,
 	}
 }
 
-static void list_desc (ext2_filsys fs)
+static void list_desc(ext2_filsys fs, int grp_only)
 {
 	unsigned long i;
 	blk64_t	first_block, last_block;
@@ -187,6 +187,8 @@ static void list_desc (ext2_filsys fs)
 		old_desc_blocks = fs->super->s_first_meta_bg;
 	else
 		old_desc_blocks = fs->desc_blocks;
+	if (grp_only)
+		printf("group:block:super:gdt:bbitmap:ibitmap:itable\n");
 	for (i = 0; i < fs->group_desc_count; i++) {
 		first_block = ext2fs_group_first_block2(fs, i);
 		last_block = ext2fs_group_last_block2(fs, i);
@@ -194,6 +196,27 @@ static void list_desc (ext2_filsys fs)
 		ext2fs_super_and_bgd_loc2(fs, i, &super_blk,
 					  &old_desc_blk, &new_desc_blk, 0);
 
+		if (grp_only) {
+			printf("%lu:%llu:", i, first_block);
+			if (i == 0 || super_blk)
+				printf("%llu:", super_blk);
+			else
+				printf("-1:");
+			if (old_desc_blk) {
+				print_range(old_desc_blk,
+					    old_desc_blk + old_desc_blocks - 1);
+				printf(":");
+			} else if (new_desc_blk)
+				printf("%llu:", new_desc_blk);
+			else
+				printf("-1:");
+			printf("%llu:%llu:%llu\n",
+			       ext2fs_block_bitmap_loc(fs, i),
+			       ext2fs_inode_bitmap_loc(fs, i),
+			       ext2fs_inode_table_loc(fs, i));
+			continue;
+		}
+
 		printf (_("Group %lu: (Blocks "), i);
 		print_range(first_block, last_block);
 		fputs(")", stdout);
@@ -584,6 +607,7 @@ int main (int argc, char ** argv)
 	int		flags;
 	int		header_only = 0;
 	int		c;
+	int		grp_only = 0;
 
 #ifdef ENABLE_NLS
 	setlocale(LC_MESSAGES, "");
@@ -598,7 +622,7 @@ int main (int argc, char ** argv)
 	if (argc && *argv)
 		program_name = *argv;
 
-	while ((c = getopt (argc, argv, "bfhixVo:")) != EOF) {
+	while ((c = getopt(argc, argv, "bfghixVo:")) != EOF) {
 		switch (c) {
 		case 'b':
 			print_badblocks++;
@@ -606,6 +630,9 @@ int main (int argc, char ** argv)
 		case 'f':
 			force++;
 			break;
+		case 'g':
+			grp_only++;
+			break;
 		case 'h':
 			header_only++;
 			break;
@@ -672,6 +699,8 @@ try_open_again:
 	if (print_badblocks) {
 		list_bad_blocks(fs, 1);
 	} else {
+		if (grp_only)
+			goto just_descriptors;
 		list_super (fs->super);
 		if (fs->super->s_feature_incompat &
 		      EXT3_FEATURE_INCOMPAT_JOURNAL_DEV) {
@@ -697,7 +726,8 @@ try_bitmaps_again:
 		}
 		if (!retval && (fs->flags & EXT2_FLAG_IGNORE_CSUM_ERRORS))
 			printf("%s", _("\n*** Checksum errors detected in bitmaps!  Run e2fsck now!\n\n"));
-		list_desc (fs);
+just_descriptors:
+		list_desc(fs, grp_only);
 		if (retval) {
 			printf(_("\n%s: %s: error reading bitmaps: %s\n"),
 			       program_name, device_name,
diff --git a/tests/d_dumpe2fs_group_only/expect b/tests/d_dumpe2fs_group_only/expect
new file mode 100644
index 0000000..78f97a2
--- /dev/null
+++ b/tests/d_dumpe2fs_group_only/expect
@@ -0,0 +1,51 @@
+Creating filesystem with 1048576 4k blocks and 262144 inodes
+Superblock backups stored on blocks: 
+	32768, 98304, 163840, 229376, 294912, 819200, 884736
+
+Allocating group tables:      \b\b\b\b\bdone                            
+Writing inode tables:      \b\b\b\b\bdone                            
+Creating journal (32768 blocks): done
+Writing superblocks and filesystem accounting information:      \b\b\b\b\bdone
+
+Pass 1: Checking inodes, blocks, and sizes
+Pass 2: Checking directory structure
+Pass 3: Checking directory connectivity
+Pass 4: Checking reference counts
+Pass 5: Checking group summary information
+test_filesys: 11/262144 files (0.0% non-contiguous), 51278/1048576 blocks
+Exit status is 0
+dumpe2fs output
+
+group:block:super:gdt:bbitmap:ibitmap:itable
+0:0:0:1-1:257:273:289
+1:32768:32768:32769-32769:258:274:801
+2:65536:-1:-1:259:275:1313
+3:98304:98304:98305-98305:260:276:1825
+4:131072:-1:-1:261:277:2337
+5:163840:163840:163841-163841:262:278:2849
+6:196608:-1:-1:263:279:3361
+7:229376:229376:229377-229377:264:280:3873
+8:262144:-1:-1:265:281:4385
+9:294912:294912:294913-294913:266:282:4897
+10:327680:-1:-1:267:283:5409
+11:360448:-1:-1:268:284:5921
+12:393216:-1:-1:269:285:6433
+13:425984:-1:-1:270:286:6945
+14:458752:-1:-1:271:287:7457
+15:491520:-1:-1:272:288:7969
+16:524288:-1:-1:524288:524304:524320
+17:557056:-1:-1:524289:524305:524832
+18:589824:-1:-1:524290:524306:525344
+19:622592:-1:-1:524291:524307:525856
+20:655360:-1:-1:524292:524308:526368
+21:688128:-1:-1:524293:524309:526880
+22:720896:-1:-1:524294:524310:527392
+23:753664:-1:-1:524295:524311:527904
+24:786432:-1:-1:524296:524312:528416
+25:819200:819200:819201-819201:524297:524313:528928
+26:851968:-1:-1:524298:524314:529440
+27:884736:884736:884737-884737:524299:524315:529952
+28:917504:-1:-1:524300:524316:530464
+29:950272:-1:-1:524301:524317:530976
+30:983040:-1:-1:524302:524318:531488
+31:1015808:-1:-1:524303:524319:532000
diff --git a/tests/d_dumpe2fs_group_only/name b/tests/d_dumpe2fs_group_only/name
new file mode 100644
index 0000000..096c020
--- /dev/null
+++ b/tests/d_dumpe2fs_group_only/name
@@ -0,0 +1 @@
+dumpe2fs group only mode
diff --git a/tests/d_dumpe2fs_group_only/script b/tests/d_dumpe2fs_group_only/script
new file mode 100644
index 0000000..127502f
--- /dev/null
+++ b/tests/d_dumpe2fs_group_only/script
@@ -0,0 +1,43 @@
+if test -x $DEBUGFS_EXE; then
+
+FSCK_OPT=-fy
+OUT=$test_name.log
+if [ -f $test_dir/expect.gz ]; then
+	EXP=$test_name.tmp
+	gunzip < $test_dir/expect.gz > $EXP
+else
+	EXP=$test_dir/expect
+fi
+
+cp /dev/null $OUT
+
+$MKE2FS -F -o Linux -b 4096 -O has_journal -T ext4 $TMPFILE 1048576 2>&1 | sed -f $cmd_dir/filter.sed >> $OUT 2>&1
+
+$FSCK -fy -N test_filesys $TMPFILE > $OUT.new 2>&1
+status=$?
+echo Exit status is $status >> $OUT.new
+sed -f $cmd_dir/filter.sed -e "s;$TMPFILE;test.img;" $OUT.new >> $OUT
+rm -f $OUT.new
+
+echo "dumpe2fs output" >> $OUT
+$DUMPE2FS -g $TMPFILE 2>&1 | sed -f $cmd_dir/filter.sed >> $OUT
+
+rm -f $TMPFILE
+
+cmp -s $OUT $EXP
+status=$?
+
+if [ "$status" = 0 ] ; then
+	echo "$test_name: $test_description: ok"
+	touch $test_name.ok
+else
+	echo "$test_name: $test_description: failed"
+	diff $DIFF_OPTS $EXP $OUT > $test_name.failed
+	rm -f $test_name.tmp
+fi
+
+unset IMAGE FSCK_OPT OUT EXP
+
+else #if test -x $DEBUGFS_EXE; then
+	echo "$test_name: $test_description: skipped"
+fi
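
Editor's note: since the -g output is meant for scripts, here is a minimal sketch of a consumer for one output line, based only on the field order in the header line and the samples in the expect file above. The struct grp_line type and the parse_grp_line function are hypothetical names, not part of e2fsprogs. The sketch assumes decimal output (no -x) and block numbers that fit in a signed 64-bit integer.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical record for one line of "dumpe2fs -g" output. */
struct grp_line {
	unsigned long group;
	long long first_block;
	long long super;		/* -1 if the group has no superblock */
	long long gdt_start, gdt_end;	/* both -1 if no descriptors */
	long long bbitmap, ibitmap, itable;
};

/* Returns 0 on success, -1 if the line is not a seven-field record. */
int parse_grp_line(const char *line, struct grp_line *g)
{
	char gdt[64];
	int n;

	assert(line && g);
	n = sscanf(line, "%lu:%lld:%lld:%63[^:]:%lld:%lld:%lld",
		   &g->group, &g->first_block, &g->super, gdt,
		   &g->bbitmap, &g->ibitmap, &g->itable);
	if (n != 7)
		return -1;	/* this also rejects the header line */

	/* The gdt field is "-1", a single block, or a "start-end" range. */
	n = sscanf(gdt, "%lld-%lld", &g->gdt_start, &g->gdt_end);
	if (n == 1)
		g->gdt_end = g->gdt_start;
	else if (n != 2)
		return -1;
	return 0;
}
```

Fed the line "1:32768:32768:32769-32769:258:274:801" from the expect file, the parser fills in group 1 with a one-block descriptor range starting at 32769 and an inode table at block 801.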



* [PATCH 11/34] dumpe2fs: output cleanup
  2014-09-13 22:11 [PATCH 00/34] e2fsprogs Summer 2014 patchbomb, part 6 Darrick J. Wong
                   ` (9 preceding siblings ...)
  2014-09-13 22:12 ` [PATCH 10/34] dumpe2fs: provide a machine-readable group-only mode Darrick J. Wong
@ 2014-09-13 22:12 ` Darrick J. Wong
  2014-09-19 16:22   ` Theodore Ts'o
  2014-09-13 22:12 ` [PATCH 12/34] misc: move check_plausibility into a separate file Darrick J. Wong
                   ` (22 subsequent siblings)
  33 siblings, 1 reply; 67+ messages in thread
From: Darrick J. Wong @ 2014-09-13 22:12 UTC (permalink / raw)
  To: tytso, darrick.wong; +Cc: linux-ext4, TR Reardon

Don't display unused inodes twice, and make it clear that we're
printing a descriptor checksum.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Cc: TR Reardon <thomas_reardon@hotmail.com>
---
 misc/dumpe2fs.c  |    8 +++-----
 tests/filter.sed |    3 ++-
 2 files changed, 5 insertions(+), 6 deletions(-)


diff --git a/misc/dumpe2fs.c b/misc/dumpe2fs.c
index 05dc3c5..39505a8 100644
--- a/misc/dumpe2fs.c
+++ b/misc/dumpe2fs.c
@@ -217,20 +217,18 @@ static void list_desc(ext2_filsys fs, int grp_only)
 			continue;
 		}
 
-		printf (_("Group %lu: (Blocks "), i);
+		printf(_("Group %lu: (Blocks "), i);
 		print_range(first_block, last_block);
 		fputs(")", stdout);
-		print_bg_opts(fs, i);
 		if (ext2fs_has_group_desc_csum(fs)) {
 			unsigned csum = ext2fs_bg_checksum(fs, i);
 			unsigned exp_csum = ext2fs_group_desc_csum(fs, i);
 
-			printf(_("  Checksum 0x%04x"), csum);
+			printf(_(" csum 0x%04x"), csum);
 			if (csum != exp_csum)
 				printf(_(" (EXPECTED 0x%04x)"), exp_csum);
-			printf(_(", unused inodes %u\n"),
-			       ext2fs_bg_itable_unused(fs, i));
 		}
+		print_bg_opts(fs, i);
 		has_super = ((i==0) || super_blk);
 		if (has_super) {
 			printf (_("  %s superblock at "),
diff --git a/tests/filter.sed b/tests/filter.sed
index 59fad4e..d9a336c 100644
--- a/tests/filter.sed
+++ b/tests/filter.sed
@@ -21,4 +21,5 @@ s/\\015//g
 /Reserved blocks uid:/s/ (user .*)//
 /Reserved blocks gid:/s/ (group .*)//
 /whichever comes first/d
-/^  Checksum /d
+s/, csum 0x\([0-9a-f]*\)//g
+s/ csum 0x\([0-9a-f]*\)//g



* [PATCH 12/34] misc: move check_plausibility into a separate file
  2014-09-13 22:11 [PATCH 00/34] e2fsprogs Summer 2014 patchbomb, part 6 Darrick J. Wong
                   ` (10 preceding siblings ...)
  2014-09-13 22:12 ` [PATCH 11/34] dumpe2fs: output cleanup Darrick J. Wong
@ 2014-09-13 22:12 ` Darrick J. Wong
  2014-09-19 22:16   ` Theodore Ts'o
  2014-09-13 22:12 ` [PATCH 13/34] misc: add plausibility checks to debugfs/tune2fs/dumpe2fs/e2fsck Darrick J. Wong
                   ` (21 subsequent siblings)
  33 siblings, 1 reply; 67+ messages in thread
From: Darrick J. Wong @ 2014-09-13 22:12 UTC (permalink / raw)
  To: tytso, darrick.wong; +Cc: linux-ext4

Move check_plausibility() into a separate file so that various
programs can use it without having to declare useless global variables
that the util.c functions seem to require.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
 lib/ext2fs/Makefile.in |   11 ++
 misc/Makefile.in       |   20 +++-
 misc/mke2fs.c          |    1 
 misc/plausible.c       |  232 ++++++++++++++++++++++++++++++++++++++++++++++++
 misc/plausible.h       |   28 ++++++
 misc/tune2fs.c         |    1 
 misc/util.c            |  197 -----------------------------------------
 misc/util.h            |   11 --
 8 files changed, 285 insertions(+), 216 deletions(-)
 create mode 100644 misc/plausible.c
 create mode 100644 misc/plausible.h


diff --git a/lib/ext2fs/Makefile.in b/lib/ext2fs/Makefile.in
index 45e733c..343d5d0 100644
--- a/lib/ext2fs/Makefile.in
+++ b/lib/ext2fs/Makefile.in
@@ -23,7 +23,7 @@ DEBUG_OBJS= debug_cmds.o extent_cmds.o tst_cmds.o debugfs.o util.o \
 	ncheck.o icheck.o ls.o lsdel.o dump.o set_fields.o logdump.o \
 	htree.o unused.o e2freefrag.o filefrag.o extent_inode.o zap.o \
 	xattrs.o quota.o tst_libext2fs.o create_inode.o journal.o \
-	revoke.o recovery.o do_journal.o
+	revoke.o recovery.o do_journal.o plausible.o
 
 DEBUG_SRCS= debug_cmds.c extent_cmds.c tst_cmds.c \
 	$(top_srcdir)/debugfs/debugfs.c \
@@ -47,7 +47,8 @@ DEBUG_SRCS= debug_cmds.c extent_cmds.c tst_cmds.c \
 	$(top_srcdir)/debugfs/journal.c \
 	$(top_srcdir)/e2fsck/revoke.c \
 	$(top_srcdir)/e2fsck/recovery.c \
-	$(top_srcdir)/debugfs/do_journal.c
+	$(top_srcdir)/debugfs/do_journal.c \
+	$(top_srcdir)/misc/plausible.c
 
 OBJS= $(DEBUGFS_LIB_OBJS) $(RESIZE_LIB_OBJS) $(E2IMAGE_LIB_OBJS) \
 	$(TEST_IO_LIB_OBJS) \
@@ -412,6 +413,10 @@ recovery.o: $(top_srcdir)/e2fsck/recovery.c
 	$(E) "	CC $<"
 	$(Q) $(CC) $(DEBUGFS_CFLAGS) -c $< -o $@
 
+plausible.o: $(top_srcdir)/misc/plausible.c
+	$(E) "	CC $<"
+	$(Q) $(CC) $(ALL_CFLAGS) -c $< -o $@
+
 do_journal.o: $(top_srcdir)/debugfs/do_journal.c
 	$(E) "	CC $<"
 	$(Q) $(CC) $(DEBUGFS_CFLAGS) -c $< -o $@
@@ -464,7 +469,7 @@ tst_libext2fs: $(DEBUG_OBJS) \
 	$(E) "	LD $@"
 	$(Q) $(CC) -o tst_libext2fs $(ALL_LDFLAGS) -DDEBUG $(DEBUG_OBJS) \
 		$(STATIC_LIBSS) $(STATIC_LIBE2P) $(LIBQUOTA) \
-		$(STATIC_LIBEXT2FS) $(LIBBLKID) $(LIBUUID) \
+		$(STATIC_LIBEXT2FS) $(LIBBLKID) $(LIBUUID) $(LIBMAGIC) \
 		$(STATIC_LIBCOM_ERR) $(SYSLIBS) -I $(top_srcdir)/debugfs
 
 tst_inline: $(srcdir)/inline.c $(STATIC_LIBEXT2FS) $(DEPSTATIC_LIBCOM_ERR)
diff --git a/misc/Makefile.in b/misc/Makefile.in
index 925846e..e49078b 100644
--- a/misc/Makefile.in
+++ b/misc/Makefile.in
@@ -40,10 +40,10 @@ UMANPAGES=	chattr.1 lsattr.1 @UUID_CMT@ uuidgen.1
 
 LPROGS=		@E2INITRD_PROG@
 
-TUNE2FS_OBJS=	tune2fs.o util.o
+TUNE2FS_OBJS=	tune2fs.o util.o plausible.o
 MKLPF_OBJS=	mklost+found.o
 MKE2FS_OBJS=	mke2fs.o util.o profile.o prof_err.o default_profile.o \
-			mk_hugefiles.o create_inode.o
+			mk_hugefiles.o create_inode.o plausible.o
 CHATTR_OBJS=	chattr.o
 LSATTR_OBJS=	lsattr.o
 UUIDGEN_OBJS=	uuidgen.o
@@ -59,11 +59,12 @@ E4DEFRAG_OBJS=	e4defrag.o
 E2FREEFRAG_OBJS= e2freefrag.o
 E2FUZZ_OBJS=	e2fuzz.o
 
-PROFILED_TUNE2FS_OBJS=	profiled/tune2fs.o profiled/util.o
+PROFILED_TUNE2FS_OBJS=	profiled/tune2fs.o profiled/util.o profiled/plausible.o
 PROFILED_MKLPF_OBJS=	profiled/mklost+found.o
 PROFILED_MKE2FS_OBJS=	profiled/mke2fs.o profiled/util.o profiled/profile.o \
 			profiled/prof_err.o profiled/default_profile.o \
-			profiled/mk_hugefiles.o profiled/create_inode.o
+			profiled/mk_hugefiles.o profiled/create_inode.o \
+			profiled/plausible.o
 
 PROFILED_CHATTR_OBJS=	profiled/chattr.o
 PROFILED_LSATTR_OBJS=	profiled/lsattr.o
@@ -86,7 +87,8 @@ SRCS=	$(srcdir)/tune2fs.c $(srcdir)/mklost+found.c $(srcdir)/mke2fs.c $(srcdir)/
 		$(srcdir)/uuidgen.c $(srcdir)/blkid.c $(srcdir)/logsave.c \
 		$(srcdir)/filefrag.c $(srcdir)/base_device.c \
 		$(srcdir)/ismounted.c $(srcdir)/../e2fsck/profile.c \
-		$(srcdir)/e2undo.c $(srcdir)/e2freefrag.c $(srcdir)/create_inode.c
+		$(srcdir)/e2undo.c $(srcdir)/e2freefrag.c $(srcdir)/create_inode.c \
+		$(srcdir)/plausible.c
 
 LIBS= $(LIBEXT2FS) $(LIBCOM_ERR)
 DEPLIBS= $(LIBEXT2FS) $(DEPLIBCOM_ERR)
@@ -698,6 +700,14 @@ badblocks.o: $(srcdir)/badblocks.c $(top_builddir)/lib/config.h \
 fsck.o: $(srcdir)/fsck.c $(top_builddir)/lib/config.h \
  $(top_builddir)/lib/dirpaths.h $(top_srcdir)/version.h \
  $(srcdir)/nls-enable.h $(srcdir)/fsck.h
+plausible.o: $(srcdir)/plausible.c $(top_builddir)/lib/config.h \
+ $(top_builddir)/lib/dirpaths.h $(top_srcdir)/lib/et/com_err.h \
+ $(top_srcdir)/lib/e2p/e2p.h $(top_srcdir)/lib/ext2fs/ext2_fs.h \
+ $(top_builddir)/lib/ext2fs/ext2_types.h $(top_srcdir)/lib/ext2fs/ext2fs.h \
+ $(top_srcdir)/lib/ext2fs/ext3_extents.h $(top_srcdir)/lib/ext2fs/ext2_io.h \
+ $(top_builddir)/lib/ext2fs/ext2_err.h \
+ $(top_srcdir)/lib/ext2fs/ext2_ext_attr.h $(top_srcdir)/lib/ext2fs/bitops.h \
+ $(srcdir)/nls-enable.h $(srcdir)/plausible.h
 util.o: $(srcdir)/util.c $(top_builddir)/lib/config.h \
  $(top_builddir)/lib/dirpaths.h $(top_srcdir)/lib/et/com_err.h \
  $(top_srcdir)/lib/e2p/e2p.h $(top_srcdir)/lib/ext2fs/ext2_fs.h \
diff --git a/misc/mke2fs.c b/misc/mke2fs.c
index 2bc435b..3a963d7 100644
--- a/misc/mke2fs.c
+++ b/misc/mke2fs.c
@@ -52,6 +52,7 @@ extern int optind;
 #include "ext2fs/ext2fsP.h"
 #include "uuid/uuid.h"
 #include "util.h"
+#include "plausible.h"
 #include "profile.h"
 #include "prof_err.h"
 #include "../version.h"
diff --git a/misc/plausible.c b/misc/plausible.c
new file mode 100644
index 0000000..2768e4b
--- /dev/null
+++ b/misc/plausible.c
@@ -0,0 +1,232 @@
+/*
+ * plausible.c --- Figure out if a pathname is ext* or something else.
+ *
+ * Copyright 2014, Oracle, Inc.
+ *
+ * Some parts are:
+ * Copyright 1995, 1996, 1997, 1998, 1999, 2000 by Theodore Ts'o.
+ *
+ * %Begin-Header%
+ * This file may be redistributed under the terms of the GNU Public
+ * License.
+ * %End-Header%
+ */
+
+#define _LARGEFILE_SOURCE
+#define _LARGEFILE64_SOURCE
+
+#include "config.h"
+#include <fcntl.h>
+#include <time.h>
+#ifdef HAVE_LINUX_MAJOR_H
+#include <linux/major.h>
+#endif
+#include <sys/types.h>
+#ifdef HAVE_SYS_STAT_H
+#include <sys/stat.h>
+#endif
+#ifdef HAVE_UNISTD_H
+#include <unistd.h>
+#endif
+#include "plausible.h"
+#include "ext2fs/ext2fs.h"
+#include "nls-enable.h"
+#include "blkid/blkid.h"
+
+static void print_ext2_info(const char *device)
+
+{
+	struct ext2_super_block	*sb;
+	ext2_filsys		fs;
+	errcode_t		retval;
+	time_t			tm;
+	char			buf[80];
+
+	retval = ext2fs_open2(device, 0, EXT2_FLAG_64BITS, 0, 0,
+			      unix_io_manager, &fs);
+	if (retval)
+		return;
+	sb = fs->super;
+
+	if (sb->s_mtime) {
+		tm = sb->s_mtime;
+		if (sb->s_last_mounted[0]) {
+			memset(buf, 0, sizeof(buf));
+			strncpy(buf, sb->s_last_mounted,
+				sizeof(sb->s_last_mounted));
+			printf(_("\tlast mounted on %s on %s"), buf,
+			       ctime(&tm));
+		} else
+			printf(_("\tlast mounted on %s"), ctime(&tm));
+	} else if (sb->s_mkfs_time) {
+		tm = sb->s_mkfs_time;
+		printf(_("\tcreated on %s"), ctime(&tm));
+	} else if (sb->s_wtime) {
+		tm = sb->s_wtime;
+		printf(_("\tlast modified on %s"), ctime(&tm));
+	}
+	ext2fs_close_free(&fs);
+}
+
+/*
+ * return 1 if there is no partition table, 0 if a partition table is
+ * detected, and -1 on an error.
+ */
+static int check_partition_table(const char *device)
+{
+#ifdef HAVE_BLKID_PROBE_ENABLE_PARTITIONS
+	blkid_probe pr;
+	const char *value;
+	int ret;
+
+	pr = blkid_new_probe_from_filename(device);
+	if (!pr)
+		return -1;
+
+	ret = blkid_probe_enable_partitions(pr, 1);
+	if (ret < 0)
+		goto errout;
+
+	ret = blkid_probe_enable_superblocks(pr, 0);
+	if (ret < 0)
+		goto errout;
+
+	ret = blkid_do_fullprobe(pr);
+	if (ret < 0)
+		goto errout;
+
+	ret = blkid_probe_lookup_value(pr, "PTTYPE", &value, NULL);
+	if (ret == 0)
+		fprintf(stderr, _("Found a %s partition table in %s\n"),
+			value, device);
+	else
+		ret = 1;
+
+errout:
+	blkid_free_probe(pr);
+	return ret;
+#else
+	return -1;
+#endif
+}
+
+/*
+ * return 1 if the device looks plausible, creating the file if necessary
+ */
+int check_plausibility(const char *device, int flags, int *ret_is_dev)
+{
+	int fd, ret, is_dev = 0;
+	ext2fs_struct_stat s;
+	int fl = O_RDONLY;
+	blkid_cache cache = NULL;
+	char *fs_type = NULL;
+	char *fs_label = NULL;
+
+	fd = ext2fs_open_file(device, fl, 0666);
+	if ((fd < 0) && (errno == ENOENT) && (flags & NO_SIZE)) {
+		fprintf(stderr, _("The file %s does not exist and no "
+				  "size was specified.\n"), device);
+		exit(1);
+	}
+	if ((fd < 0) && (errno == ENOENT) && (flags & CREATE_FILE)) {
+		fl |= O_CREAT;
+		fd = ext2fs_open_file(device, fl, 0666);
+		if (fd >= 0 && (flags & VERBOSE_CREATE))
+			printf(_("Creating regular file %s\n"), device);
+	}
+	if (fd < 0) {
+		fprintf(stderr, _("Could not open %s: %s\n"),
+			device, error_message(errno));
+		if (errno == ENOENT)
+			fputs(_("\nThe device apparently does not exist; "
+				"did you specify it correctly?\n"), stderr);
+		exit(1);
+	}
+
+	if (ext2fs_fstat(fd, &s) < 0) {
+		perror("stat");
+		exit(1);
+	}
+	close(fd);
+
+	if (S_ISBLK(s.st_mode))
+		is_dev = 1;
+#if defined(__FreeBSD__) || defined(__FreeBSD_kernel__)
+	/* On FreeBSD, all disk devices are character specials */
+	if (S_ISCHR(s.st_mode))
+		is_dev = 1;
+#endif
+	if (ret_is_dev)
+		*ret_is_dev = is_dev;
+
+	if ((flags & CHECK_BLOCK_DEV) && !is_dev) {
+		printf(_("%s is not a block special device.\n"), device);
+		return 0;
+	}
+
+	/*
+	 * Note: we use the older-style blkid API's here because we
+	 * want as much functionality to be available when using the
+	 * internal blkid library, when e2fsprogs is compiled for
+	 * non-Linux systems that will probably not have the libraries
+	 * from util-linux available.  We only use the newer
+	 * blkid-probe interfaces to access functionality not
+	 * available in the original blkid library.
+	 */
+	if ((flags & CHECK_FS_EXIST) && blkid_get_cache(&cache, NULL) >= 0) {
+		fs_type = blkid_get_tag_value(cache, "TYPE", device);
+		if (fs_type)
+			fs_label = blkid_get_tag_value(cache, "LABEL", device);
+		blkid_put_cache(cache);
+	}
+
+	if (fs_type) {
+		if (fs_label)
+			printf(_("%s contains a %s file system "
+				 "labelled '%s'\n"), device, fs_type, fs_label);
+		else
+			printf(_("%s contains a %s file system\n"), device,
+			       fs_type);
+		if (strncmp(fs_type, "ext", 3) == 0)
+			print_ext2_info(device);
+		free(fs_type);
+		free(fs_label);
+		return 0;
+	}
+
+	ret = check_partition_table(device);
+	if (ret >= 0)
+		return ret;
+
+#ifdef HAVE_LINUX_MAJOR_H
+#ifndef MAJOR
+#define MAJOR(dev)	((dev)>>8)
+#define MINOR(dev)	((dev) & 0xff)
+#endif
+#ifndef SCSI_BLK_MAJOR
+#ifdef SCSI_DISK0_MAJOR
+#ifdef SCSI_DISK8_MAJOR
+#define SCSI_DISK_MAJOR(M) ((M) == SCSI_DISK0_MAJOR || \
+	((M) >= SCSI_DISK1_MAJOR && (M) <= SCSI_DISK7_MAJOR) || \
+	((M) >= SCSI_DISK8_MAJOR && (M) <= SCSI_DISK15_MAJOR))
+#else
+#define SCSI_DISK_MAJOR(M) ((M) == SCSI_DISK0_MAJOR || \
+	((M) >= SCSI_DISK1_MAJOR && (M) <= SCSI_DISK7_MAJOR))
+#endif /* defined(SCSI_DISK8_MAJOR) */
+#define SCSI_BLK_MAJOR(M) (SCSI_DISK_MAJOR((M)) || (M) == SCSI_CDROM_MAJOR)
+#else
+#define SCSI_BLK_MAJOR(M)  ((M) == SCSI_DISK_MAJOR || (M) == SCSI_CDROM_MAJOR)
+#endif /* defined(SCSI_DISK0_MAJOR) */
+#endif /* defined(SCSI_BLK_MAJOR) */
+	if (((MAJOR(s.st_rdev) == HD_MAJOR &&
+	      MINOR(s.st_rdev)%64 == 0) ||
+	     (SCSI_BLK_MAJOR(MAJOR(s.st_rdev)) &&
+	      MINOR(s.st_rdev)%16 == 0))) {
+		printf(_("%s is entire device, not just one partition!\n"),
+		       device);
+		return 0;
+	}
+#endif
+	return 1;
+}
+
diff --git a/misc/plausible.h b/misc/plausible.h
new file mode 100644
index 0000000..594e4b1
--- /dev/null
+++ b/misc/plausible.h
@@ -0,0 +1,28 @@
+/*
+ * plausible.h --- header file defining prototypes for helper functions
+ * used by tune2fs and mke2fs
+ *
+ * Copyright 2014 by Oracle, Inc.
+ *
+ * %Begin-Header%
+ * This file may be redistributed under the terms of the GNU Public
+ * License.
+ * %End-Header%
+ */
+
+#ifndef PLAUSIBLE_H_
+#define PLAUSIBLE_H_
+
+/*
+ * Flags for check_plausibility()
+ */
+#define CHECK_BLOCK_DEV	0x0001
+#define CREATE_FILE	0x0002
+#define CHECK_FS_EXIST	0x0004
+#define VERBOSE_CREATE	0x0008
+#define NO_SIZE		0x0010
+
+extern int check_plausibility(const char *device, int flags,
+			      int *ret_is_dev);
+
+#endif /* PLAUSIBLE_H_ */
diff --git a/misc/tune2fs.c b/misc/tune2fs.c
index 510e936..c454b84 100644
--- a/misc/tune2fs.c
+++ b/misc/tune2fs.c
@@ -59,6 +59,7 @@ extern int optind;
 #include "e2p/e2p.h"
 #include "jfs_user.h"
 #include "util.h"
+#include "plausible.h"
 #include "blkid/blkid.h"
 #include "quota/quotaio.h"
 
diff --git a/misc/util.c b/misc/util.c
index 2898830..f906339 100644
--- a/misc/util.c
+++ b/misc/util.c
@@ -108,203 +108,6 @@ void proceed_question(int delay)
 	signal(SIGALRM, SIG_IGN);
 }
 
-static void print_ext2_info(const char *device)
-
-{
-	struct ext2_super_block	*sb;
-	ext2_filsys		fs;
-	errcode_t		retval;
-	time_t 			tm;
-	char			buf[80];
-
-	retval = ext2fs_open2(device, 0, EXT2_FLAG_64BITS, 0, 0,
-			      unix_io_manager, &fs);
-	if (retval)
-		return;
-	sb = fs->super;
-
-	if (sb->s_mtime) {
-		tm = sb->s_mtime;
-		if (sb->s_last_mounted[0]) {
-			memset(buf, 0, sizeof(buf));
-			strncpy(buf, sb->s_last_mounted,
-				sizeof(sb->s_last_mounted));
-			printf(_("\tlast mounted on %s on %s"), buf,
-			       ctime(&tm));
-		} else
-			printf(_("\tlast mounted on %s"), ctime(&tm));
-	} else if (sb->s_mkfs_time) {
-		tm = sb->s_mkfs_time;
-		printf(_("\tcreated on %s"), ctime(&tm));
-	} else if (sb->s_wtime) {
-		tm = sb->s_wtime;
-		printf(_("\tlast modified on %s"), ctime(&tm));
-	}
-	ext2fs_close_free(&fs);
-}
-
-/*
- * return 1 if there is no partition table, 0 if a partition table is
- * detected, and -1 on an error.
- */
-static int check_partition_table(const char *device)
-{
-#ifdef HAVE_BLKID_PROBE_ENABLE_PARTITIONS
-	blkid_probe pr;
-	const char *value;
-	int ret;
-
-	pr = blkid_new_probe_from_filename(device);
-	if (!pr)
-		return -1;
-
-        ret = blkid_probe_enable_partitions(pr, 1);
-        if (ret < 0)
-		goto errout;
-
-	ret = blkid_probe_enable_superblocks(pr, 0);
-	if (ret < 0)
-		goto errout;
-
-	ret = blkid_do_fullprobe(pr);
-	if (ret < 0)
-		goto errout;
-
-	ret = blkid_probe_lookup_value(pr, "PTTYPE", &value, NULL);
-	if (ret == 0)
-		fprintf(stderr, _("Found a %s partition table in %s\n"),
-			value, device);
-	else
-		ret = 1;
-
-errout:
-	blkid_free_probe(pr);
-	return ret;
-#else
-	return -1;
-#endif
-}
-
-/*
- * return 1 if the device looks plausible, creating the file if necessary
- */
-int check_plausibility(const char *device, int flags, int *ret_is_dev)
-{
-	int fd, ret, is_dev = 0;
-	ext2fs_struct_stat s;
-	int fl = O_RDONLY;
-	blkid_cache cache = NULL;
-	char *fs_type = NULL;
-	char *fs_label = NULL;
-
-	fd = ext2fs_open_file(device, fl, 0666);
-	if ((fd < 0) && (errno == ENOENT) && (flags & NO_SIZE)) {
-		fprintf(stderr, _("The file %s does not exist and no "
-				  "size was specified.\n"), device);
-		exit(1);
-	}
-	if ((fd < 0) && (errno == ENOENT) && (flags & CREATE_FILE)) {
-		fl |= O_CREAT;
-		fd = ext2fs_open_file(device, fl, 0666);
-		if (fd >= 0 && (flags & VERBOSE_CREATE))
-			printf(_("Creating regular file %s\n"), device);
-	}
-	if (fd < 0) {
-		fprintf(stderr, _("Could not open %s: %s\n"),
-			device, error_message(errno));
-		if (errno == ENOENT)
-			fputs(_("\nThe device apparently does not exist; "
-				"did you specify it correctly?\n"), stderr);
-		exit(1);
-	}
-
-	if (ext2fs_fstat(fd, &s) < 0) {
-		perror("stat");
-		exit(1);
-	}
-	close(fd);
-
-	if (S_ISBLK(s.st_mode))
-		is_dev = 1;
-#if defined(__FreeBSD__) || defined(__FreeBSD_kernel__)
-	/* On FreeBSD, all disk devices are character specials */
-	if (S_ISCHR(s.st_mode))
-		is_dev = 1;
-#endif
-	if (ret_is_dev)
-		*ret_is_dev = is_dev;
-
-	if ((flags & CHECK_BLOCK_DEV) && !is_dev) {
-		printf(_("%s is not a block special device.\n"), device);
-		return 0;
-	}
-
-	/*
-	 * Note: we use the older-style blkid API's here because we
-	 * want as much functionality to be available when using the
-	 * internal blkid library, when e2fsprogs is compiled for
-	 * non-Linux systems that will probably not have the libraries
-	 * from util-linux available.  We only use the newer
-	 * blkid-probe interfaces to access functionality not
-	 * available in the original blkid library.
-	 */
-	if ((flags & CHECK_FS_EXIST) && blkid_get_cache(&cache, NULL) >= 0) {
-		fs_type = blkid_get_tag_value(cache, "TYPE", device);
-		if (fs_type)
-			fs_label = blkid_get_tag_value(cache, "LABEL", device);
-		blkid_put_cache(cache);
-	}
-
-	if (fs_type) {
-		if (fs_label)
-			printf(_("%s contains a %s file system "
-				 "labelled '%s'\n"), device, fs_type, fs_label);
-		else
-			printf(_("%s contains a %s file system\n"), device,
-			       fs_type);
-		if (strncmp(fs_type, "ext", 3) == 0)
-			print_ext2_info(device);
-		free(fs_type);
-		free(fs_label);
-		return 0;
-	}
-
-	ret = check_partition_table(device);
-	if (ret >= 0)
-		return ret;
-
-#ifdef HAVE_LINUX_MAJOR_H
-#ifndef MAJOR
-#define MAJOR(dev)	((dev)>>8)
-#define MINOR(dev)	((dev) & 0xff)
-#endif
-#ifndef SCSI_BLK_MAJOR
-#ifdef SCSI_DISK0_MAJOR
-#ifdef SCSI_DISK8_MAJOR
-#define SCSI_DISK_MAJOR(M) ((M) == SCSI_DISK0_MAJOR || \
-  ((M) >= SCSI_DISK1_MAJOR && (M) <= SCSI_DISK7_MAJOR) || \
-  ((M) >= SCSI_DISK8_MAJOR && (M) <= SCSI_DISK15_MAJOR))
-#else
-#define SCSI_DISK_MAJOR(M) ((M) == SCSI_DISK0_MAJOR || \
-  ((M) >= SCSI_DISK1_MAJOR && (M) <= SCSI_DISK7_MAJOR))
-#endif /* defined(SCSI_DISK8_MAJOR) */
-#define SCSI_BLK_MAJOR(M) (SCSI_DISK_MAJOR((M)) || (M) == SCSI_CDROM_MAJOR)
-#else
-#define SCSI_BLK_MAJOR(M)  ((M) == SCSI_DISK_MAJOR || (M) == SCSI_CDROM_MAJOR)
-#endif /* defined(SCSI_DISK0_MAJOR) */
-#endif /* defined(SCSI_BLK_MAJOR) */
-	if (((MAJOR(s.st_rdev) == HD_MAJOR &&
-	      MINOR(s.st_rdev)%64 == 0) ||
-	     (SCSI_BLK_MAJOR(MAJOR(s.st_rdev)) &&
-	      MINOR(s.st_rdev)%16 == 0))) {
-		printf(_("%s is entire device, not just one partition!\n"),
-		       device);
-		return 0;
-	}
-#endif
-	return 1;
-}
-
 void check_mount(const char *device, int force, const char *type)
 {
 	errcode_t	retval;
diff --git a/misc/util.h b/misc/util.h
index f3827dd..49b4b9c 100644
--- a/misc/util.h
+++ b/misc/util.h
@@ -15,22 +15,11 @@ extern int	 journal_flags;
 extern char	*journal_device;
 extern char	*journal_location_string;
 
-/*
- * Flags for check_plausibility()
- */
-#define CHECK_BLOCK_DEV	0x0001
-#define CREATE_FILE	0x0002
-#define CHECK_FS_EXIST	0x0004
-#define VERBOSE_CREATE	0x0008
-#define NO_SIZE		0x0010


* [PATCH 13/34] misc: add plausibility checks to debugfs/tune2fs/dumpe2fs/e2fsck
  2014-09-13 22:11 [PATCH 00/34] e2fsprogs Summer 2014 patchbomb, part 6 Darrick J. Wong
                   ` (11 preceding siblings ...)
  2014-09-13 22:12 ` [PATCH 12/34] misc: move check_plausibility into a separate file Darrick J. Wong
@ 2014-09-13 22:12 ` Darrick J. Wong
  2014-09-19 23:00   ` Theodore Ts'o
  2014-09-13 22:12 ` [PATCH 14/34] misc: use libmagic when libblkid can't identify something Darrick J. Wong
                   ` (20 subsequent siblings)
  33 siblings, 1 reply; 67+ messages in thread
From: Darrick J. Wong @ 2014-09-13 22:12 UTC (permalink / raw)
  To: tytso, darrick.wong; +Cc: linux-ext4

If any of these utilities (debugfs, tune2fs, dumpe2fs, e2image, or
e2fsck) detects a bad superblock magic number, call check_plausibility()
to see whether blkid can identify the passed-in argument as something
else (xfs, partition, etc.), in the hope of catching a user error.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
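Note for readers: the idea behind the fallback is "the ext magic is
missing, so look for some other recognizable signature before giving up".
The patch itself delegates that work to blkid through
check_plausibility(). The sketch below does NOT use blkid and is not code
from this series; it is a minimal, hand-rolled illustration that only
knows two on-disk facts: the ext2/3/4 magic 0xEF53 lives at byte
1024 + 56 (little-endian), and an XFS superblock begins with the bytes
"XFSB" at offset 0. The names guess_fs() and enum fs_guess are made up
for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SB_OFFSET	1024	/* ext2/3/4 primary superblock starts here */
#define SB_MAGIC_OFFSET	56	/* offset of s_magic within the superblock */
#define EXT_SB_MAGIC	0xEF53	/* stored little-endian on disk */

/* Hypothetical result type for this illustration only. */
enum fs_guess { GUESS_EXT, GUESS_XFS, GUESS_UNKNOWN };

/*
 * Inspect the first bytes of an image that has been read into memory.
 * Check the ext magic first; only if that is absent, look for another
 * signature (here: just XFS) so the caller can print a helpful hint.
 */
enum fs_guess guess_fs(const unsigned char *buf, size_t len)
{
	size_t pos = SB_OFFSET + SB_MAGIC_OFFSET;

	assert(buf != NULL);
	if (len >= pos + 2) {
		uint16_t magic = (uint16_t)(buf[pos] | (buf[pos + 1] << 8));

		if (magic == EXT_SB_MAGIC)
			return GUESS_EXT;
	}
	if (len >= 4 && memcmp(buf, "XFSB", 4) == 0)
		return GUESS_XFS;
	return GUESS_UNKNOWN;
}
```

A caller would read the first 2 KiB of the device into a buffer, call
guess_fs(), and print a hint only for GUESS_XFS. The real code gets far
broader coverage (and the label) from blkid's TYPE and LABEL tags.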
 debugfs/Makefile.in                 |   23 ++++++++++++++++++----
 debugfs/debugfs.c                   |    3 +++
 e2fsck/Makefile.in                  |   19 ++++++++++++++++--
 e2fsck/problem.c                    |    2 +-
 e2fsck/unix.c                       |    7 ++++++-
 misc/Makefile.in                    |   33 ++++++++++++++++++--------------
 misc/dumpe2fs.c                     |    3 +++
 misc/e2image.c                      |    3 +++
 misc/tune2fs.c                      |    2 ++
 tests/f_detect_xfs/expect           |   25 ++++++++++++++++++++++++
 tests/f_detect_xfs/expect.nodebugfs |   23 ++++++++++++++++++++++
 tests/f_detect_xfs/image.bz2        |  Bin
 tests/f_detect_xfs/name             |    1 +
 tests/f_detect_xfs/script           |   36 +++++++++++++++++++++++++++++++++++
 14 files changed, 156 insertions(+), 24 deletions(-)
 create mode 100644 tests/f_detect_xfs/expect
 create mode 100644 tests/f_detect_xfs/expect.nodebugfs
 create mode 100644 tests/f_detect_xfs/image.bz2
 create mode 100644 tests/f_detect_xfs/name
 create mode 100755 tests/f_detect_xfs/script


diff --git a/debugfs/Makefile.in b/debugfs/Makefile.in
index 6220943..b33f73b 100644
--- a/debugfs/Makefile.in
+++ b/debugfs/Makefile.in
@@ -19,11 +19,12 @@ MK_CMDS=	_SS_DIR_OVERRIDE=../lib/ss ../lib/ss/mk_cmds
 DEBUG_OBJS= debug_cmds.o debugfs.o util.o ncheck.o icheck.o ls.o \
 	lsdel.o dump.o set_fields.o logdump.o htree.o unused.o e2freefrag.o \
 	filefrag.o extent_cmds.o extent_inode.o zap.o create_inode.o \
-	quota.o xattrs.o journal.o revoke.o recovery.o do_journal.o
+	quota.o xattrs.o journal.o revoke.o recovery.o do_journal.o \
+	plausible.o
 
 RO_DEBUG_OBJS= ro_debug_cmds.o ro_debugfs.o util.o ncheck.o icheck.o ls.o \
 	lsdel.o logdump.o htree.o e2freefrag.o filefrag.o extent_cmds.o \
-	extent_inode.o quota.o xattrs.o
+	extent_inode.o quota.o xattrs.o ../misc/plausible.o
 
 SRCS= debug_cmds.c $(srcdir)/debugfs.c $(srcdir)/util.c $(srcdir)/ls.c \
 	$(srcdir)/ncheck.c $(srcdir)/icheck.c $(srcdir)/lsdel.c \
@@ -32,7 +33,8 @@ SRCS= debug_cmds.c $(srcdir)/debugfs.c $(srcdir)/util.c $(srcdir)/ls.c \
 	$(srcdir)/filefrag.c $(srcdir)/extent_inode.c $(srcdir)/zap.c \
 	$(srcdir)/../misc/create_inode.c $(srcdir)/xattrs.c $(srcdir)/quota.c \
 	$(srcdir)/journal.c $(srcdir)/../e2fsck/revoke.c \
-	$(srcdir)/../e2fsck/recovery.c $(srcdir)/do_journal.c
+	$(srcdir)/../e2fsck/recovery.c $(srcdir)/do_journal.c \
+	$(srcdir)/../misc/plausible.c
 
 LIBS= $(LIBQUOTA) $(LIBEXT2FS) $(LIBE2P) $(LIBSS) $(LIBCOM_ERR) $(LIBBLKID) \
 	$(LIBUUID) $(SYSLIBS) $(LIBINTL)
@@ -59,6 +61,10 @@ DEPEND_CFLAGS = -I$(srcdir)
 
 all:: $(PROGS) $(MANPAGES)
 
+plausible.o: $(top_srcdir)/misc/plausible.c
+	$(E) "	CC $<"
+	$(Q) $(CC) $(ALL_CFLAGS) -c $< -o $@
+
 debugfs: $(DEBUG_OBJS) $(DEPLIBS)
 	$(E) "	LD $@"
 	$(Q) $(CC) $(ALL_LDFLAGS) -o debugfs $(DEBUG_OBJS) $(LIBS)
@@ -172,7 +178,8 @@ debugfs.o: $(srcdir)/debugfs.c $(top_builddir)/lib/config.h \
  $(top_srcdir)/lib/quota/dqblk_v2.h $(top_srcdir)/lib/quota/quotaio_tree.h \
  $(top_srcdir)/lib/../e2fsck/dict.h $(top_srcdir)/version.h \
  $(srcdir)/../e2fsck/jfs_user.h $(top_srcdir)/lib/ext2fs/kernel-jbd.h \
- $(top_srcdir)/lib/ext2fs/jfs_compat.h $(top_srcdir)/lib/ext2fs/kernel-list.h
+ $(top_srcdir)/lib/ext2fs/jfs_compat.h $(top_srcdir)/lib/ext2fs/kernel-list.h \
+ $(srcdir)/../misc/plausible.h
 util.o: $(srcdir)/util.c $(top_builddir)/lib/config.h \
  $(top_builddir)/lib/dirpaths.h $(top_srcdir)/lib/ss/ss.h \
  $(top_builddir)/lib/ss/ss_err.h $(top_srcdir)/lib/et/com_err.h \
@@ -398,3 +405,11 @@ do_journal.o: $(srcdir)/do_journal.c $(top_builddir)/lib/config.h \
  $(top_srcdir)/lib/../e2fsck/dict.h $(srcdir)/../e2fsck/jfs_user.h \
  $(top_srcdir)/lib/ext2fs/kernel-jbd.h $(top_srcdir)/lib/ext2fs/jfs_compat.h \
  $(top_srcdir)/lib/ext2fs/kernel-list.h
+plausible.o: $(srcdir)/../misc/plausible.c $(top_builddir)/lib/config.h \
+ $(top_builddir)/lib/dirpaths.h $(top_srcdir)/lib/et/com_err.h \
+ $(top_srcdir)/lib/e2p/e2p.h $(top_srcdir)/lib/ext2fs/ext2_fs.h \
+ $(top_builddir)/lib/ext2fs/ext2_types.h $(top_srcdir)/lib/ext2fs/ext2fs.h \
+ $(top_srcdir)/lib/ext2fs/ext3_extents.h $(top_srcdir)/lib/ext2fs/ext2_io.h \
+ $(top_builddir)/lib/ext2fs/ext2_err.h \
+ $(top_srcdir)/lib/ext2fs/ext2_ext_attr.h $(top_srcdir)/lib/ext2fs/bitops.h \
+ $(srcdir)/../misc/nls-enable.h $(srcdir)/../misc/plausible.h
diff --git a/debugfs/debugfs.c b/debugfs/debugfs.c
index 0d8e9e8..db85028 100644
--- a/debugfs/debugfs.c
+++ b/debugfs/debugfs.c
@@ -34,6 +34,7 @@ extern char *optarg;
 
 #include "../version.h"
 #include "jfs_user.h"
+#include "../misc/plausible.h"
 
 #ifndef BUFSIZ
 #define BUFSIZ 8192
@@ -87,6 +88,8 @@ static void open_filesystem(char *device, int open_flags, blk64_t superblock,
 			     unix_io_manager, &current_fs);
 	if (retval) {
 		com_err(device, retval, "while opening filesystem");
+		if (retval == EXT2_ET_BAD_MAGIC)
+			check_plausibility(device, CHECK_FS_EXIST, NULL);
 		current_fs = NULL;
 		return;
 	}
diff --git a/e2fsck/Makefile.in b/e2fsck/Makefile.in
index 8d7e769..1afd15f 100644
--- a/e2fsck/Makefile.in
+++ b/e2fsck/Makefile.in
@@ -62,7 +62,7 @@ OBJS= dict.o unix.o e2fsck.o super.o pass1.o pass1b.o pass2.o \
 	pass3.o pass4.o pass5.o journal.o badblocks.o util.o dirinfo.o \
 	dx_dirinfo.o ehandler.o problem.o message.o quota.o recovery.o \
 	region.o revoke.o ea_refcount.o rehash.o profile.o prof_err.o \
-	logfile.o sigcatcher.o readahead.o $(MTRACE_OBJ)
+	logfile.o sigcatcher.o readahead.o $(MTRACE_OBJ) plausible.o
 
 PROFILED_OBJS= profiled/dict.o profiled/unix.o profiled/e2fsck.o \
 	profiled/super.o profiled/pass1.o profiled/pass1b.o \
@@ -73,7 +73,7 @@ PROFILED_OBJS= profiled/dict.o profiled/unix.o profiled/e2fsck.o \
 	profiled/recovery.o profiled/region.o profiled/revoke.o \
 	profiled/ea_refcount.o profiled/rehash.o profiled/profile.o \
 	profiled/prof_err.o profiled/logfile.o \
-	profiled/sigcatcher.o profiled/readahead.o
+	profiled/sigcatcher.o profiled/readahead.o profiled/plausible.o
 
 SRCS= $(srcdir)/e2fsck.c \
 	$(srcdir)/dict.c \
@@ -104,12 +104,17 @@ SRCS= $(srcdir)/e2fsck.c \
 	$(srcdir)/logfile.c \
 	prof_err.c \
 	$(srcdir)/quota.c \
+	$(srcdir)/../misc/plausible.c \
 	$(MTRACE_SRC)
 
 all:: profiled $(PROGS) e2fsck $(MANPAGES) $(FMANPAGES)
 
 @PROFILE_CMT@all:: e2fsck.profiled
 
+plausible.o: $(top_srcdir)/misc/plausible.c
+	$(E) "	CC $<"
+	$(Q) $(CC) $(ALL_CFLAGS) -c $< -o $@
+
 prof_err.c prof_err.h: prof_err.et
 	$(E) "	COMPILE_ET prof_err.et"
 	$(Q) $(COMPILE_ET) $(srcdir)/prof_err.et
@@ -411,7 +416,7 @@ unix.o: $(srcdir)/unix.c $(top_builddir)/lib/config.h \
  $(srcdir)/profile.h prof_err.h $(top_srcdir)/lib/quota/quotaio.h \
  $(top_srcdir)/lib/quota/dqblk_v2.h $(top_srcdir)/lib/quota/quotaio_tree.h \
  $(top_srcdir)/lib/../e2fsck/dict.h $(srcdir)/problem.h \
- $(top_srcdir)/version.h
+ $(top_srcdir)/version.h $(srcdir)/../misc/plausible.h
 dirinfo.o: $(srcdir)/dirinfo.c $(top_builddir)/lib/config.h \
  $(top_builddir)/lib/dirpaths.h $(srcdir)/e2fsck.h \
  $(top_srcdir)/lib/ext2fs/ext2_fs.h $(top_builddir)/lib/ext2fs/ext2_types.h \
@@ -529,3 +534,11 @@ quota.o: $(srcdir)/quota.c $(top_builddir)/lib/config.h \
 readahead.o: $(srcdir)/readahead.c $(top_builddir)/lib/config.h \
  $(top_srcdir)/lib/ext2fs/ext2fs.h $(top_srcdir)/lib/ext2fs/ext2_fs.h \
  $(top_builddir)/lib/ext2fs/ext2_err.h $(srcdir)/e2fsck.h prof_err.h
+plausible.o: $(srcdir)/../misc/plausible.c $(top_builddir)/lib/config.h \
+ $(top_builddir)/lib/dirpaths.h $(top_srcdir)/lib/et/com_err.h \
+ $(top_srcdir)/lib/e2p/e2p.h $(top_srcdir)/lib/ext2fs/ext2_fs.h \
+ $(top_builddir)/lib/ext2fs/ext2_types.h $(top_srcdir)/lib/ext2fs/ext2fs.h \
+ $(top_srcdir)/lib/ext2fs/ext3_extents.h $(top_srcdir)/lib/ext2fs/ext2_io.h \
+ $(top_builddir)/lib/ext2fs/ext2_err.h \
+ $(top_srcdir)/lib/ext2fs/ext2_ext_attr.h $(top_srcdir)/lib/ext2fs/bitops.h \
+ $(srcdir)/../misc/nls-enable.h $(srcdir)/../misc/plausible.h
diff --git a/e2fsck/problem.c b/e2fsck/problem.c
index 174f45a..a4da64b 100644
--- a/e2fsck/problem.c
+++ b/e2fsck/problem.c
@@ -126,7 +126,7 @@ static struct e2fsck_problem problem_table[] = {
 	  "    e2fsck -b 8193 <@v>\n"
 	  " or\n"
 	  "    e2fsck -b 32768 <@v>\n\n"),
-	  PROMPT_NONE, PR_FATAL },
+	  PROMPT_NONE, 0 },
 
 	/* Filesystem size is wrong */
 	{ PR_0_FS_SIZE_WRONG,
diff --git a/e2fsck/unix.c b/e2fsck/unix.c
index 6b0ca96..b3338ab 100644
--- a/e2fsck/unix.c
+++ b/e2fsck/unix.c
@@ -52,6 +52,7 @@ extern int optind;
 #include "e2fsck.h"
 #include "problem.h"
 #include "../version.h"
+#include "../misc/plausible.h"
 
 /* Command line options */
 static int cflag;		/* check disk */
@@ -1410,8 +1411,12 @@ failure:
 					     "-n option to do a read-only\n"
 					     "check of the device.\n"));
 #endif
-		else
+		else {
 			fix_problem(ctx, PR_0_SB_CORRUPT, &pctx);
+			if (retval == EXT2_ET_BAD_MAGIC)
+				check_plausibility(ctx->filesystem_name,
+						   CHECK_FS_EXIST, NULL);
+		}
 		fatal_error(ctx, 0);
 	}
 	/*
diff --git a/misc/Makefile.in b/misc/Makefile.in
index e49078b..bdeaa49 100644
--- a/misc/Makefile.in
+++ b/misc/Makefile.in
@@ -48,9 +48,9 @@ CHATTR_OBJS=	chattr.o
 LSATTR_OBJS=	lsattr.o
 UUIDGEN_OBJS=	uuidgen.o
 UUIDD_OBJS=	uuidd.o
-DUMPE2FS_OBJS=	dumpe2fs.o
+DUMPE2FS_OBJS=	dumpe2fs.o plausible.o
 BADBLOCKS_OBJS=	badblocks.o
-E2IMAGE_OBJS=	e2image.o
+E2IMAGE_OBJS=	e2image.o plausible.o
 FSCK_OBJS=	fsck.o base_device.o ismounted.o
 BLKID_OBJS=	blkid.o
 FILEFRAG_OBJS=	filefrag.o
@@ -70,7 +70,7 @@ PROFILED_CHATTR_OBJS=	profiled/chattr.o
 PROFILED_LSATTR_OBJS=	profiled/lsattr.o
 PROFILED_UUIDGEN_OBJS=	profiled/uuidgen.o
 PROFILED_UUIDD_OBJS=	profiled/uuidd.o
-PROFILED_DUMPE2FS_OBJS=	profiled/dumpe2fs.o
+PROFILED_DUMPE2FS_OBJS=	profiled/dumpe2fs.o profiled/plausible.o
 PROFILED_BADBLOCKS_OBJS=	profiled/badblocks.o
 PROFILED_E2IMAGE_OBJS=	profiled/e2image.o
 PROFILED_FSCK_OBJS=	profiled/fsck.o profiled/base_device.o \
@@ -165,13 +165,14 @@ tune2fs: $(TUNE2FS_OBJS) $(DEPLIBS) $(DEPLIBS_E2P) $(DEPLIBBLKID) \
 	$(E) "	LD $@"
 	$(Q) $(CC) $(ALL_LDFLAGS) -o tune2fs $(TUNE2FS_OBJS) $(LIBS) \
 		$(LIBBLKID) $(LIBUUID) $(LIBQUOTA) $(LIBEXT2FS) $(LIBS_E2P) \
-		$(LIBINTL) $(SYSLIBS)
+		$(LIBINTL) $(SYSLIBS) $(LIBBLKID)
 
 tune2fs.static: $(TUNE2FS_OBJS) $(STATIC_DEPLIBS) $(STATIC_LIBE2P) $(DEPSTATIC_LIBBLKID)
 	$(E) "	LD $@"
 	$(Q) $(CC) $(LDFLAGS_STATIC) -o tune2fs.static $(TUNE2FS_OBJS) \
 		$(STATIC_LIBS) $(STATIC_LIBBLKID) $(STATIC_LIBUUID) \
-		$(STATIC_LIBQUOTA) $(STATIC_LIBE2P) $(LIBINTL) $(SYSLIBS)
+		$(STATIC_LIBQUOTA) $(STATIC_LIBE2P) $(LIBINTL) $(SYSLIBS) \
+		$(STATIC_LIBBLKID)
 
 tune2fs.profiled: $(TUNE2FS_OBJS) $(PROFILED_DEPLIBS) \
 		$(PROFILED_E2P) $(DEPPROFILED_LIBBLKID) $(DEPPROFILED_LIBUUID) \
@@ -180,7 +181,7 @@ tune2fs.profiled: $(TUNE2FS_OBJS) $(PROFILED_DEPLIBS) \
 	$(Q) $(CC) $(ALL_LDFLAGS) -g -pg -o tune2fs.profiled \
 		$(PROFILED_TUNE2FS_OBJS) $(PROFILED_LIBBLKID) \
 		$(PROFILED_LIBUUID) $(PROFILED_LIBQUOTA) $(PROFILED_LIBE2P) \
-		$(LIBINTL) $(PROFILED_LIBS) $(SYSLIBS)
+		$(LIBINTL) $(PROFILED_LIBS) $(SYSLIBS) $(PROFILED_LIBBLKID)
 
 blkid: $(BLKID_OBJS) $(DEPLIBBLKID) $(LIBEXT2FS)
 	$(E) "	LD $@"
@@ -198,15 +199,16 @@ blkid.profiled: $(BLKID_OBJS) $(DEPPROFILED_LIBBLKID) \
 	$(Q) $(CC) $(ALL_LDFLAGS) -g -pg -o blkid.profiled $(PROFILED_BLKID_OBJS) \
 		$(PROFILED_LIBBLKID) $(LIBINTL) $(PROFILED_LIBEXT2FS) $(SYSLIBS)
 
-e2image: $(E2IMAGE_OBJS) $(DEPLIBS)
+e2image: $(E2IMAGE_OBJS) $(DEPLIBS) $(DEPLIBBLKID)
 	$(E) "	LD $@"
 	$(Q) $(CC) $(ALL_LDFLAGS) -o e2image $(E2IMAGE_OBJS) $(LIBS) \
-		$(LIBINTL) $(SYSLIBS)
+		$(LIBINTL) $(SYSLIBS) $(LIBBLKID)
 
-e2image.profiled: $(E2IMAGE_OBJS) $(PROFILED_DEPLIBS)
+e2image.profiled: $(E2IMAGE_OBJS) $(PROFILED_DEPLIBS) $(DEPLIBBLKID)
 	$(E) "	LD $@"
 	$(Q) $(CC) $(ALL_LDFLAGS) -g -pg -o e2image.profiled \
-		$(PROFILED_E2IMAGE_OBJS) $(PROFILED_LIBS) $(LIBINTL) $(SYSLIBS)
+		$(PROFILED_E2IMAGE_OBJS) $(PROFILED_LIBS) $(LIBINTL) $(SYSLIBS) \
+		$(LIBBLKID)
 
 e2undo: $(E2UNDO_OBJS) $(DEPLIBS)
 	$(E) "	LD $@"
@@ -296,17 +298,18 @@ uuidd.profiled: $(UUIDD_OBJS) $(PROFILED_DEPLIBUUID)
 	$(Q) $(CC) $(ALL_LDFLAGS) -g -pg -o uuidd.profiled $(PROFILED_UUIDD_OBJS) \
 		$(PROFILED_LIBUUID) $(LIBINTL) $(SYSLIBS)
 
-dumpe2fs: $(DUMPE2FS_OBJS) $(DEPLIBS) $(DEPLIBS_E2P) $(DEPLIBUUID)
+dumpe2fs: $(DUMPE2FS_OBJS) $(DEPLIBS) $(DEPLIBS_E2P) $(DEPLIBUUID) $(DEPLIBBLKID)
 	$(E) "	LD $@"
 	$(Q) $(CC) $(ALL_LDFLAGS) -o dumpe2fs $(DUMPE2FS_OBJS) $(LIBS) \
-		$(LIBS_E2P) $(LIBUUID) $(LIBINTL) $(SYSLIBS)
+		$(LIBS_E2P) $(LIBUUID) $(LIBINTL) $(SYSLIBS) $(LIBBLKID)
 
 dumpe2fs.profiled: $(DUMPE2FS_OBJS) $(PROFILED_DEPLIBS) \
-		$(PROFILED_LIBE2P) $(PROFILED_DEPLIBUUID)
+		$(PROFILED_LIBE2P) $(PROFILED_DEPLIBUUID) $(PROFILED_DEPLIBBLKID)
 	$(E) "	LD $@"
 	$(Q) $(CC) $(ALL_LDFLAGS) -g -pg -o dumpe2fs.profiled \
 		$(PROFILED_DUMPE2FS_OBJS) $(PROFILED_LIBS) \
-		$(PROFILED_LIBE2P) $(PROFILED_LIBUUID) $(LIBINTL) $(SYSLIBS)
+		$(PROFILED_LIBE2P) $(PROFILED_LIBUUID) $(LIBINTL) $(SYSLIBS) \
+		$(PROFILED_LIBBLKID)
 
 fsck: $(FSCK_OBJS) $(DEPLIBBLKID)
 	$(E) "	LD $@"
@@ -688,7 +691,7 @@ dumpe2fs.o: $(srcdir)/dumpe2fs.c $(top_builddir)/lib/config.h \
  $(top_srcdir)/lib/e2p/e2p.h $(srcdir)/jfs_user.h \
  $(top_srcdir)/lib/ext2fs/kernel-jbd.h $(top_srcdir)/lib/ext2fs/jfs_compat.h \
  $(top_srcdir)/lib/ext2fs/kernel-list.h $(top_srcdir)/version.h \
- $(srcdir)/nls-enable.h
+ $(srcdir)/nls-enable.h $(srcdir)/plausible.h
 badblocks.o: $(srcdir)/badblocks.c $(top_builddir)/lib/config.h \
  $(top_builddir)/lib/dirpaths.h $(top_srcdir)/lib/et/com_err.h \
  $(top_srcdir)/lib/ext2fs/ext2_io.h $(top_srcdir)/lib/ext2fs/ext2_fs.h \
diff --git a/misc/dumpe2fs.c b/misc/dumpe2fs.c
index 39505a8..7c3c2cc 100644
--- a/misc/dumpe2fs.c
+++ b/misc/dumpe2fs.c
@@ -42,6 +42,7 @@ extern int optind;
 
 #include "../version.h"
 #include "nls-enable.h"
+#include "plausible.h"
 
 #define in_use(m, x)	(ext2fs_test_bit ((x), (m)))
 
@@ -689,6 +690,8 @@ try_open_again:
 		com_err (program_name, retval, _("while trying to open %s"),
 			 device_name);
 		printf("%s", _("Couldn't find valid filesystem superblock.\n"));
+		if (retval == EXT2_ET_BAD_MAGIC)
+			check_plausibility(device_name, CHECK_FS_EXIST, NULL);
 		exit (1);
 	}
 	fs->default_bitmap_type = EXT2FS_BMAP64_RBTREE;
diff --git a/misc/e2image.c b/misc/e2image.c
index e1c63a7..e876ae8 100644
--- a/misc/e2image.c
+++ b/misc/e2image.c
@@ -47,6 +47,7 @@ extern int optind;
 
 #include "../version.h"
 #include "nls-enable.h"
+#include "plausible.h"
 
 #define QCOW_OFLAG_COPIED     (1LL << 63)
 #define NO_BLK ((blk64_t) -1)
@@ -1578,6 +1579,8 @@ int main (int argc, char ** argv)
 		com_err (program_name, retval, _("while trying to open %s"),
 			 device_name);
 		fputs(_("Couldn't find valid filesystem superblock.\n"), stdout);
+		if (retval == EXT2_ET_BAD_MAGIC)
+			check_plausibility(device_name, CHECK_FS_EXIST, NULL);
 		exit(1);
 	}
 
diff --git a/misc/tune2fs.c b/misc/tune2fs.c
index c454b84..d17c8de 100644
--- a/misc/tune2fs.c
+++ b/misc/tune2fs.c
@@ -2575,6 +2575,8 @@ retry_open:
 			fprintf(stderr,
 				_("MMP block magic is bad. Try to fix it by "
 				  "running:\n'e2fsck -f %s'\n"), device_name);
+		else if (retval == EXT2_ET_BAD_MAGIC)
+			check_plausibility(device_name, CHECK_FS_EXIST, NULL);
 		else if (retval != EXT2_ET_MMP_FAILED)
 			fprintf(stderr, "%s",
 			     _("Couldn't find valid filesystem superblock.\n"));
diff --git a/tests/f_detect_xfs/expect b/tests/f_detect_xfs/expect
new file mode 100644
index 0000000..4ab6e5b
--- /dev/null
+++ b/tests/f_detect_xfs/expect
@@ -0,0 +1,25 @@
+*** e2fsck
+ext2fs_open2: Bad magic number in super-block
+../e2fsck/e2fsck: Superblock invalid, trying backup blocks...
+../e2fsck/e2fsck: Bad magic number in super-block while trying to open test.img
+
+The superblock could not be read or does not describe a valid ext2/ext3/ext4
+filesystem.  If the device is valid and it really contains an ext2/ext3/ext4
+filesystem (and not swap or ufs or something else), then the superblock
+is corrupt, and you might try running e2fsck with an alternate superblock:
+    e2fsck -b 8193 <device>
+ or
+    e2fsck -b 32768 <device>
+
+test.img contains a xfs file system labelled 'test_filsys'
+*** debugfs
+test.img: Bad magic number in super-block while opening filesystem
+test.img contains a xfs file system labelled 'test_filsys'
+*** tune2fs
+../misc/tune2fs: Bad magic number in super-block while trying to open test.img
+test.img contains a xfs file system labelled 'test_filsys'
+*** mke2fs
+Creating filesystem with 16384 1k blocks and 4096 inodes
+Superblock backups stored on blocks: 
+	8193
+
diff --git a/tests/f_detect_xfs/expect.nodebugfs b/tests/f_detect_xfs/expect.nodebugfs
new file mode 100644
index 0000000..d3b7935
--- /dev/null
+++ b/tests/f_detect_xfs/expect.nodebugfs
@@ -0,0 +1,23 @@
+*** e2fsck
+ext2fs_open2: Bad magic number in super-block
+../e2fsck/e2fsck: Superblock invalid, trying backup blocks...
+../e2fsck/e2fsck: Bad magic number in super-block while trying to open test.img
+
+The superblock could not be read or does not describe a valid ext2/ext3/ext4
+filesystem.  If the device is valid and it really contains an ext2/ext3/ext4
+filesystem (and not swap or ufs or something else), then the superblock
+is corrupt, and you might try running e2fsck with an alternate superblock:
+    e2fsck -b 8193 <device>
+ or
+    e2fsck -b 32768 <device>
+
+test.img contains a xfs file system labelled 'test_filsys'
+*** debugfs
+*** tune2fs
+../misc/tune2fs: Bad magic number in super-block while trying to open test.img
+test.img contains a xfs file system labelled 'test_filsys'
+*** mke2fs
+Creating filesystem with 16384 1k blocks and 4096 inodes
+Superblock backups stored on blocks: 
+	8193
+
diff --git a/tests/f_detect_xfs/image.bz2 b/tests/f_detect_xfs/image.bz2
new file mode 100644
index 0000000000000000000000000000000000000000..9dc5e44b57c25d68cf46059439245b4f02c0d8f7
GIT binary patch
literal 450
zcmV;z0X_agT4*^jL0KkKS;zj9s{#Zw|NsB~#OOgr144bF4*%07Okh9|5J8R*M73Zb
zK|paT0LiccmH^Pypa1{>&;S9TG|&J505lI%O$b6qr~^O%00000000000E$VZo~DM7
z)6z6F0BC5@u#i7g(8sDakUd6a2AL1Yfk;si5lUbx4#)^}>SV#PqKFOtoDKj&Bt)hQ
zk8_=H64EPjfmeEEP!>VI3Wp>j5<jw30Z|+<sEP=MUzj#;_cN!HS8U^-qR}P6hp9jn
zLrf{B2`xD)w`h`x0sse{6@?1@dwsIeN}o;AbQ95&D0U&9w2%V^w1hwq06CC=9fAQ2
zP@;%{0i_ezHRm%dY`}J5!GbeQt8|LE{1OKI0c-aqxp0C}hF6I9l<yR0!0K)i=|X40
zSj32`7uBmRqBwZRbkQf93>eA_b0RNja^f}oxVPC=$Z%#-O00;l^DMLg0=q1-EQ{|I
z-UFP9{H}W`A0XuTloI=J@mt1a$(Fq?Qdym%5C9rM@4&KeGzZ~%chaW?2qK|iAweqy
sLXu;sl0e`{>cWa302YhTqK(sV6`W^c`qpYum;L`2az!{$kjMU$tCv{5XaE2J

literal 0
HcmV?d00001

diff --git a/tests/f_detect_xfs/name b/tests/f_detect_xfs/name
new file mode 100644
index 0000000..d5b9b82
--- /dev/null
+++ b/tests/f_detect_xfs/name
@@ -0,0 +1 @@
+detect xfs filesystem
diff --git a/tests/f_detect_xfs/script b/tests/f_detect_xfs/script
new file mode 100755
index 0000000..2531c5e
--- /dev/null
+++ b/tests/f_detect_xfs/script
@@ -0,0 +1,36 @@
+#!/bin/bash
+
+FSCK_OPT=-fn
+IMAGE=$test_dir/image.bz2
+
+bzip2 -d < $IMAGE > $TMPFILE
+
+# Run fsck to fix things?
+if [ -x $DEBUGFS_EXE ]; then
+	EXP=$test_dir/expect
+else
+	EXP=$test_dir/expect.nodebugfs
+fi
+OUT=$test_name.log
+rm -rf $test_name.failed $test_name.ok
+
+echo "*** e2fsck" > $OUT
+$FSCK $FSCK_OPT $TMPFILE >> $OUT 2>&1
+echo "*** debugfs" >> $OUT
+test -x $DEBUGFS_EXE && $DEBUGFS_EXE -R 'quit' $TMPFILE >> $OUT 2>&1
+echo "*** tune2fs" >> $OUT
+$TUNE2FS -i 0 $TMPFILE >> $OUT 2>&1
+echo "*** mke2fs" >> $OUT
+$MKE2FS -n $TMPFILE >> $OUT 2>&1
+
+sed -f $cmd_dir/filter.sed -e "s|$TMPFILE|test.img|g" -i $OUT
+
+# Figure out what happened
+if cmp -s $EXP $OUT; then
+	echo "$test_name: $test_description: ok"
+	touch $test_name.ok
+else
+	echo "$test_name: $test_description: failed"
+	diff -u $EXP $OUT >> $test_name.failed
+fi
+unset EXP OUT FSCK_OPT IMAGE



* [PATCH 14/34] misc: use libmagic when libblkid can't identify something
  2014-09-13 22:11 [PATCH 00/34] e2fsprogs Summer 2014 patchbomb, part 6 Darrick J. Wong
                   ` (12 preceding siblings ...)
  2014-09-13 22:12 ` [PATCH 13/34] misc: add plausibility checks to debugfs/tune2fs/dumpe2fs/e2fsck Darrick J. Wong
@ 2014-09-13 22:12 ` Darrick J. Wong
  2014-09-21  5:29   ` Theodore Ts'o
  2014-09-13 22:12 ` [PATCH 15/34] libext2fs: support BLKZEROOUT/FALLOC_FL_ZERO_RANGE in ext2fs_zero_blocks Darrick J. Wong
                   ` (19 subsequent siblings)
  33 siblings, 1 reply; 67+ messages in thread
From: Darrick J. Wong @ 2014-09-13 22:12 UTC (permalink / raw)
  To: tytso, darrick.wong; +Cc: linux-ext4

If we're using check_plausibility() to try to identify something that
obviously isn't an ext* filesystem and libblkid doesn't know what it
is, try libmagic instead.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
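Note for readers: the libmagic fallback boils down to "open a magic
cookie, load the default database, ask for a description of the path,
and ignore the uninformative answers 'data' and 'empty'". The sketch
below restates that flow with the failure paths checked: magic_open()
returns NULL on failure and magic_load() returns -1 on failure. It is an
illustration, not the code in this patch; the helper names
magic_msg_is_useful() and describe_with_magic() are invented here, and
the libmagic calls are compiled only when HAVE_MAGIC_H is defined, as in
the patch.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#ifdef HAVE_MAGIC_H
#include <magic.h>
#endif

/* Return 1 if a libmagic description is worth showing to the user. */
int magic_msg_is_useful(const char *msg)
{
	return msg && strcmp(msg, "data") != 0 && strcmp(msg, "empty") != 0;
}

/*
 * Hypothetical helper: print what libmagic thinks `device' contains.
 * Returns 1 if something was printed, 0 otherwise (including when the
 * build has no libmagic support).
 */
int describe_with_magic(const char *device)
{
	int printed = 0;

	assert(device != NULL);
#ifdef HAVE_MAGIC_H
	magic_t mag = magic_open(MAGIC_RAW | MAGIC_SYMLINK | MAGIC_DEVICES |
				 MAGIC_ERROR | MAGIC_NO_CHECK_ELF |
				 MAGIC_NO_CHECK_COMPRESS);
	if (mag == NULL)
		return 0;
	if (magic_load(mag, NULL) == 0) {
		/* msg points into the cookie; use it before magic_close() */
		const char *msg = magic_file(mag, device);

		if (magic_msg_is_useful(msg)) {
			printf("%s contains a `%s'\n", device, msg);
			printed = 1;
		}
	}
	magic_close(mag);
#endif
	return printed;
}
```

Splitting the string filter out of the libmagic calls keeps the "is
this worth printing" decision testable on systems without libmagic.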
 MCONFIG.in                           |    1 +
 configure                            |   54 ++++++++++++++++++++++++++++++++++
 configure.in                         |    6 ++++
 debugfs/Makefile.in                  |    4 +--
 e2fsck/Makefile.in                   |    4 +--
 lib/config.h.in                      |    3 ++
 misc/Makefile.in                     |   23 ++++++++------
 misc/plausible.c                     |   22 ++++++++++++++
 tests/f_detect_junk/expect           |   25 ++++++++++++++++
 tests/f_detect_junk/expect.nodebugfs |   23 ++++++++++++++
 tests/f_detect_junk/image.bz2        |  Bin
 tests/f_detect_junk/name             |    1 +
 tests/f_detect_junk/script           |   43 +++++++++++++++++++++++++++
 13 files changed, 195 insertions(+), 14 deletions(-)
 create mode 100644 tests/f_detect_junk/expect
 create mode 100644 tests/f_detect_junk/expect.nodebugfs
 create mode 100644 tests/f_detect_junk/image.bz2
 create mode 100644 tests/f_detect_junk/name
 create mode 100755 tests/f_detect_junk/script


diff --git a/MCONFIG.in b/MCONFIG.in
index 2a5055f..4751176 100644
--- a/MCONFIG.in
+++ b/MCONFIG.in
@@ -114,6 +114,7 @@ LIBCOM_ERR = $(LIB)/libcom_err@LIB_EXT@ @PRIVATE_LIBS_CMT@ @SEM_INIT_LIB@
 LIBE2P = $(LIB)/libe2p@LIB_EXT@
 LIBEXT2FS = $(LIB)/libext2fs@LIB_EXT@
 LIBUUID = @LIBUUID@ @SOCKET_LIB@
+LIBMAGIC = @MAGIC_LIB@
 LIBQUOTA = @STATIC_LIBQUOTA@
 LIBBLKID = @LIBBLKID@ @PRIVATE_LIBS_CMT@ $(LIBUUID)
 LIBINTL = @LIBINTL@
diff --git a/configure b/configure
index 0ea5fc5..ac2fba0 100755
--- a/configure
+++ b/configure
@@ -643,6 +643,7 @@ CYGWIN_CMT
 LINUX_CMT
 UNI_DIFF_OPTS
 SEM_INIT_LIB
+MAGIC_LIB
 SOCKET_LIB
 SIZEOF_OFF_T
 SIZEOF_LONG_LONG
@@ -13125,6 +13126,59 @@ if test "x$ac_cv_lib_socket_socket" = xyes; then :
 fi
 
 
+{ $as_echo "$as_me:${as_lineno-$LINENO}: checking for magic_file in -lmagic" >&5
+$as_echo_n "checking for magic_file in -lmagic... " >&6; }
+if ${ac_cv_lib_magic_magic_file+:} false; then :
+  $as_echo_n "(cached) " >&6
+else
+  ac_check_lib_save_LIBS=$LIBS
+LIBS="-lmagic  $LIBS"
+cat confdefs.h - <<_ACEOF >conftest.$ac_ext
+/* end confdefs.h.  */
+
+/* Override any GCC internal prototype to avoid an error.
+   Use char because int might match the return type of a GCC
+   builtin and then its argument prototype would still apply.  */
+#ifdef __cplusplus
+extern "C"
+#endif
+char magic_file ();
+int
+main ()
+{
+return magic_file ();
+  ;
+  return 0;
+}
+_ACEOF
+if ac_fn_c_try_link "$LINENO"; then :
+  ac_cv_lib_magic_magic_file=yes
+else
+  ac_cv_lib_magic_magic_file=no
+fi
+rm -f core conftest.err conftest.$ac_objext \
+    conftest$ac_exeext conftest.$ac_ext
+LIBS=$ac_check_lib_save_LIBS
+fi
+{ $as_echo "$as_me:${as_lineno-$LINENO}: result: $ac_cv_lib_magic_magic_file" >&5
+$as_echo "$ac_cv_lib_magic_magic_file" >&6; }
+if test "x$ac_cv_lib_magic_magic_file" = xyes; then :
+  MAGIC_LIB=-lmagic
+for ac_header in magic.h
+do :
+  ac_fn_c_check_header_mongrel "$LINENO" "magic.h" "ac_cv_header_magic_h" "$ac_includes_default"
+if test "x$ac_cv_header_magic_h" = xyes; then :
+  cat >>confdefs.h <<_ACEOF
+#define HAVE_MAGIC_H 1
+_ACEOF
+
+fi
+
+done
+
+fi
+
+
 { $as_echo "$as_me:${as_lineno-$LINENO}: checking for optreset" >&5
 $as_echo_n "checking for optreset... " >&6; }
 if ${ac_cv_have_optreset+:} false; then :
diff --git a/configure.in b/configure.in
index 5106f96..98dca5e 100644
--- a/configure.in
+++ b/configure.in
@@ -1146,6 +1146,12 @@ SOCKET_LIB=''
 AC_CHECK_LIB(socket, socket, [SOCKET_LIB=-lsocket])
 AC_SUBST(SOCKET_LIB)
 dnl
+dnl See if libmagic exists
+dnl
+AC_CHECK_LIB(magic, magic_file, [MAGIC_LIB=-lmagic
+AC_CHECK_HEADERS([magic.h])])
+AC_SUBST(MAGIC_LIB)
+dnl
 dnl See if optreset exists
 dnl
 AC_MSG_CHECKING(for optreset)
diff --git a/debugfs/Makefile.in b/debugfs/Makefile.in
index b33f73b..f6eae6c 100644
--- a/debugfs/Makefile.in
+++ b/debugfs/Makefile.in
@@ -37,13 +37,13 @@ SRCS= debug_cmds.c $(srcdir)/debugfs.c $(srcdir)/util.c $(srcdir)/ls.c \
 	$(srcdir)/../misc/plausible.c
 
 LIBS= $(LIBQUOTA) $(LIBEXT2FS) $(LIBE2P) $(LIBSS) $(LIBCOM_ERR) $(LIBBLKID) \
-	$(LIBUUID) $(SYSLIBS) $(LIBINTL)
+	$(LIBUUID) $(SYSLIBS) $(LIBINTL) $(LIBMAGIC)
 DEPLIBS= $(DEPLIBQUOTA) $(LIBEXT2FS) $(LIBE2P) $(DEPLIBSS) $(DEPLIBCOM_ERR) \
 	$(DEPLIBBLKID) $(DEPLIBUUID)
 
 STATIC_LIBS= $(STATIC_LIBQUOTA) $(STATIC_LIBEXT2FS) $(STATIC_LIBSS) \
 	$(STATIC_LIBCOM_ERR) $(STATIC_LIBBLKID) $(STATIC_LIBUUID) \
-	$(STATIC_LIBE2P) $(SYSLIBS) $(LIBINTL)
+	$(STATIC_LIBE2P) $(SYSLIBS) $(LIBINTL) $(LIBMAGIC)
 STATIC_DEPLIBS= $(STATIC_LIBEXT2FS) $(DEPSTATIC_LIBSS) \
 		$(DEPSTATIC_LIBCOM_ERR) $(DEPSTATIC_LIBUUID) \
 		$(DEPSTATIC_LIBE2P)
diff --git a/e2fsck/Makefile.in b/e2fsck/Makefile.in
index 1afd15f..5c40fee 100644
--- a/e2fsck/Makefile.in
+++ b/e2fsck/Makefile.in
@@ -16,13 +16,13 @@ MANPAGES=	e2fsck.8
 FMANPAGES=	e2fsck.conf.5
 
 LIBS= $(LIBQUOTA) $(LIBEXT2FS) $(LIBCOM_ERR) $(LIBBLKID) $(LIBUUID) \
-	$(LIBINTL) $(LIBE2P) $(SYSLIBS)
+	$(LIBINTL) $(LIBE2P) $(SYSLIBS) $(LIBMAGIC)
 DEPLIBS= $(DEPLIBQUOTA) $(LIBEXT2FS) $(DEPLIBCOM_ERR) $(DEPLIBBLKID) \
 	 $(DEPLIBUUID) $(DEPLIBE2P)
 
 STATIC_LIBS= $(STATIC_LIBQUOTA) $(STATIC_LIBEXT2FS) $(STATIC_LIBCOM_ERR) \
 	     $(STATIC_LIBBLKID) $(STATIC_LIBUUID) $(LIBINTL) $(STATIC_LIBE2P) \
-	     $(SYSLIBS)
+	     $(SYSLIBS) $(LIBMAGIC)
 STATIC_DEPLIBS= $(DEPSTATIC_LIBQUOTA) $(STATIC_LIBEXT2FS) \
 		$(DEPSTATIC_LIBCOM_ERR) $(DEPSTATIC_LIBBLKID) \
 		$(DEPSTATIC_LIBUUID) $(DEPSTATIC_LIBE2P)
diff --git a/lib/config.h.in b/lib/config.h.in
index be8f976..92fca3e 100644
--- a/lib/config.h.in
+++ b/lib/config.h.in
@@ -265,6 +265,9 @@
 /* Define to 1 if lseek64 declared in unistd.h */
 #undef HAVE_LSEEK64_PROTOTYPE
 
+/* Define to 1 if you have the <magic.h> header file. */
+#undef HAVE_MAGIC_H
+
 /* Define to 1 if you have the `mallinfo' function. */
 #undef HAVE_MALLINFO
 
diff --git a/misc/Makefile.in b/misc/Makefile.in
index bdeaa49..d19908f 100644
--- a/misc/Makefile.in
+++ b/misc/Makefile.in
@@ -165,14 +165,14 @@ tune2fs: $(TUNE2FS_OBJS) $(DEPLIBS) $(DEPLIBS_E2P) $(DEPLIBBLKID) \
 	$(E) "	LD $@"
 	$(Q) $(CC) $(ALL_LDFLAGS) -o tune2fs $(TUNE2FS_OBJS) $(LIBS) \
 		$(LIBBLKID) $(LIBUUID) $(LIBQUOTA) $(LIBEXT2FS) $(LIBS_E2P) \
-		$(LIBINTL) $(SYSLIBS) $(LIBBLKID)
+		$(LIBINTL) $(SYSLIBS) $(LIBBLKID) $(LIBMAGIC)
 
 tune2fs.static: $(TUNE2FS_OBJS) $(STATIC_DEPLIBS) $(STATIC_LIBE2P) $(DEPSTATIC_LIBBLKID)
 	$(E) "	LD $@"
 	$(Q) $(CC) $(LDFLAGS_STATIC) -o tune2fs.static $(TUNE2FS_OBJS) \
 		$(STATIC_LIBS) $(STATIC_LIBBLKID) $(STATIC_LIBUUID) \
 		$(STATIC_LIBQUOTA) $(STATIC_LIBE2P) $(LIBINTL) $(SYSLIBS) \
-		$(STATIC_LIBBLKID)
+		$(STATIC_LIBBLKID) $(LIBMAGIC)
 
 tune2fs.profiled: $(TUNE2FS_OBJS) $(PROFILED_DEPLIBS) \
 		$(PROFILED_E2P) $(DEPPROFILED_LIBBLKID) $(DEPPROFILED_LIBUUID) \
@@ -181,7 +181,8 @@ tune2fs.profiled: $(TUNE2FS_OBJS) $(PROFILED_DEPLIBS) \
 	$(Q) $(CC) $(ALL_LDFLAGS) -g -pg -o tune2fs.profiled \
 		$(PROFILED_TUNE2FS_OBJS) $(PROFILED_LIBBLKID) \
 		$(PROFILED_LIBUUID) $(PROFILED_LIBQUOTA) $(PROFILED_LIBE2P) \
-		$(LIBINTL) $(PROFILED_LIBS) $(SYSLIBS) $(PROFILED_LIBBLKID)
+		$(LIBINTL) $(PROFILED_LIBS) $(SYSLIBS) $(PROFILED_LIBBLKID) \
+		$(PROFILED_LIBMAGIC)
 
 blkid: $(BLKID_OBJS) $(DEPLIBBLKID) $(LIBEXT2FS)
 	$(E) "	LD $@"
@@ -202,13 +203,13 @@ blkid.profiled: $(BLKID_OBJS) $(DEPPROFILED_LIBBLKID) \
 e2image: $(E2IMAGE_OBJS) $(DEPLIBS) $(DEPLIBBLKID)
 	$(E) "	LD $@"
 	$(Q) $(CC) $(ALL_LDFLAGS) -o e2image $(E2IMAGE_OBJS) $(LIBS) \
-		$(LIBINTL) $(SYSLIBS) $(LIBBLKID)
+		$(LIBINTL) $(SYSLIBS) $(LIBBLKID) $(LIBMAGIC)
 
 e2image.profiled: $(E2IMAGE_OBJS) $(PROFILED_DEPLIBS) $(DEPLIBBLKID)
 	$(E) "	LD $@"
 	$(Q) $(CC) $(ALL_LDFLAGS) -g -pg -o e2image.profiled \
 		$(PROFILED_E2IMAGE_OBJS) $(PROFILED_LIBS) $(LIBINTL) $(SYSLIBS) \
-		$(LIBBLKID)
+		$(LIBBLKID) $(LIBMAGIC)
 
 e2undo: $(E2UNDO_OBJS) $(DEPLIBS)
 	$(E) "	LD $@"
@@ -249,14 +250,15 @@ mke2fs: $(MKE2FS_OBJS) $(DEPLIBS) $(LIBE2P) $(DEPLIBBLKID) $(DEPLIBUUID) \
 	$(E) "	LD $@"
 	$(Q) $(CC) $(ALL_LDFLAGS) -o mke2fs $(MKE2FS_OBJS) $(LIBS) $(LIBBLKID) \
 		$(LIBUUID) $(LIBQUOTA) $(LIBEXT2FS) $(LIBE2P) $(LIBINTL) \
-		$(SYSLIBS)
+		$(SYSLIBS) $(LIBMAGIC)
 
 mke2fs.static: $(MKE2FS_OBJS) $(STATIC_DEPLIBS) $(STATIC_LIBE2P) $(DEPSTATIC_LIBUUID) \
 		$(DEPSTATIC_LIBQUOTA) $(DEPSTATIC_LIBBLKID)
 	$(E) "	LD $@"
 	$(Q) $(CC) $(ALL_LDFLAGS) -static -o mke2fs.static $(MKE2FS_OBJS) \
 		$(STATIC_LIBQUOTA) $(STATIC_LIBS) $(STATIC_LIBE2P) \
-		$(STATIC_LIBBLKID) $(STATIC_LIBUUID) $(LIBINTL) $(SYSLIBS)
+		$(STATIC_LIBBLKID) $(STATIC_LIBUUID) $(LIBINTL) $(SYSLIBS) \
+		$(LIBMAGIC)
 
 mke2fs.profiled: $(MKE2FS_OBJS) $(PROFILED_DEPLIBS) \
 	$(PROFILED_LIBE2P) $(PROFILED_DEPLIBBLKID) $(PROFILED_DEPLIBUUID) \
@@ -265,7 +267,7 @@ mke2fs.profiled: $(MKE2FS_OBJS) $(PROFILED_DEPLIBS) \
 	$(Q) $(CC) $(ALL_LDFLAGS) -g -pg -o mke2fs.profiled \
 		$(PROFILED_MKE2FS_OBJS) $(PROFILED_LIBBLKID) \
 		$(PROFILED_LIBUUID) $(PROFILED_LIBQUOTA) $(PROFILED_LIBE2P) \
-		$(LIBINTL) $(PROFILED_LIBS) $(SYSLIBS)
+		$(LIBINTL) $(PROFILED_LIBS) $(SYSLIBS) $(LIBMAGIC)
 
 chattr: $(CHATTR_OBJS) $(DEPLIBS_E2P)
 	$(E) "	LD $@"
@@ -301,7 +303,8 @@ uuidd.profiled: $(UUIDD_OBJS) $(PROFILED_DEPLIBUUID)
 dumpe2fs: $(DUMPE2FS_OBJS) $(DEPLIBS) $(DEPLIBS_E2P) $(DEPLIBUUID) $(DEPLIBBLKID)
 	$(E) "	LD $@"
 	$(Q) $(CC) $(ALL_LDFLAGS) -o dumpe2fs $(DUMPE2FS_OBJS) $(LIBS) \
-		$(LIBS_E2P) $(LIBUUID) $(LIBINTL) $(SYSLIBS) $(LIBBLKID)
+		$(LIBS_E2P) $(LIBUUID) $(LIBINTL) $(SYSLIBS) $(LIBBLKID) \
+		$(LIBMAGIC)
 
 dumpe2fs.profiled: $(DUMPE2FS_OBJS) $(PROFILED_DEPLIBS) \
 		$(PROFILED_LIBE2P) $(PROFILED_DEPLIBUUID) $(PROFILED_DEPLIBBLKID)
@@ -309,7 +312,7 @@ dumpe2fs.profiled: $(DUMPE2FS_OBJS) $(PROFILED_DEPLIBS) \
 	$(Q) $(CC) $(ALL_LDFLAGS) -g -pg -o dumpe2fs.profiled \
 		$(PROFILED_DUMPE2FS_OBJS) $(PROFILED_LIBS) \
 		$(PROFILED_LIBE2P) $(PROFILED_LIBUUID) $(LIBINTL) $(SYSLIBS) \
-		$(PROFILED_LIBBLKID)
+		$(PROFILED_LIBBLKID) $(PROFILED_LIBMAGIC)
 
 fsck: $(FSCK_OBJS) $(DEPLIBBLKID)
 	$(E) "	LD $@"
diff --git a/misc/plausible.c b/misc/plausible.c
index 2768e4b..caeb929 100644
--- a/misc/plausible.c
+++ b/misc/plausible.c
@@ -28,6 +28,9 @@
 #ifdef HAVE_UNISTD_H
 #include <unistd.h>
 #endif
+#ifdef HAVE_MAGIC_H
+#include <magic.h>
+#endif
 #include "plausible.h"
 #include "ext2fs/ext2fs.h"
 #include "nls-enable.h"
@@ -194,6 +197,25 @@ int check_plausibility(const char *device, int flags, int *ret_is_dev)
 		return 0;
 	}
 
+#ifdef HAVE_MAGIC_H
+	if (flags & CHECK_FS_EXIST) {
+		const char *msg;
+		magic_t mag;
+
+		mag = magic_open(MAGIC_RAW | MAGIC_SYMLINK | MAGIC_DEVICES |
+				 MAGIC_ERROR | MAGIC_NO_CHECK_ELF |
+				 MAGIC_NO_CHECK_COMPRESS);
+		magic_load(mag, NULL);
+
+		msg = magic_file(mag, device);
+		if (msg && strcmp(msg, "data") && strcmp(msg, "empty"))
+			printf(_("%s contains a `%s'\n"), device, msg);
+
+		magic_close(mag);
+		return 0;
+	}
+#endif
+
 	ret = check_partition_table(device);
 	if (ret >= 0)
 		return ret;
diff --git a/tests/f_detect_junk/expect b/tests/f_detect_junk/expect
new file mode 100644
index 0000000..57f7f89
--- /dev/null
+++ b/tests/f_detect_junk/expect
@@ -0,0 +1,25 @@
+*** e2fsck
+ext2fs_open2: Bad magic number in super-block
+../e2fsck/e2fsck: Superblock invalid, trying backup blocks...
+../e2fsck/e2fsck: Bad magic number in super-block while trying to open test.img
+
+The superblock could not be read or does not describe a valid ext2/ext3/ext4
+filesystem.  If the device is valid and it really contains an ext2/ext3/ext4
+filesystem (and not swap or ufs or something else), then the superblock
+is corrupt, and you might try running e2fsck with an alternate superblock:
+    e2fsck -b 8193 <device>
+ or
+    e2fsck -b 32768 <device>
+
+test.img contains a `PNG image data, 148 x 31, 8-bit/color RGBA, non-interlaced'
+*** debugfs
+test.img: Bad magic number in super-block while opening filesystem
+test.img contains a `PNG image data, 148 x 31, 8-bit/color RGBA, non-interlaced'
+*** tune2fs
+../misc/tune2fs: Bad magic number in super-block while trying to open test.img
+test.img contains a `PNG image data, 148 x 31, 8-bit/color RGBA, non-interlaced'
+*** mke2fs
+Creating filesystem with 16384 1k blocks and 4096 inodes
+Superblock backups stored on blocks: 
+	8193
+
diff --git a/tests/f_detect_junk/expect.nodebugfs b/tests/f_detect_junk/expect.nodebugfs
new file mode 100644
index 0000000..d9281a0
--- /dev/null
+++ b/tests/f_detect_junk/expect.nodebugfs
@@ -0,0 +1,23 @@
+*** e2fsck
+ext2fs_open2: Bad magic number in super-block
+../e2fsck/e2fsck: Superblock invalid, trying backup blocks...
+../e2fsck/e2fsck: Bad magic number in super-block while trying to open test.img
+
+The superblock could not be read or does not describe a valid ext2/ext3/ext4
+filesystem.  If the device is valid and it really contains an ext2/ext3/ext4
+filesystem (and not swap or ufs or something else), then the superblock
+is corrupt, and you might try running e2fsck with an alternate superblock:
+    e2fsck -b 8193 <device>
+ or
+    e2fsck -b 32768 <device>
+
+test.img contains a `PNG image data, 148 x 31, 8-bit/color RGBA, non-interlaced'
+*** debugfs
+*** tune2fs
+../misc/tune2fs: Bad magic number in super-block while trying to open test.img
+test.img contains a `PNG image data, 148 x 31, 8-bit/color RGBA, non-interlaced'
+*** mke2fs
+Creating filesystem with 16384 1k blocks and 4096 inodes
+Superblock backups stored on blocks: 
+	8193
+
diff --git a/tests/f_detect_junk/image.bz2 b/tests/f_detect_junk/image.bz2
new file mode 100644
index 0000000000000000000000000000000000000000..3d52600d67d5f9e3212f6e6b5b8bbc918343fd41
GIT binary patch
literal 2504
zcmV;(2{-maT4*^jL0KkKS;2=vM*sq1|NsC0|NsC0|NsC0|NsC0|NsC0|NsC0|NsC0
z|NsC0|Nqbhp7Y&v-P@kL*Iln}s&(RfBC4N|O&TL<lTAzsw3tVzWWa`MF$@H0fSNX-
zG{8npFh+*d(djZ|JtmkFX&Pc;G{R)lPe{h5O*WHiVtP+V$?9q9WHB~@GBZTTRa5;F
zMGrweG))Xm2dR*0lSZDC)Or!>H1!%Z&<2AIO{m(Knq<kQK*V~QWNky#d8qYF)EPYy
zw2wj<CKC{19)x;oG|*|ICL_{mqfFGq45z87<R$<k6HJ&ij5R$IYBb4{0BC45(Kd!r
zqf7!adTBCYJb*(0j0uRwf-*4}g9tXGGzpUllhS!j7y@97OaWC-+J=<!YI=`NCzQ=k
z1u>>m(t4VHpo33S$*2sTpxGl$q|;3@2a;_;rqt6Xsp@F-ngd3f2dSe&CXYs@nKS?c
z)b$%kWY7%&VKNN>Xfk9Xr|ATlsgVOC&`qe*K-m*O&@v4g42&bx4Ff|%MuDNGOn?my
z4U%S{dYJ&o&@wV)c@xyr)YBu>Xwzto0MKctq&-G}1mK_UB$Y<xEPX(pa~Q`GI|=9H
z#IO(r;&>ay0rEh+-JfHO1(H{m*9547vUX!6=na6a35E<TmJN4}Cq1Y~j2Zy&^_M`W
zi!CLWdmLEBe)R<q&z3JSg(_l?ud+^_Jpgf0{cd}mZafG~LZP`3>q7O?^`80SF=_%B
zJ1O-$BmWbOEC@s&3rH-`KrQP`*R<1`d%s0l73cuu#vp;4Vj>0I=@_^~4%jOPvPj_6
z7`GL6J5SXLk*cPvU0ygC6lsF-^4}6IcoqwYRKH=Pd2wJ45DeA`Z|HqeR>+1XZInpd
z2%nDHaPTEd-uJOdPg;(cen|KU(=jwPu5sbh2%!87!`XCS@86l<8rlOI#J8}FNQKj-
zXVfP7mLiH-86r_%<WWC-bNj1b_p|6Kp`h`bklN<Q7}Q~%{z6#u&v^<aO0>%LRFGk{
zpfu^C2PJHj8!wzJnZz0Xy?VJ<IERwYV#^}o3>d45%}n(ls0*nqlE=*w44zqp+W;ho
zKrnPvG~gA&1t7;HSl`DX>Uh@HEMq#M$oyPE)elr5lP)p!aN`{qtXqWI?3rOM+m$5*
zIiK&wEV@`VR-D-BRf~~f1_9hZq@Ls#fCv?;<}Z{ylM3WGV$GES*~6xqGZGwtvy(B9
z^$TQ=@Jfh?jg7<IBWK#~k7eG?hTL0^f)Z8r7Hi#puEdlCbHF?0bA$Ext-&sVWQfBZ
zRw`|f)SSdVOzS>)X_ua`r8&6-I}@r?we%_S%0*EE?$QwNm6cN#dv<Zbk}?V&u3fo>
zqEmJRi&CyE#!>2G5X+-4wFj%Md;vShF_kK!y{Q`%GfU^?*4VFfV&0Yly5l;hcouY!
z?%CjVJe*Iox#Az3NH+r=`|ly>Fa?Jz&S8}e0ZOmRFWdzaGz`l(&Tgcx_zdh>RXAd5
z)CbVQ^64nUq}Zs^JHaq#oZr}ql6GX%t0uE9<+%-jzo3Lf=Xi*+i!Nep21NvDr@B$v
z#jmu0n|RA><A6;h20TNwN-nieA0>OGSp2#pj!AWpR*QXv({Or$gJ{{@^SpJ^t^u~J
zXzK60#^p$#f1OA0C@N@4mI9or*wPi_LL&}@g6Z@Yx?*2ZCgQ&@9h>a<(@1iPyh>io
z)|XK)o3o0Vc16-Brl=~bhM!$u+O@N*Bij#r2qfGlyvb*THkLSytB43&!DfB(W!rDA
zxrZ6|V@rg8MP;NVgzL)`MU_z}Yut&$F-1wXbMT7CO{vswIO|(ilcq_Lo|v$J`NlPO
zZ%%QMxdZITdNc<!8h_0>A9}i6%oQQqZR8r&uGBf1M~g<gWr)Dk|9x<`um(J$ph7hE
zavh<k6jDCXIQgY2aH*a}kJHpmmID;Ce4a&eN)w$we-9PBiW39I0%x6#F2$Pn$AsGR
z=nCu|)vP2<K*I{MRS8_mWZ=<v*31owSy%tf8_^Hq`ZAMy^zFWF2vE|*3ZX9ps7M!!
zgCvPh2ZYF~XtpxjZm6QWfKM<2Qp!ZrNn4#_ikTi#3bm_@NZVQ$+C0ZR+QvbF+<!C}
zbxe<fEo-*hM%X)CzZ0H2<f{Y1J9guea#QzpSA)p@t)|UTYXs_btYFPQlnwvz?VIU2
zTQ?TLIT98}RwBJ0Ep^kyWMWdwJ)OhJjAYN4C@bUcupk=6{*58JWSgSo%ZtymQ!%?l
zCI~>ht2M}%n3gL#7Srt6dbZ?)ZsCokltYfAJ(xal^(d3i=M*qaaW|R8U)1y>I$&^h
zmabsLJHWci0~ON496@Ge@wCN5Wv@J0K{e&t349j?u^x>wGAh`IAkn>Q%ZighgKLZk
z=n(CPLgGFecp5;HdYf{P7p1-9+_%Njuo4-~s6i<?kR7F|png;XU6*de!C}5(a(fQ*
zeL9QppiE68P}YOTMqW96RY^yy<jii+>`eeh!ki9=Qq(pxlRISe&-F8kWoR>Kfd;L3
zAos9z%Hi7b<{BOe#;GBTz5NzW3(4rutfW7L`aUJDkU&|e^b-$I@73wV8yJ)$u6<!U
zse&>Vl^Y$5MO<WzK>T-IP@62C=QmH|$xv;ai6CjQ=-!&37Dj}cRl0Q&)TSFwri2_Z
zRBkj+UFaO%=|dQ9B;&yN5ziIzBiTW;s4Il`H6Q|$Ak(<9p^{ZX^>$DruV-Z9*9kkT
zW5ttX?^Li~ivhHB70$kxaDMRS<GQYbs2M6*h(v2&V5NcBQZX1J%g)8EE+{8IYo(Yq
zLbngjhj_GMMaOl9s;hf{NNWd3BY9EJF`-2N8Ie=KdfOgttET$3N8L*ZKL8`X^ZY09
zmd`>sl|d`Q@bJ7pZYIP@*p&#PNywUP`*LL(Mwsg;7wQrjH${J<odAG?8*_HMva5ju
z%eDiq-4Kuqdd3>ZhaRp*-b#ARHVZIk9d>kHpp5qTP^W}k!&sat2ZmtoQde0$0Fxm}
z-ik#clMkZqMqle4;K+7hDT<yp957j3QrY1kv(hDa{Jx&z6jc8&NNAw3sTouTvD_FB
z=mb#yxS&9^b35|~3S}$`y5^0O<uJw&5`ijsP@63+SJmRPn@8i;ecK{3aIo{7kk{c<
z8C;nZC^b#2UGa4)R2#7BWhm_#xrZR;jO7fBsfFogWfeyV+)MDF{@dGeXcA&;PuEQ`
zJxY#*i(Z}t)0bnH(WL46B^h7n4bRdG5v8U$UbC(&HPeJ~qcHO{mjMYu#sL|+j4?2^
SjgZZI;_gVN3KAGF=qTVdHMar)

literal 0
HcmV?d00001

diff --git a/tests/f_detect_junk/name b/tests/f_detect_junk/name
new file mode 100644
index 0000000..81cf655
--- /dev/null
+++ b/tests/f_detect_junk/name
@@ -0,0 +1 @@
+detect non-fs file data
diff --git a/tests/f_detect_junk/script b/tests/f_detect_junk/script
new file mode 100755
index 0000000..8409fdd
--- /dev/null
+++ b/tests/f_detect_junk/script
@@ -0,0 +1,43 @@
+#!/bin/bash
+
+if [ "$(grep -c 'define HAVE_MAGIC_H' $test_dir/../../lib/config.h)" -gt 0 ]; then
+
+FSCK_OPT=-fn
+IMAGE=$test_dir/image.bz2
+
+bzip2 -d < $IMAGE > $TMPFILE
+dd if=/dev/zero of=$TMPFILE conv=notrunc oflag=append bs=1024k count=16 > /dev/null 2>&1
+
+# Run fsck to fix things?
+if [ -x $DEBUGFS_EXE ]; then
+	EXP=$test_dir/expect
+else
+	EXP=$test_dir/expect.nodebugfs
+fi
+OUT=$test_name.log
+rm -rf $test_name.failed $test_name.ok
+
+echo "*** e2fsck" > $OUT
+$FSCK $FSCK_OPT $TMPFILE >> $OUT 2>&1
+echo "*** debugfs" >> $OUT
+test -x $DEBUGFS_EXE && $DEBUGFS_EXE -R 'quit' $TMPFILE >> $OUT 2>&1
+echo "*** tune2fs" >> $OUT
+$TUNE2FS -i 0 $TMPFILE >> $OUT 2>&1
+echo "*** mke2fs" >> $OUT
+$MKE2FS -n $TMPFILE >> $OUT 2>&1
+
+sed -f $cmd_dir/filter.sed -e "s|$TMPFILE|test.img|g" -i $OUT
+
+# Figure out what happened
+if cmp -s $EXP $OUT; then
+	echo "$test_name: $test_description: ok"
+	touch $test_name.ok
+else
+	echo "$test_name: $test_description: failed"
+	diff -u $EXP $OUT >> $test_name.failed
+fi
+unset EXP OUT FSCK_OPT IMAGE
+
+else #if HAVE_MAGIC_H
+	echo "$test_name: $test_description: skipped"
+fi



* [PATCH 15/34] libext2fs: support BLKZEROOUT/FALLOC_FL_ZERO_RANGE in ext2fs_zero_blocks
  2014-09-13 22:11 [PATCH 00/34] e2fsprogs Summer 2014 patchbomb, part 6 Darrick J. Wong
                   ` (13 preceding siblings ...)
  2014-09-13 22:12 ` [PATCH 14/34] misc: use libmagic when libblkid can't identify something Darrick J. Wong
@ 2014-09-13 22:12 ` Darrick J. Wong
  2014-09-22  2:51   ` Theodore Ts'o
                     ` (2 more replies)
  2014-09-13 22:12 ` [PATCH 16/34] libext2fs/e2fsck: refactor everyone who writes zero blocks to disk Darrick J. Wong
                   ` (18 subsequent siblings)
  33 siblings, 3 replies; 67+ messages in thread
From: Darrick J. Wong @ 2014-09-13 22:12 UTC (permalink / raw)
  To: tytso, darrick.wong; +Cc: linux-ext4

Plumb a new call into the IO manager to support translating
ext2fs_zero_blocks calls into the equivalent kernel-level BLKZEROOUT
ioctl or FALLOC_FL_ZERO_RANGE fallocate flag primitives when possible.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
 contrib/fallocate.c     |   14 +++++++++
 lib/ext2fs/ext2_io.h    |    7 ++++-
 lib/ext2fs/io_manager.c |   11 +++++++
 lib/ext2fs/mkjournal.c  |    6 ++++
 lib/ext2fs/test_io.c    |   21 ++++++++++++++
 lib/ext2fs/unix_io.c    |   71 +++++++++++++++++++++++++++++++++++++++++++++++
 6 files changed, 128 insertions(+), 2 deletions(-)

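Note (not part of the commit): the plumbing described above can be
sketched in isolation.  This is a minimal illustration, assuming an
in-memory "device" and hypothetical sketch_* names; it is not the
libext2fs API.  The manager exposes an optional zeroout hook, the
channel wrapper reports "unimplemented" when the hook is missing, and
the caller falls back to writing zeroes by hand.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for errcode_t values; not the real libext2fs codes. */
enum { SKETCH_OK = 0, SKETCH_UNIMPLEMENTED = 1 };

struct sketch_channel;

/* An I/O manager is a table of function pointers; zeroout may be NULL. */
struct sketch_manager {
	int (*zeroout)(struct sketch_channel *ch,
		       unsigned long long block, unsigned long long count);
};

struct sketch_channel {
	const struct sketch_manager *manager;
	unsigned char *dev;		/* in-memory stand-in for the device */
	size_t block_size;
	unsigned long long nblocks;
};

/* Dispatch to the manager's hook if it provides one. */
static int sketch_channel_zeroout(struct sketch_channel *ch,
				  unsigned long long block,
				  unsigned long long count)
{
	if (ch->manager->zeroout)
		return ch->manager->zeroout(ch, block, count);
	return SKETCH_UNIMPLEMENTED;
}

/* Try the fast path first; if it does not succeed, write zeroes by hand. */
static int sketch_zero_blocks(struct sketch_channel *ch,
			      unsigned long long block,
			      unsigned long long count)
{
	assert(block + count <= ch->nblocks);
	if (sketch_channel_zeroout(ch, block, count) == SKETCH_OK)
		return SKETCH_OK;
	memset(ch->dev + block * ch->block_size, 0, count * ch->block_size);
	return SKETCH_OK;
}
```

With a manager whose zeroout pointer is NULL, sketch_zero_blocks()
still zeroes the requested blocks through the fallback, which mirrors
how ext2fs_zero_blocks2() only returns early when the hook reports
success.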

diff --git a/contrib/fallocate.c b/contrib/fallocate.c
index 1f9b59a..01d4af7 100644
--- a/contrib/fallocate.c
+++ b/contrib/fallocate.c
@@ -36,6 +36,8 @@
 // #include <linux/falloc.h>
 #define FALLOC_FL_KEEP_SIZE	0x01
 #define FALLOC_FL_PUNCH_HOLE	0x02 /* de-allocates range */
+#define FALLOC_FL_COLLAPSE_RANGE	0x08
+#define FALLOC_FL_ZERO_RANGE		0x10
 
 void usage(void)
 {
@@ -95,7 +97,7 @@ int main(int argc, char **argv)
 	int	error;
 	int	tflag = 0;
 
-	while ((opt = getopt(argc, argv, "npl:o:t")) != -1) {
+	while ((opt = getopt(argc, argv, "npl:o:tzc")) != -1) {
 		switch(opt) {
 		case 'n':
 			/* do not change filesize */
@@ -106,6 +108,16 @@ int main(int argc, char **argv)
 			falloc_mode = (FALLOC_FL_PUNCH_HOLE |
 				       FALLOC_FL_KEEP_SIZE);
 			break;
+		case 'c':
+			/* collapse range mode */
+			falloc_mode = (FALLOC_FL_COLLAPSE_RANGE |
+				       FALLOC_FL_KEEP_SIZE);
+			break;
+		case 'z':
+			/* zero range mode */
+			falloc_mode = (FALLOC_FL_ZERO_RANGE |
+				       FALLOC_FL_KEEP_SIZE);
+			break;
 		case 'l':
 			length = cvtnum(optarg);
 			break;
diff --git a/lib/ext2fs/ext2_io.h b/lib/ext2fs/ext2_io.h
index 4c5a5c5..1faa720 100644
--- a/lib/ext2fs/ext2_io.h
+++ b/lib/ext2fs/ext2_io.h
@@ -93,7 +93,9 @@ struct struct_io_manager {
 	errcode_t (*cache_readahead)(io_channel channel,
 				     unsigned long long block,
 				     unsigned long long count);
-	long	reserved[15];
+	errcode_t (*zeroout)(io_channel channel, unsigned long long block,
+			     unsigned long long count);
+	long	reserved[14];
 };
 
 #define IO_FLAG_RW		0x0001
@@ -125,6 +127,9 @@ extern errcode_t io_channel_write_blk64(io_channel channel,
 extern errcode_t io_channel_discard(io_channel channel,
 				    unsigned long long block,
 				    unsigned long long count);
+extern errcode_t io_channel_zeroout(io_channel channel,
+				    unsigned long long block,
+				    unsigned long long count);
 extern errcode_t io_channel_alloc_buf(io_channel channel,
 				      int count, void *ptr);
 extern errcode_t io_channel_cache_readahead(io_channel io,
diff --git a/lib/ext2fs/io_manager.c b/lib/ext2fs/io_manager.c
index dc5888d..c395d61 100644
--- a/lib/ext2fs/io_manager.c
+++ b/lib/ext2fs/io_manager.c
@@ -112,6 +112,17 @@ errcode_t io_channel_discard(io_channel channel, unsigned long long block,
 	return EXT2_ET_UNIMPLEMENTED;
 }
 
+errcode_t io_channel_zeroout(io_channel channel, unsigned long long block,
+			     unsigned long long count)
+{
+	EXT2_CHECK_MAGIC(channel, EXT2_ET_MAGIC_IO_CHANNEL);
+
+	if (channel->manager->zeroout)
+		return (channel->manager->zeroout)(channel, block, count);
+
+	return EXT2_ET_UNIMPLEMENTED;
+}
+
 errcode_t io_channel_alloc_buf(io_channel io, int count, void *ptr)
 {
 	size_t	size;
diff --git a/lib/ext2fs/mkjournal.c b/lib/ext2fs/mkjournal.c
index 6f3a862..5be425c 100644
--- a/lib/ext2fs/mkjournal.c
+++ b/lib/ext2fs/mkjournal.c
@@ -164,6 +164,12 @@ errcode_t ext2fs_zero_blocks2(ext2_filsys fs, blk64_t blk, int num,
 		}
 		return 0;
 	}
+
+	/* Try a zero out command, if supported */
+	retval = io_channel_zeroout(fs->io, blk, num);
+	if (retval == 0)
+		return 0;
+
 	/* Allocate the zeroizing buffer if necessary */
 	if (!buf) {
 		buf = malloc(fs->blocksize * STRIDE_LENGTH);
diff --git a/lib/ext2fs/test_io.c b/lib/ext2fs/test_io.c
index b03a939..f7c50d1 100644
--- a/lib/ext2fs/test_io.c
+++ b/lib/ext2fs/test_io.c
@@ -86,6 +86,7 @@ void (*test_io_cb_write_byte)
 #define TEST_FLAG_SET_OPTION		0x20
 #define TEST_FLAG_DISCARD		0x40
 #define TEST_FLAG_READAHEAD		0x80
+#define TEST_FLAG_ZEROOUT		0x100
 
 static void test_dump_block(io_channel channel,
 			    struct test_private_data *data,
@@ -507,6 +508,25 @@ static errcode_t test_cache_readahead(io_channel channel,
 	return retval;
 }
 
+static errcode_t test_zeroout(io_channel channel, unsigned long long block,
+			      unsigned long long count)
+{
+	struct test_private_data *data;
+	errcode_t	retval = 0;
+
+	EXT2_CHECK_MAGIC(channel, EXT2_ET_MAGIC_IO_CHANNEL);
+	data = (struct test_private_data *) channel->private_data;
+	EXT2_CHECK_MAGIC(data, EXT2_ET_MAGIC_TEST_IO_CHANNEL);
+
+	if (data->real)
+		retval = io_channel_zeroout(data->real, block, count);
+	if (data->flags & TEST_FLAG_ZEROOUT)
+		fprintf(data->outfile,
+			"Test_io: zeroout(%llu, %llu) returned %s\n",
+			block, count, retval ? error_message(retval) : "OK");
+	return retval;
+}
+
 static struct struct_io_manager struct_test_manager = {
 	.magic		= EXT2_ET_MAGIC_IO_MANAGER,
 	.name		= "Test I/O Manager",
@@ -523,6 +543,7 @@ static struct struct_io_manager struct_test_manager = {
 	.write_blk64	= test_write_blk64,
 	.discard	= test_discard,
 	.cache_readahead	= test_cache_readahead,
+	.zeroout	= test_zeroout,
 };
 
 io_manager test_io_manager = &struct_test_manager;
diff --git a/lib/ext2fs/unix_io.c b/lib/ext2fs/unix_io.c
index 189adce..20e5b64 100644
--- a/lib/ext2fs/unix_io.c
+++ b/lib/ext2fs/unix_io.c
@@ -986,6 +986,76 @@ unimplemented:
 	return EXT2_ET_UNIMPLEMENTED;
 }
 
+#if defined(__linux__) && !defined(BLKZEROOUT)
+#define BLKZEROOUT		_IO(0x12, 127)
+#endif
+
+#if defined(__linux__) && !defined(FALLOC_FL_ZERO_RANGE)
+#define FALLOC_FL_ZERO_RANGE    0x10
+#endif
+
+static errcode_t unix_zeroout(io_channel channel, unsigned long long block,
+			      unsigned long long count)
+{
+	struct unix_private_data *data;
+	int		ret;
+
+	EXT2_CHECK_MAGIC(channel, EXT2_ET_MAGIC_IO_CHANNEL);
+	data = (struct unix_private_data *) channel->private_data;
+	EXT2_CHECK_MAGIC(data, EXT2_ET_MAGIC_UNIX_IO_CHANNEL);
+
+	if (getenv("UNIX_IO_NOZEROOUT"))
+		goto unimplemented;
+
+	if (channel->flags & CHANNEL_FLAGS_BLOCK_DEVICE) {
+#ifdef BLKZEROOUT
+		__u64 range[2];
+
+		range[0] = (__u64)(block) * channel->block_size;
+		range[1] = (__u64)(count) * channel->block_size;
+
+		ret = ioctl(data->dev, BLKZEROOUT, &range);
+#else
+		goto unimplemented;
+#endif
+	} else {
+#if defined(HAVE_FALLOCATE) && defined(FALLOC_FL_ZERO_RANGE)
+		int flag = FALLOC_FL_ZERO_RANGE;
+		struct stat statbuf;
+
+		/*
+		 * If we're trying to zero a range past the end of the file,
+		 * just use regular fallocate to get there, because zeroing
+		 * a range past EOF does not extend the file.
+		 */
+		ret = fstat(data->dev, &statbuf);
+		if (ret)
+			goto err;
+		if (statbuf.st_size < (block + count) * channel->block_size)
+			flag = 0;
+		/*
+		 * If we are not on block device, try to use the zero out
+		 * primitive.
+		 */
+		ret = fallocate(data->dev,
+				flag,
+				(off_t)(block) * channel->block_size,
+				(off_t)(count) * channel->block_size);
+#else
+		goto unimplemented;
+#endif
+	}
+err:
+	if (ret < 0) {
+		if (errno == EOPNOTSUPP)
+			goto unimplemented;
+		return errno;
+	}
+	return 0;
+unimplemented:
+	return EXT2_ET_UNIMPLEMENTED;
+}
+
 static struct struct_io_manager struct_unix_manager = {
 	.magic		= EXT2_ET_MAGIC_IO_MANAGER,
 	.name		= "Unix I/O Manager",
@@ -1002,6 +1072,7 @@ static struct struct_io_manager struct_unix_manager = {
 	.write_blk64	= unix_write_blk64,
 	.discard	= unix_discard,
 	.cache_readahead	= unix_cache_readahead,
+	.zeroout	= unix_zeroout,
 };
 
 io_manager unix_io_manager = &struct_unix_manager;


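Note: the regular-file branch of unix_zeroout() makes two decisions
before calling fallocate(): it converts the block run into a byte
range, and it drops FALLOC_FL_ZERO_RANGE when the run ends past EOF.
The following is a rough sketch of that planning step only, with
hypothetical sketch_* names and no system calls; it does the
arithmetic in unsigned 64-bit, which is an assumption of the sketch
rather than what the patch does.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_FALLOC_FL_ZERO_RANGE 0x10	/* same value the patch defines */

struct sketch_range {
	uint64_t offset;	/* byte offset of the first block */
	uint64_t length;	/* byte length of the run */
	int mode;		/* fallocate mode to request */
};

static struct sketch_range sketch_plan_zeroout(uint64_t file_size,
					       uint64_t block, uint64_t count,
					       uint32_t block_size)
{
	struct sketch_range r;

	assert(block_size > 0);
	/* Widen before multiplying so large block numbers do not wrap. */
	r.offset = block * (uint64_t)block_size;
	r.length = count * (uint64_t)block_size;
	/*
	 * If the run ends past EOF, follow the patch and request mode 0,
	 * because zeroing a range past EOF does not extend the file.
	 */
	if (file_size < r.offset + r.length)
		r.mode = 0;
	else
		r.mode = SKETCH_FALLOC_FL_ZERO_RANGE;
	return r;
}
```

For a 4096-byte file with 1024-byte blocks, blocks 0-3 would be
planned with the zero-range mode, while a run of four blocks starting
at block 2 would fall back to mode 0.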

* [PATCH 16/34] libext2fs/e2fsck: refactor everyone who writes zero blocks to disk
  2014-09-13 22:11 [PATCH 00/34] e2fsprogs Summer 2014 patchbomb, part 6 Darrick J. Wong
                   ` (14 preceding siblings ...)
  2014-09-13 22:12 ` [PATCH 15/34] libext2fs: support BLKZEROOUT/FALLOC_FL_ZERO_RANGE in ext2fs_zero_blocks Darrick J. Wong
@ 2014-09-13 22:12 ` Darrick J. Wong
  2014-10-13 10:09   ` Theodore Ts'o
  2014-09-13 22:13 ` [PATCH 17/34] libext2fs: support allocating uninit blocks in bmap2() Darrick J. Wong
                   ` (17 subsequent siblings)
  33 siblings, 1 reply; 67+ messages in thread
From: Darrick J. Wong @ 2014-09-13 22:12 UTC (permalink / raw)
  To: tytso, darrick.wong; +Cc: linux-ext4

Convert all call sites that write zero blocks to disk to use
ext2fs_zero_blocks2() since it can use Linux's zero out feature to do
the writes more quickly.  Reclaim the zero buffer at freefs time and
make the write-zeroes fallback use a larger buffer.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
 e2fsck/e2fsck.h        |    2 --
 e2fsck/pass1.c         |   13 +++++++-----
 e2fsck/pass3.c         |   13 +++---------
 e2fsck/util.c          |   53 ------------------------------------------------
 lib/ext2fs/alloc.c     |   14 +++----------
 lib/ext2fs/expanddir.c |   13 +++---------
 lib/ext2fs/freefs.c    |    1 +
 lib/ext2fs/mkjournal.c |   16 ++++++--------
 misc/mke2fs.c          |    2 --
 resize/resize2fs.c     |   25 ++++++++---------------
 10 files changed, 34 insertions(+), 118 deletions(-)

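Note (not part of the commit): the write-zeroes fallback works in
strides, and this patch sizes the stride so the zero buffer is 4MiB
regardless of block size.  A minimal sketch of that loop follows,
assuming an in-memory "device" and a hypothetical function name.
Unlike ext2fs_zero_blocks2(), it allocates and frees the buffer on
every call instead of keeping a static one.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define SKETCH_ZERO_BUF_BYTES 4194304	/* 4 MiB, as in the patch */

/*
 * Zero num blocks of an in-memory "device" starting at blk, one stride at a
 * time.  Returns the number of write calls issued, or -1 if allocation fails.
 */
static int sketch_zero_in_strides(unsigned char *dev, size_t blocksize,
				  unsigned long long blk, int num)
{
	int stride, j, count, writes = 0;
	unsigned char *zeros;

	assert(blocksize > 0 && blocksize <= SKETCH_ZERO_BUF_BYTES);
	stride = (int)(SKETCH_ZERO_BUF_BYTES / blocksize);
	zeros = calloc((size_t)stride, blocksize);
	if (!zeros)
		return -1;
	for (j = 0; j < num; j += stride, blk += stride) {
		count = num - j;
		if (count > stride)
			count = stride;
		/* Stand-in for io_channel_write_blk64(fs->io, blk, count, buf). */
		memcpy(dev + blk * blocksize, zeros, (size_t)count * blocksize);
		writes++;
	}
	free(zeros);
	return writes;
}
```

With 4KiB blocks the stride is 1024 blocks per write, compared with 8
blocks under the old STRIDE_LENGTH.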

diff --git a/e2fsck/e2fsck.h b/e2fsck/e2fsck.h
index b2654ef..e359515 100644
--- a/e2fsck/e2fsck.h
+++ b/e2fsck/e2fsck.h
@@ -544,8 +544,6 @@ extern void e2fsck_read_bitmaps(e2fsck_t ctx);
 extern void e2fsck_write_bitmaps(e2fsck_t ctx);
 extern void preenhalt(e2fsck_t ctx);
 extern char *string_copy(e2fsck_t ctx, const char *str, int len);
-extern errcode_t e2fsck_zero_blocks(ext2_filsys fs, blk_t blk, int num,
-				    blk_t *ret_blk, int *ret_count);
 extern int fs_proc_check(const char *fs_name);
 extern int check_for_modules(const char *fs_name);
 #ifdef RESOURCE_TRACK
diff --git a/e2fsck/pass1.c b/e2fsck/pass1.c
index d4760ef..a963849 100644
--- a/e2fsck/pass1.c
+++ b/e2fsck/pass1.c
@@ -3576,12 +3576,15 @@ static void new_table_block(e2fsck_t ctx, blk64_t first_block, dgrp_t group,
 				   old_block + i, 1, buf);
 			if (pctx.errcode)
 				fix_problem(ctx, PR_1_RELOC_READ_ERR, &pctx);
-		} else
-			memset(buf, 0, fs->blocksize);
+			pctx.blk = (*new_block) + i;
+			pctx.errcode = io_channel_write_blk64(fs->io, pctx.blk,
+							      1, buf);
+		} else {
+			pctx.blk = (*new_block) + i;
+			pctx.errcode = ext2fs_zero_blocks2(fs, pctx.blk, 1,
+							   NULL, NULL);
+		}
 
-		pctx.blk = (*new_block) + i;
-		pctx.errcode = io_channel_write_blk64(fs->io, pctx.blk,
-					      1, buf);
 		if (pctx.errcode)
 			fix_problem(ctx, PR_1_RELOC_WRITE_ERR, &pctx);
 	}
diff --git a/e2fsck/pass3.c b/e2fsck/pass3.c
index f03c7ae..2d94ece 100644
--- a/e2fsck/pass3.c
+++ b/e2fsck/pass3.c
@@ -809,20 +809,13 @@ static int expand_dir_proc(ext2_filsys fs,
 		es->num--;
 		retval = ext2fs_write_dir_block4(fs, new_blk, block, 0,
 						 es->dir);
-	} else {
-		retval = ext2fs_get_mem(fs->blocksize, &block);
-		if (retval) {
-			es->err = retval;
-			return BLOCK_ABORT;
-		}
-		memset(block, 0, fs->blocksize);
-		retval = io_channel_write_blk64(fs->io, new_blk, 1, block);
-	}
+		ext2fs_free_mem(&block);
+	} else
+		retval = ext2fs_zero_blocks2(fs, new_blk, 1, NULL, NULL);
 	if (retval) {
 		es->err = retval;
 		return BLOCK_ABORT;
 	}
-	ext2fs_free_mem(&block);
 	*blocknr = new_blk;
 	ext2fs_mark_block_bitmap2(ctx->block_found_map, new_blk);
 
diff --git a/e2fsck/util.c b/e2fsck/util.c
index 74f20062..723dafb 100644
--- a/e2fsck/util.c
+++ b/e2fsck/util.c
@@ -612,59 +612,6 @@ int ext2_file_type(unsigned int mode)
 	return 0;
 }
 
-#define STRIDE_LENGTH 8
-/*
- * Helper function which zeros out _num_ blocks starting at _blk_.  In
- * case of an error, the details of the error is returned via _ret_blk_
- * and _ret_count_ if they are non-NULL pointers.  Returns 0 on
- * success, and an error code on an error.
- *
- * As a special case, if the first argument is NULL, then it will
- * attempt to free the static zeroizing buffer.  (This is to keep
- * programs that check for memory leaks happy.)
- */
-errcode_t e2fsck_zero_blocks(ext2_filsys fs, blk_t blk, int num,
-			     blk_t *ret_blk, int *ret_count)
-{
-	int		j, count;
-	static char	*buf;
-	errcode_t	retval;
-
-	/* If fs is null, clean up the static buffer and return */
-	if (!fs) {
-		if (buf) {
-			free(buf);
-			buf = 0;
-		}
-		return 0;
-	}
-	/* Allocate the zeroizing buffer if necessary */
-	if (!buf) {
-		buf = malloc(fs->blocksize * STRIDE_LENGTH);
-		if (!buf) {
-			com_err("malloc", ENOMEM, "%s",
-				_("while allocating zeroizing buffer"));
-			exit(1);
-		}
-		memset(buf, 0, fs->blocksize * STRIDE_LENGTH);
-	}
-	/* OK, do the write loop */
-	for (j = 0; j < num; j += STRIDE_LENGTH, blk += STRIDE_LENGTH) {
-		count = num - j;
-		if (count > STRIDE_LENGTH)
-			count = STRIDE_LENGTH;
-		retval = io_channel_write_blk64(fs->io, blk, count, buf);
-		if (retval) {
-			if (ret_count)
-				*ret_count = count;
-			if (ret_blk)
-				*ret_blk = blk;
-			return retval;
-		}
-	}
-	return 0;
-}
-
 /*
  * Check to see if a filesystem is in /proc/filesystems.
  * Returns 1 if found, 0 if not
diff --git a/lib/ext2fs/alloc.c b/lib/ext2fs/alloc.c
index d1c1a84..4e3bfdb 100644
--- a/lib/ext2fs/alloc.c
+++ b/lib/ext2fs/alloc.c
@@ -198,15 +198,9 @@ errcode_t ext2fs_alloc_block2(ext2_filsys fs, blk64_t goal,
 {
 	errcode_t	retval;
 	blk64_t		block;
-	char		*buf = 0;
 
-	if (!block_buf) {
-		retval = ext2fs_get_mem(fs->blocksize, &buf);
-		if (retval)
-			return retval;
-		block_buf = buf;
-	}
-	memset(block_buf, 0, fs->blocksize);
+	if (block_buf)
+		memset(block_buf, 0, fs->blocksize);
 
 	if (fs->get_alloc_block) {
 		retval = (fs->get_alloc_block)(fs, goal, &block);
@@ -224,7 +218,7 @@ errcode_t ext2fs_alloc_block2(ext2_filsys fs, blk64_t goal,
 			goto fail;
 	}
 
-	retval = io_channel_write_blk64(fs->io, block, 1, block_buf);
+	retval = ext2fs_zero_blocks2(fs, block, 1, NULL, NULL);
 	if (retval)
 		goto fail;
 
@@ -232,8 +226,6 @@ errcode_t ext2fs_alloc_block2(ext2_filsys fs, blk64_t goal,
 	*ret = block;
 
 fail:
-	if (buf)
-		ext2fs_free_mem(&buf);
 	return retval;
 }
 
diff --git a/lib/ext2fs/expanddir.c b/lib/ext2fs/expanddir.c
index d0f7287..ecc13ae 100644
--- a/lib/ext2fs/expanddir.c
+++ b/lib/ext2fs/expanddir.c
@@ -67,22 +67,15 @@ static int expand_dir_proc(ext2_filsys	fs,
 		es->done = 1;
 		retval = ext2fs_write_dir_block4(fs, new_blk, block, 0,
 						 es->dir);
-	} else {
-		retval = ext2fs_get_mem(fs->blocksize, &block);
-		if (retval) {
-			es->err = retval;
-			return BLOCK_ABORT;
-		}
-		memset(block, 0, fs->blocksize);
-		retval = io_channel_write_blk64(fs->io, new_blk, 1, block);
-	}
+		ext2fs_free_mem(&block);
+	} else
+		retval = ext2fs_zero_blocks2(fs, new_blk, 1, NULL, NULL);
 	if (blockcnt >= 0)
 		es->goal = new_blk;
 	if (retval) {
 		es->err = retval;
 		return BLOCK_ABORT;
 	}
-	ext2fs_free_mem(&block);
 	*blocknr = new_blk;
 
 	if (es->done)
diff --git a/lib/ext2fs/freefs.c b/lib/ext2fs/freefs.c
index 89a157b..ea9742e 100644
--- a/lib/ext2fs/freefs.c
+++ b/lib/ext2fs/freefs.c
@@ -61,6 +61,7 @@ void ext2fs_free(ext2_filsys fs)
 
 	fs->magic = 0;
 
+	ext2fs_zero_blocks2(NULL, 0, 0, NULL, NULL);
 	ext2fs_free_mem(&fs);
 }
 
diff --git a/lib/ext2fs/mkjournal.c b/lib/ext2fs/mkjournal.c
index 5be425c..3cc15a9 100644
--- a/lib/ext2fs/mkjournal.c
+++ b/lib/ext2fs/mkjournal.c
@@ -148,7 +148,7 @@ errfree:
  * attempt to free the static zeroizing buffer.  (This is to keep
  * programs that check for memory leaks happy.)
  */
-#define STRIDE_LENGTH 8
+#define STRIDE_LENGTH (4194304 / fs->blocksize)
 errcode_t ext2fs_zero_blocks2(ext2_filsys fs, blk64_t blk, int num,
 			      blk64_t *ret_blk, int *ret_count)
 {
@@ -372,20 +372,20 @@ static errcode_t write_journal_inode(ext2_filsys fs, ext2_ino_t journal_ino,
 	retval = ext2fs_block_iterate3(fs, journal_ino, BLOCK_FLAG_APPEND,
 				       0, mkjournal_proc, &es);
 	if (retval)
-		goto errout;
+		goto out2;
 	if (es.err) {
 		retval = es.err;
-		goto errout;
+		goto out2;
 	}
 	if (es.zero_count) {
 		retval = ext2fs_zero_blocks2(fs, es.blk_to_zero,
 					    es.zero_count, 0, 0);
 		if (retval)
-			goto errout;
+			goto out2;
 	}
 
 	if ((retval = ext2fs_read_inode(fs, journal_ino, &inode)))
-		goto errout;
+		goto out2;
 
 	inode_size = (unsigned long long)fs->blocksize * num_blocks;
 	ext2fs_iblk_add_blocks(fs, &inode, es.newblocks);
@@ -394,10 +394,10 @@ static errcode_t write_journal_inode(ext2_filsys fs, ext2_ino_t journal_ino,
 	inode.i_mode = LINUX_S_IFREG | 0600;
 	retval = ext2fs_inode_size_set(fs, &inode, inode_size);
 	if (retval)
-		goto errout;
+		goto out2;
 
 	if ((retval = ext2fs_write_new_inode(fs, journal_ino, &inode)))
-		goto errout;
+		goto out2;
 	retval = 0;
 
 	memcpy(fs->super->s_jnl_blocks, inode.i_block, EXT2_N_BLOCKS*4);
@@ -406,8 +406,6 @@ static errcode_t write_journal_inode(ext2_filsys fs, ext2_ino_t journal_ino,
 	fs->super->s_jnl_backup_type = EXT3_JNL_BACKUP_BLOCKS;
 	ext2fs_mark_super_dirty(fs);
 
-errout:
-	ext2fs_zero_blocks2(0, 0, 0, 0, 0);
 out2:
 	ext2fs_free_mem(&buf);
 	return retval;
diff --git a/misc/mke2fs.c b/misc/mke2fs.c
index 3a963d7..055b2ab 100644
--- a/misc/mke2fs.c
+++ b/misc/mke2fs.c
@@ -434,7 +434,6 @@ static void write_inode_tables(ext2_filsys fs, int lazy_flag, int itable_zeroed)
 				sync();
 		}
 	}
-	ext2fs_zero_blocks2(0, 0, 0, 0, 0);
 	ext2fs_numeric_progress_close(fs, &progress,
 				      _("done                            \n"));
 
@@ -623,7 +622,6 @@ static void create_journal_dev(ext2_filsys fs)
 		count -= c;
 		ext2fs_numeric_progress_update(fs, &progress, blk);
 	}
-	ext2fs_zero_blocks2(0, 0, 0, 0, 0);
 
 	ext2fs_numeric_progress_close(fs, &progress, NULL);
 write_superblock:
diff --git a/resize/resize2fs.c b/resize/resize2fs.c
index b59f482..57fe485 100644
--- a/resize/resize2fs.c
+++ b/resize/resize2fs.c
@@ -758,11 +758,11 @@ static errcode_t adjust_superblock(ext2_resize_t rfs, blk64_t new_size)
 		/*
 		 * Write out the new inode table
 		 */
-		retval = io_channel_write_blk64(fs->io,
-						ext2fs_inode_table_loc(fs, i),
-						fs->inode_blocks_per_group,
-						rfs->itable_buf);
-		if (retval) goto errout;
+		retval = ext2fs_zero_blocks2(fs, ext2fs_inode_table_loc(fs, i),
+					     fs->inode_blocks_per_group, NULL,
+					     NULL);
+		if (retval)
+			goto errout;
 
 		io_channel_flush(fs->io);
 		if (rfs->progress) {
@@ -2186,15 +2186,11 @@ static errcode_t fix_resize_inode(ext2_filsys fs)
 {
 	struct ext2_inode	inode;
 	errcode_t		retval;
-	char			*block_buf = NULL;
 
 	if (!(fs->super->s_feature_compat &
 	      EXT2_FEATURE_COMPAT_RESIZE_INODE))
 		return 0;
 
-	retval = ext2fs_get_mem(fs->blocksize, &block_buf);
-	if (retval) goto errout;
-
 	retval = ext2fs_read_inode(fs, EXT2_RESIZE_INO, &inode);
 	if (retval) goto errout;
 
@@ -2214,19 +2210,16 @@ static errcode_t fix_resize_inode(ext2_filsys fs)
 		exit(1);
 	}
 
-	memset(block_buf, 0, fs->blocksize);


* [PATCH 17/34] libext2fs: support allocating uninit blocks in bmap2()
  2014-09-13 22:11 [PATCH 00/34] e2fsprogs Summer 2014 patchbomb, part 6 Darrick J. Wong
                   ` (15 preceding siblings ...)
  2014-09-13 22:12 ` [PATCH 16/34] libext2fs/e2fsck: refactor everyone who writes zero blocks to disk Darrick J. Wong
@ 2014-09-13 22:13 ` Darrick J. Wong
  2014-10-13 14:35   ` Theodore Ts'o
  2014-09-13 22:13 ` [PATCH 18/34] libext2fs: file IO routines should handle uninit blocks Darrick J. Wong
                   ` (16 subsequent siblings)
  33 siblings, 1 reply; 67+ messages in thread
From: Darrick J. Wong @ 2014-09-13 22:13 UTC (permalink / raw)
  To: tytso, darrick.wong; +Cc: linux-ext4

In order to support fallocate, we need to be able to have
ext2fs_bmap2() allocate blocks and put them into uninitialized
extents.  There's a flag to do this in the extent code, but it's not
exposed to the bmap2 interface, so plumb that in.  Eventually
fallocate or fuse2fs or somebody will use it.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
 lib/ext2fs/bmap.c   |   24 ++++++++++++++++++++++--
 lib/ext2fs/ext2fs.h |    1 +
 2 files changed, 23 insertions(+), 2 deletions(-)

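Note (not part of the commit): block-mapped files have no per-extent
uninit flag, so the diff below zeroes the block on that path, while
extent files pass a flag down to ext2fs_extent_set_bmap().  That
decision can be sketched roughly as below.  The flag values match the
ext2fs.h hunk, but the sketch_* names and the action enum are
hypothetical, and the real code only acts when a mapping is actually
created.

```c
#include <assert.h>

#define SKETCH_BMAP_ALLOC	0x0001	/* values as in the ext2fs.h hunk */
#define SKETCH_BMAP_SET		0x0002
#define SKETCH_BMAP_UNINIT	0x0004

enum sketch_action {
	SKETCH_NOTHING,		/* no uninit mapping was requested */
	SKETCH_MARK_EXTENT,	/* extent file: flag the extent as uninit */
	SKETCH_ZERO_BLOCK	/* block-mapped file: zero the block instead */
};

static enum sketch_action sketch_uninit_action(int bmap_flags, int uses_extents)
{
	int known = SKETCH_BMAP_ALLOC | SKETCH_BMAP_SET | SKETCH_BMAP_UNINIT;

	assert((bmap_flags & ~known) == 0);
	if (!(bmap_flags & SKETCH_BMAP_UNINIT))
		return SKETCH_NOTHING;
	/* A pure lookup creates no mapping, so there is nothing to mark. */
	if (!(bmap_flags & (SKETCH_BMAP_ALLOC | SKETCH_BMAP_SET)))
		return SKETCH_NOTHING;
	return uses_extents ? SKETCH_MARK_EXTENT : SKETCH_ZERO_BLOCK;
}
```

Either way a later read of the new block sees zeroes: through the
uninit flag on extent files, through zeroed contents on block-mapped
ones.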

diff --git a/lib/ext2fs/bmap.c b/lib/ext2fs/bmap.c
index c1d0e6f..a4dc8ef 100644
--- a/lib/ext2fs/bmap.c
+++ b/lib/ext2fs/bmap.c
@@ -72,6 +72,11 @@ static _BMAP_INLINE_ errcode_t block_ind_bmap(ext2_filsys fs, int flags,
 					    block_buf + fs->blocksize, &b);
 		if (retval)
 			return retval;
+		if (flags & BMAP_UNINIT) {
+			retval = ext2fs_zero_blocks2(fs, b, 1, NULL, NULL);
+			if (retval)
+				return retval;
+		}
 
 #ifdef WORDS_BIGENDIAN
 		((blk_t *) block_buf)[nr] = ext2fs_swab32(b);
@@ -214,10 +219,13 @@ static errcode_t extent_bmap(ext2_filsys fs, ext2_ino_t ino,
 	errcode_t		retval = 0;
 	blk64_t			blk64 = 0;
 	int			alloc = 0;
+	int			set_flags;
+
+	set_flags = bmap_flags & BMAP_UNINIT ? EXT2_EXTENT_SET_BMAP_UNINIT : 0;
 
 	if (bmap_flags & BMAP_SET) {
 		retval = ext2fs_extent_set_bmap(handle, block,
-						*phys_blk, 0);
+						*phys_blk, set_flags);
 		return retval;
 	}
 	retval = ext2fs_extent_goto(handle, block);
@@ -254,7 +262,7 @@ got_block:
 		alloc++;
 	set_extent:
 		retval = ext2fs_extent_set_bmap(handle, block,
-						blk64, 0);
+						blk64, set_flags);
 		if (retval) {
 			ext2fs_block_alloc_stats2(fs, blk64, -1);
 			return retval;
@@ -345,6 +353,12 @@ errcode_t ext2fs_bmap2(ext2_filsys fs, ext2_ino_t ino, struct ext2_inode *inode,
 		goto done;
 	}
 
+	if ((bmap_flags & BMAP_SET) && (bmap_flags & BMAP_UNINIT)) {
+		retval = ext2fs_zero_blocks2(fs, *phys_blk, 1, NULL, NULL);
+		if (retval)
+			goto done;
+	}
+
 	if (block < EXT2_NDIR_BLOCKS) {
 		if (bmap_flags & BMAP_SET) {
 			b = *phys_blk;
@@ -360,6 +374,12 @@ errcode_t ext2fs_bmap2(ext2_filsys fs, ext2_ino_t ino, struct ext2_inode *inode,
 			retval = ext2fs_alloc_block(fs, b, block_buf, &b);
 			if (retval)
 				goto done;
+			if (bmap_flags & BMAP_UNINIT) {
+				retval = ext2fs_zero_blocks2(fs, b, 1, NULL,
+							     NULL);
+				if (retval)
+					goto done;
+			}
 			inode_bmap(inode, block) = b;
 			blocks_alloc++;
 			*phys_blk = b;
diff --git a/lib/ext2fs/ext2fs.h b/lib/ext2fs/ext2fs.h
index fe82a32..3419185 100644
--- a/lib/ext2fs/ext2fs.h
+++ b/lib/ext2fs/ext2fs.h
@@ -533,6 +533,7 @@ typedef struct ext2_icount *ext2_icount_t;
  */
 #define BMAP_ALLOC	0x0001
 #define BMAP_SET	0x0002
+#define BMAP_UNINIT	0x0004
 
 /*
  * Returned flags from ext2fs_bmap



* [PATCH 18/34] libext2fs: file IO routines should handle uninit blocks
  2014-09-13 22:11 [PATCH 00/34] e2fsprogs Summer 2014 patchbomb, part 6 Darrick J. Wong
                   ` (16 preceding siblings ...)
  2014-09-13 22:13 ` [PATCH 17/34] libext2fs: support allocating uninit blocks in bmap2() Darrick J. Wong
@ 2014-09-13 22:13 ` Darrick J. Wong
  2014-09-13 22:13 ` [PATCH 19/34] resize2fs: convert fs to and from 64bit mode Darrick J. Wong
                   ` (15 subsequent siblings)
  33 siblings, 0 replies; 67+ messages in thread
From: Darrick J. Wong @ 2014-09-13 22:13 UTC (permalink / raw)
  To: tytso, darrick.wong; +Cc: linux-ext4

The file IO routines do not handle uninit blocks at all.  The read
method should check for the uninit flag and return a buffer of zeroes,
and the write routine should convert unwritten extents.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
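A rough illustration of the intended semantics, as a toy model rather than
the libext2fs code itself.  The struct and the toy_* helpers below are
hypothetical names made up for this note; they only mirror the rule stated
above: an unwritten block reads back as zeroes no matter what is on disk,
and the first write clears the unwritten state.

```c
#include <string.h>

#define TOY_BLOCK_SIZE 16

/* Hypothetical stand-in for one mapped file block; not a libext2fs type. */
struct toy_block {
	int		mapped;	/* nonzero if a physical block is assigned */
	int		uninit;	/* nonzero if the extent is unwritten */
	unsigned char	disk[TOY_BLOCK_SIZE];	/* stale contents on "disk" */
};

/* Read path: an unmapped or unwritten block reads back as zeroes. */
static void toy_read(const struct toy_block *b, unsigned char *buf)
{
	if (b->mapped && !b->uninit)
		memcpy(buf, b->disk, TOY_BLOCK_SIZE);
	else
		memset(buf, 0, TOY_BLOCK_SIZE);
}

/* Write path: mark the block initialized, then store the data. */
static void toy_write(struct toy_block *b, const unsigned char *buf)
{
	b->mapped = 1;
	b->uninit = 0;
	memcpy(b->disk, buf, TOY_BLOCK_SIZE);
}
```

In the real patch the read side lives in load_buffer() and the conversion
happens in ext2fs_file_flush() by remapping the block without the uninit
flag; the toy model collapses both into two small functions.
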
 lib/ext2fs/fileio.c         |   24 ++++++++++++++++++++++--
 tests/f_uninit_cat/expect   |  Bin
 tests/f_uninit_cat/image.gz |  Bin
 tests/f_uninit_cat/name     |    1 +
 tests/f_uninit_cat/script   |   37 +++++++++++++++++++++++++++++++++++++
 5 files changed, 60 insertions(+), 2 deletions(-)
 create mode 100644 tests/f_uninit_cat/expect
 create mode 100644 tests/f_uninit_cat/image.gz
 create mode 100644 tests/f_uninit_cat/name
 create mode 100755 tests/f_uninit_cat/script


diff --git a/lib/ext2fs/fileio.c b/lib/ext2fs/fileio.c
index 1d5032a..d0a05d6 100644
--- a/lib/ext2fs/fileio.c
+++ b/lib/ext2fs/fileio.c
@@ -123,6 +123,8 @@ errcode_t ext2fs_file_flush(ext2_file_t file)
 {
 	errcode_t	retval;
 	ext2_filsys fs;
+	int		ret_flags;
+	blk64_t		dontcare;
 
 	EXT2_CHECK_MAGIC(file, EXT2_ET_MAGIC_EXT2_FILE);
 	fs = file->fs;
@@ -131,6 +133,22 @@ errcode_t ext2fs_file_flush(ext2_file_t file)
 	    !(file->flags & EXT2_FILE_BUF_DIRTY))
 		return 0;
 
+	/* Is this an uninit block? */
+	if (file->physblock && file->inode.i_flags & EXT4_EXTENTS_FL) {
+		retval = ext2fs_bmap2(fs, file->ino, &file->inode, BMAP_BUFFER,
+				      0, file->blockno, &ret_flags, &dontcare);
+		if (retval)
+			return retval;
+		if (ret_flags & BMAP_RET_UNINIT) {
+			retval = ext2fs_bmap2(fs, file->ino, &file->inode,
+					      BMAP_BUFFER, BMAP_SET,
+					      file->blockno, 0,
+					      &file->physblock);
+			if (retval)
+				return retval;
+		}
+	}
+
 	/*
 	 * OK, the physical block hasn't been allocated yet.
 	 * Allocate it.
@@ -185,15 +203,17 @@ static errcode_t load_buffer(ext2_file_t file, int dontfill)
 {
 	ext2_filsys	fs = file->fs;
 	errcode_t	retval;
+	int		ret_flags;
 
 	if (!(file->flags & EXT2_FILE_BUF_VALID)) {
 		retval = ext2fs_bmap2(fs, file->ino, &file->inode,
-				     BMAP_BUFFER, 0, file->blockno, 0,
+				     BMAP_BUFFER, 0, file->blockno, &ret_flags,
 				     &file->physblock);
 		if (retval)
 			return retval;
 		if (!dontfill) {
-			if (file->physblock) {
+			if (file->physblock &&
+			    !(ret_flags & BMAP_RET_UNINIT)) {
 				retval = io_channel_read_blk64(fs->io,
 							       file->physblock,
 							       1, file->buf);
diff --git a/tests/f_uninit_cat/expect b/tests/f_uninit_cat/expect
new file mode 100644
index 0000000000000000000000000000000000000000..0c0a5cf84cc58483ca33e257259607e4ca54ebbe
GIT binary patch
literal 3623
zcmeH{v2KGf5Qa1J6t`4rrKVs4QNvoLQ-&;^E5{tf3b8An9fdx94LB%6-ypu_bol%I
zbN}EyCjwT%#}UOzsurZuHPR~_IxSAVb5#S$U!-I|p!pqIOM}8{(*s%KgmnfdX!S27
zv{Igz7is&6EABXh4H{GeL1?FJuq*F~)@b(w<j!aAEv0I-IddzuN-UE7Ze)klQw1zf
z*9D9tJZEp&6DX~g-rdU9X-6-wP?Ral@**smY_HP#9k_J_k|0ZJJh-+Y5Zr=OQu*WI
zzT5W-@CqqUc6h-Kw#pib1XJyFD+TYSVSnstoOY;MdxX!9x0FDZLgoRM0<1<bgJtNx
zmG9H!`&bOV#$T9q`K)6>#|E(61l{JQn-!~Bkq1RSFzev!`&hG6*uNSB@QW^D#ROsk
a51YU>R!kr!@URIy?86rqj|s#C7RevINPdd|

literal 0
HcmV?d00001

diff --git a/tests/f_uninit_cat/image.gz b/tests/f_uninit_cat/image.gz
new file mode 100644
index 0000000000000000000000000000000000000000..d2ae66cb35f93dd4022e85f8667365487a7b24ea
GIT binary patch
literal 1553
zcmb2|=3wBo77AfvetUOgws4>X!-w+e5!ulJUCtMqn*6x7X0!`+M-&JY2D}M!S)+8+
zDVJ6IMapY?1&f!@9@sy4qUHCltf(-cFi@sM<Z<;*mRk{C3zur%doRAP`s7{J>E%1$
zf6YFjs>qO9YUaC2aZcCS-(DQb8_$NVE7;Mo<=)q*|3~ghM6S5L{?7@i*S^PYuMLg<
z{yO)bMDN+#HZAQNWmni(e$ISz=k4?0&B2S~=hf!a%c;LRqnoqOeE0R(Td}i${(5!v
zN9*zV=lkXUet4HAR(*B4`0223$*Xyed2c)RXUFctq3v?k4*osNhJB`=>Q$cfF5YY)
zq&Ig<I|BqHJYRiXzO>c;>t5Mgf0ATh%=q$g`@RWJoj2;Pd;ar6z(>h_e;2CHpSyp1
zK~2@ae|Oey(P??=HSMlJ&Y}8++jalfhj#s8_ga)C-tdS0iM_&q#-IEr>K*<uf0Cc@
zA1uVcut9k1xj*G<CR6-R{q_2++YoEG%ju)G)m_7H5^{`CS3sx*%4*qvh1Y5X@893w
ze(dD=mG_O$YD^1}+TT9sT+6vfTb`$HdzF3Z(w=*|H~uB`<f~6#9=s)v-CcQutl0EN
zDLbP6mj5|_XrHg$w}ZTW^>6Fj_h@afy|?67dd1pb`uxmif2pqt{Qt!?S^L#|zi0nX
zFI}X(^4;uQx9mT!qeB0OhF!1kR=hH+G@<KquWi^F`Rv=r-e*s4zxMk7(bMbaFY_=>
zsFGV7C4lT;jqS<~nYIjGr#U#DKDAdsmKtS`hQMeD44Duxd!&D1J4-kNg8%~n1^y`^

literal 0
HcmV?d00001

diff --git a/tests/f_uninit_cat/name b/tests/f_uninit_cat/name
new file mode 100644
index 0000000..f6b5674
--- /dev/null
+++ b/tests/f_uninit_cat/name
@@ -0,0 +1 @@
+cat a file with uninit blocks
diff --git a/tests/f_uninit_cat/script b/tests/f_uninit_cat/script
new file mode 100755
index 0000000..11dbb6e
--- /dev/null
+++ b/tests/f_uninit_cat/script
@@ -0,0 +1,37 @@
+#!/bin/bash
+
+if test -x $DEBUGFS_EXE; then
+FSCK_OPT=-fy
+IMAGE=$test_dir/image.gz
+
+gzip -d < $IMAGE > $TMPFILE
+#e2label $TMPFILE test_filesys
+
+# Run fsck to fix things?
+EXP=$test_dir/expect
+OUT=$test_name.log
+rm -rf $test_name.failed $test_name.ok
+
+$FSCK $FSCK_OPT -N test_filesys $TMPFILE 2>&1 | sed -f $cmd_dir/filter.sed > $OUT
+echo "Exit status is $?" >> $OUT
+
+echo "debugfs cat uninit file" >> $OUT
+echo "ex /a" > $TMPFILE.cmd
+echo "cat /a" >> $TMPFILE.cmd
+$DEBUGFS_EXE -w $TMPFILE -f $TMPFILE.cmd >> $OUT.new 2>&1
+echo >> $OUT.new
+sed -f $cmd_dir/filter.sed < $OUT.new >> $OUT
+rm -rf $OUT.new $TMPFILE
+
+# Figure out what happened
+if cmp -s $EXP $OUT; then
+	echo "$test_name: $test_description: ok"
+	touch $test_name.ok
+else
+	echo "$test_name: $test_description: failed"
+	diff -u $EXP $OUT >> $test_name.failed
+fi
+unset EXP OUT FSCK_OPT IMAGE
+else #if test -a -x $DEBUGFS_EXE; then
+        echo "$test_name: $test_description: skipped"
+fi 



* [PATCH 19/34] resize2fs: convert fs to and from 64bit mode
  2014-09-13 22:11 [PATCH 00/34] e2fsprogs Summer 2014 patchbomb, part 6 Darrick J. Wong
                   ` (17 preceding siblings ...)
  2014-09-13 22:13 ` [PATCH 18/34] libext2fs: file IO routines should handle uninit blocks Darrick J. Wong
@ 2014-09-13 22:13 ` Darrick J. Wong
  2014-09-14 17:34   ` TR Reardon
  2014-09-13 22:13 ` [PATCH 20/34] resize2fs: adjust reserved_gdt_blocks when changing group descriptor size Darrick J. Wong
                   ` (14 subsequent siblings)
  33 siblings, 1 reply; 67+ messages in thread
From: Darrick J. Wong @ 2014-09-13 22:13 UTC (permalink / raw)
  To: tytso, darrick.wong; +Cc: linux-ext4

resize2fs does its magic by loading a filesystem, duplicating the
in-memory image of that fs, moving relevant blocks out of the way of
whatever new metadata get created, and finally writing everything back
out to disk.  Enabling 64bit mode enlarges the group descriptors,
which makes resize2fs a reasonable vehicle for taking care of the rest
of the bookkeeping requirements, so add to resize2fs the ability to
convert a filesystem to 64bit mode and back.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
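For readers who want the core of the descriptor resize in isolation, here
is a minimal sketch.  toy_repack() is a hypothetical helper written for this
note, not a libext2fs function.  It assumes fixed-size records and shows the
same idea the patch uses: allocate a zeroed table at the new record size and
copy min(old size, new size) bytes of each record, so growing pads with
zeroes and shrinking drops the tail of each record.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/*
 * Repack `count` fixed-size records from `old_size` bytes each to
 * `new_size` bytes each.  Returns a newly allocated, zero-filled table
 * (caller frees it), or NULL if the allocation fails.
 */
static unsigned char *toy_repack(const unsigned char *old_tab, size_t count,
				 size_t old_size, size_t new_size)
{
	size_t copy = old_size < new_size ? old_size : new_size;
	unsigned char *new_tab = calloc(count, new_size);
	size_t i;

	if (!new_tab)
		return NULL;
	for (i = 0; i < count; i++)
		memcpy(new_tab + i * new_size, old_tab + i * old_size, copy);
	return new_tab;
}
```

In the patch itself the two record sizes are EXT2_MIN_DESC_SIZE and
EXT2_MIN_DESC_SIZE_64BIT, and the group descriptor checksums are
recomputed after the copy; the sketch leaves checksumming out.
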
 resize/main.c         |   40 ++++++
 resize/resize2fs.8.in |   18 +++
 resize/resize2fs.c    |  326 ++++++++++++++++++++++++++++++++++++++++++++++++-
 resize/resize2fs.h    |    3 
 4 files changed, 379 insertions(+), 8 deletions(-)


diff --git a/resize/main.c b/resize/main.c
index c107028..9fea3d8 100644
--- a/resize/main.c
+++ b/resize/main.c
@@ -42,7 +42,7 @@ static char *device_name, *io_options;
 static void usage (char *prog)
 {
 	fprintf (stderr, _("Usage: %s [-d debug_flags] [-f] [-F] [-M] [-P] "
-			   "[-p] device [new_size]\n\n"), prog);
+			   "[-p] device [-b|-s|new_size]\n\n"), prog);
 
 	exit (1);
 }
@@ -200,7 +200,7 @@ int main (int argc, char ** argv)
 	if (argc && *argv)
 		program_name = *argv;
 
-	while ((c = getopt (argc, argv, "d:fFhMPpS:")) != EOF) {
+	while ((c = getopt(argc, argv, "d:fFhMPpS:bs")) != EOF) {
 		switch (c) {
 		case 'h':
 			usage(program_name);
@@ -226,6 +226,12 @@ int main (int argc, char ** argv)
 		case 'S':
 			use_stride = atoi(optarg);
 			break;
+		case 'b':
+			flags |= RESIZE_ENABLE_64BIT;
+			break;
+		case 's':
+			flags |= RESIZE_DISABLE_64BIT;
+			break;
 		default:
 			usage(program_name);
 		}
@@ -389,6 +395,10 @@ int main (int argc, char ** argv)
 		if (sys_page_size > fs->blocksize)
 			new_size &= ~((sys_page_size / fs->blocksize)-1);
 	}
+	/* If changing 64bit, don't change the filesystem size. */
+	if (flags & (RESIZE_DISABLE_64BIT | RESIZE_ENABLE_64BIT)) {
+		new_size = ext2fs_blocks_count(fs->super);
+	}
 	if (!EXT2_HAS_INCOMPAT_FEATURE(fs->super,
 				       EXT4_FEATURE_INCOMPAT_64BIT)) {
 		/* Take 16T down to 2^32-1 blocks */
@@ -440,7 +450,31 @@ int main (int argc, char ** argv)
 			fs->blocksize / 1024, new_size);
 		exit(1);
 	}
-	if (new_size == ext2fs_blocks_count(fs->super)) {
+	if ((flags & RESIZE_DISABLE_64BIT) && (flags & RESIZE_ENABLE_64BIT)) {
+		fprintf(stderr, _("Cannot set and unset 64bit feature.\n"));
+		exit(1);
+	} else if (flags & (RESIZE_DISABLE_64BIT | RESIZE_ENABLE_64BIT)) {
+		new_size = ext2fs_blocks_count(fs->super);
+		if (new_size >= (1ULL << 32)) {
+			fprintf(stderr, _("Cannot change the 64bit feature "
+				"on a filesystem that is larger than "
+				"2^32 blocks.\n"));
+			exit(1);
+		}
+		if (mount_flags & EXT2_MF_MOUNTED) {
+			fprintf(stderr, _("Cannot change the 64bit feature "
+				"while the filesystem is mounted.\n"));
+			exit(1);
+		}
+		if (flags & RESIZE_ENABLE_64BIT &&
+		    !EXT2_HAS_INCOMPAT_FEATURE(fs->super,
+				EXT3_FEATURE_INCOMPAT_EXTENTS)) {
+			fprintf(stderr, _("Please enable the extents feature "
+				"with tune2fs before enabling the 64bit "
+				"feature.\n"));
+			exit(1);
+		}
+	} else if (new_size == ext2fs_blocks_count(fs->super)) {
 		fprintf(stderr, _("The filesystem is already %llu (%dk) "
 			"blocks long.  Nothing to do!\n\n"), new_size,
 			fs->blocksize / 1024);
diff --git a/resize/resize2fs.8.in b/resize/resize2fs.8.in
index 86495c6..0129bfc 100644
--- a/resize/resize2fs.8.in
+++ b/resize/resize2fs.8.in
@@ -8,7 +8,7 @@ resize2fs \- ext2/ext3/ext4 file system resizer
 .SH SYNOPSIS
 .B resize2fs
 [
-.B \-fFpPM
+.B \-fFpPMbs
 ]
 [
 .B \-d
@@ -86,8 +86,21 @@ to shrink the size of filesystem.  Then you may use
 to shrink the size of the partition.  When shrinking the size of
 the partition, make sure you do not make it smaller than the new size
 of the ext2 filesystem!
+.PP
+The
+.B \-b
+and
+.B \-s
+options enable and disable the 64bit feature, respectively.  The resize2fs
+program will, of course, take care of resizing the block group descriptors
+and moving other data blocks out of the way, as needed.  It is not possible
+to resize the filesystem concurrent with changing the 64bit status.
 .SH OPTIONS
 .TP
+.B \-b
+Turns on the 64bit feature, resizes the group descriptors as necessary, and
+moves other metadata out of the way.
+.TP
 .B \-d \fIdebug-flags
 Turns on various resize2fs debugging features, if they have been compiled
 into the binary.
@@ -127,6 +140,9 @@ of what the program is doing.
 .B \-P
 Print the minimum size of the filesystem and exit.
 .TP
+.B \-s
+Turns off the 64bit feature and frees blocks that are no longer in use.
+.TP
 .B \-S \fIRAID-stride
 The
 .B resize2fs
diff --git a/resize/resize2fs.c b/resize/resize2fs.c
index 57fe485..30cdfbd 100644
--- a/resize/resize2fs.c
+++ b/resize/resize2fs.c
@@ -56,6 +56,9 @@ static errcode_t mark_table_blocks(ext2_filsys fs,
 static errcode_t clear_sparse_super2_last_group(ext2_resize_t rfs);
 static errcode_t reserve_sparse_super2_last_group(ext2_resize_t rfs,
 						 ext2fs_block_bitmap meta_bmap);
+static errcode_t resize_group_descriptors(ext2_resize_t rfs, blk64_t new_size);
+static errcode_t move_bg_metadata(ext2_resize_t rfs);
+static errcode_t zero_high_bits_in_inodes(ext2_resize_t rfs);
 
 /*
  * Some helper CPP macros
@@ -122,13 +125,30 @@ errcode_t resize_fs(ext2_filsys fs, blk64_t *new_size, int flags,
 	if (retval)
 		goto errout;
 
+	init_resource_track(&rtrack, "resize_group_descriptors", fs->io);
+	retval = resize_group_descriptors(rfs, *new_size);
+	if (retval)
+		goto errout;
+	print_resource_track(rfs, &rtrack, fs->io);
+
+	init_resource_track(&rtrack, "move_bg_metadata", fs->io);
+	retval = move_bg_metadata(rfs);
+	if (retval)
+		goto errout;
+	print_resource_track(rfs, &rtrack, fs->io);
+
+	init_resource_track(&rtrack, "zero_high_bits_in_metadata", fs->io);
+	retval = zero_high_bits_in_inodes(rfs);
+	if (retval)
+		goto errout;
+	print_resource_track(rfs, &rtrack, fs->io);
+
 	init_resource_track(&rtrack, "adjust_superblock", fs->io);
 	retval = adjust_superblock(rfs, *new_size);
 	if (retval)
 		goto errout;
 	print_resource_track(rfs, &rtrack, fs->io);
 
-
 	init_resource_track(&rtrack, "fix_uninit_block_bitmaps 2", fs->io);
 	fix_uninit_block_bitmaps(rfs->new_fs);
 	print_resource_track(rfs, &rtrack, fs->io);
@@ -231,6 +251,301 @@ errout:
 	return retval;
 }
 
+/* Toggle 64bit mode */
+static errcode_t resize_group_descriptors(ext2_resize_t rfs, blk64_t new_size)
+{
+	void *o, *n, *new_group_desc;
+	dgrp_t i;
+	int copy_size;
+	errcode_t retval;
+
+	if (!(rfs->flags & (RESIZE_DISABLE_64BIT | RESIZE_ENABLE_64BIT)))
+		return 0;
+
+	if (new_size != ext2fs_blocks_count(rfs->new_fs->super) ||
+	    ext2fs_blocks_count(rfs->new_fs->super) >= (1ULL << 32) ||
+	    (rfs->flags & RESIZE_DISABLE_64BIT &&
+	     rfs->flags & RESIZE_ENABLE_64BIT))
+		return EXT2_ET_INVALID_ARGUMENT;
+
+	if (rfs->flags & RESIZE_DISABLE_64BIT) {
+		rfs->new_fs->super->s_feature_incompat &=
+				~EXT4_FEATURE_INCOMPAT_64BIT;
+		rfs->new_fs->super->s_desc_size = EXT2_MIN_DESC_SIZE;
+	} else if (rfs->flags & RESIZE_ENABLE_64BIT) {
+		rfs->new_fs->super->s_feature_incompat |=
+				EXT4_FEATURE_INCOMPAT_64BIT;
+		rfs->new_fs->super->s_desc_size = EXT2_MIN_DESC_SIZE_64BIT;
+	}
+
+	if (EXT2_DESC_SIZE(rfs->old_fs->super) ==
+	    EXT2_DESC_SIZE(rfs->new_fs->super))
+		return 0;
+
+	o = rfs->new_fs->group_desc;
+	rfs->new_fs->desc_blocks = ext2fs_div_ceil(
+			rfs->old_fs->group_desc_count,
+			EXT2_DESC_PER_BLOCK(rfs->new_fs->super));
+	retval = ext2fs_get_arrayzero(rfs->new_fs->desc_blocks,
+				      rfs->old_fs->blocksize, &new_group_desc);
+	if (retval)
+		return retval;
+
+	n = new_group_desc;
+
+	if (EXT2_DESC_SIZE(rfs->old_fs->super) <=
+	    EXT2_DESC_SIZE(rfs->new_fs->super))
+		copy_size = EXT2_DESC_SIZE(rfs->old_fs->super);
+	else
+		copy_size = EXT2_DESC_SIZE(rfs->new_fs->super);
+	for (i = 0; i < rfs->old_fs->group_desc_count; i++) {
+		memcpy(n, o, copy_size);
+		n += EXT2_DESC_SIZE(rfs->new_fs->super);
+		o += EXT2_DESC_SIZE(rfs->old_fs->super);
+	}
+
+	ext2fs_free_mem(&rfs->new_fs->group_desc);
+	rfs->new_fs->group_desc = new_group_desc;
+
+	for (i = 0; i < rfs->old_fs->group_desc_count; i++)
+		ext2fs_group_desc_csum_set(rfs->new_fs, i);
+
+	return 0;
+}
+
+/* Move bitmaps/inode tables out of the way. */
+static errcode_t move_bg_metadata(ext2_resize_t rfs)
+{
+	dgrp_t i;
+	blk64_t b, c, d, old_desc_blocks, new_desc_blocks, j;
+	ext2fs_block_bitmap old_map, new_map;
+	int old, new;
+	errcode_t retval;
+	int cluster_ratio;
+
+	if (!(rfs->flags & (RESIZE_DISABLE_64BIT | RESIZE_ENABLE_64BIT)))
+		return 0;
+
+	retval = ext2fs_allocate_block_bitmap(rfs->old_fs, "oldfs", &old_map);
+	if (retval)
+		return retval;
+
+	retval = ext2fs_allocate_block_bitmap(rfs->new_fs, "newfs", &new_map);
+	if (retval)
+		goto out;
+
+	if (EXT2_HAS_INCOMPAT_FEATURE(rfs->old_fs->super,
+				      EXT2_FEATURE_INCOMPAT_META_BG)) {
+		old_desc_blocks = rfs->old_fs->super->s_first_meta_bg;
+		new_desc_blocks = rfs->new_fs->super->s_first_meta_bg;
+	} else {
+		old_desc_blocks = rfs->old_fs->desc_blocks +
+				rfs->old_fs->super->s_reserved_gdt_blocks;
+		new_desc_blocks = rfs->new_fs->desc_blocks +
+				rfs->new_fs->super->s_reserved_gdt_blocks;
+	}
+
+	/* Construct bitmaps of super/descriptor blocks in old and new fs */
+	for (i = 0; i < rfs->old_fs->group_desc_count; i++) {
+		retval = ext2fs_super_and_bgd_loc2(rfs->old_fs, i, &b, &c, &d,
+						   NULL);
+		if (retval)
+			goto out;
+		if (b)
+			ext2fs_mark_block_bitmap2(old_map, b);
+		for (j = 0; c != 0 && j < old_desc_blocks; j++)
+			ext2fs_mark_block_bitmap2(old_map, c + j);
+		if (d)
+			ext2fs_mark_block_bitmap2(old_map, d);
+
+		retval = ext2fs_super_and_bgd_loc2(rfs->new_fs, i, &b, &c, &d,
+						   NULL);
+		if (retval)
+			goto out;
+		if (b)
+			ext2fs_mark_block_bitmap2(new_map, b);
+		for (j = 0; c != 0 && j < new_desc_blocks; j++)
+			ext2fs_mark_block_bitmap2(new_map, c + j);
+		if (d)
+			ext2fs_mark_block_bitmap2(new_map, d);
+	}
+
+	cluster_ratio = EXT2FS_CLUSTER_RATIO(rfs->new_fs);
+
+	/* Find changes in block allocations for bg metadata */
+	for (b = EXT2FS_B2C(rfs->old_fs,
+			    rfs->old_fs->super->s_first_data_block);
+	     b < ext2fs_blocks_count(rfs->new_fs->super);
+	     b += cluster_ratio) {
+		old = ext2fs_test_block_bitmap2(old_map, b);
+		new = ext2fs_test_block_bitmap2(new_map, b);
+
+		if (old && !new) {
+			/* mark old_map, unmark new_map */
+			if (cluster_ratio == 1)
+				ext2fs_unmark_block_bitmap2(
+						rfs->new_fs->block_map, b);
+		} else if (!old && new)
+			; /* unmark old_map, mark new_map */
+		else {
+			ext2fs_unmark_block_bitmap2(old_map, b);
+			ext2fs_unmark_block_bitmap2(new_map, b);
+		}
+	}
+
+	/*
+	 * new_map now shows blocks that have been newly allocated.
+	 * old_map now shows blocks that have been newly freed.
+	 */
+
+	/*
+	 * Move any conflicting bitmaps and inode tables.  Ensure that we
+	 * don't try to free clusters associated with bitmaps or tables.
+	 */
+	for (i = 0; i < rfs->old_fs->group_desc_count; i++) {
+		b = ext2fs_block_bitmap_loc(rfs->new_fs, i);
+		if (ext2fs_test_block_bitmap2(new_map, b))
+			ext2fs_block_bitmap_loc_set(rfs->new_fs, i, 0);
+		else if (ext2fs_test_block_bitmap2(old_map, b))
+			ext2fs_unmark_block_bitmap2(old_map, b);
+
+		b = ext2fs_inode_bitmap_loc(rfs->new_fs, i);
+		if (ext2fs_test_block_bitmap2(new_map, b))
+			ext2fs_inode_bitmap_loc_set(rfs->new_fs, i, 0);
+		else if (ext2fs_test_block_bitmap2(old_map, b))
+			ext2fs_unmark_block_bitmap2(old_map, b);
+
+		c = ext2fs_inode_table_loc(rfs->new_fs, i);
+		for (b = 0;
+		     b < rfs->new_fs->inode_blocks_per_group;
+		     b++) {
+			if (ext2fs_test_block_bitmap2(new_map, b + c))
+				ext2fs_inode_table_loc_set(rfs->new_fs, i, 0);
+			else if (ext2fs_test_block_bitmap2(old_map, b + c))
+				ext2fs_unmark_block_bitmap2(old_map, b + c);
+		}
+	}
+
+	/* Free unused clusters */
+	for (b = 0;
+	     cluster_ratio > 1 && b < ext2fs_blocks_count(rfs->new_fs->super);
+	     b += cluster_ratio)
+		if (ext2fs_test_block_bitmap2(old_map, b))
+			ext2fs_unmark_block_bitmap2(rfs->new_fs->block_map, b);
+out:
+	if (old_map)
+		ext2fs_free_block_bitmap(old_map);
+	if (new_map)
+		ext2fs_free_block_bitmap(new_map);
+	return retval;
+}
+
+/* Zero out the high bits of extent fields */
+static errcode_t zero_high_bits_in_extents(ext2_filsys fs, ext2_ino_t ino,
+				 struct ext2_inode *inode)
+{
+	ext2_extent_handle_t	handle;
+	struct ext2fs_extent	extent;
+	int			op = EXT2_EXTENT_ROOT;
+	errcode_t		errcode;
+
+	if (!(inode->i_flags & EXT4_EXTENTS_FL))
+		return 0;
+
+	errcode = ext2fs_extent_open(fs, ino, &handle);
+	if (errcode)
+		return errcode;
+
+	while (1) {
+		errcode = ext2fs_extent_get(handle, op, &extent);
+		if (errcode)
+			break;
+
+		op = EXT2_EXTENT_NEXT_SIB;
+
+		if (extent.e_pblk >= (1ULL << 32)) {
+			extent.e_pblk &= (1ULL << 32) - 1;
+			errcode = ext2fs_extent_replace(handle, 0, &extent);
+			if (errcode)
+				break;
+		}
+	}
+
+	/* Ok if we run off the end */
+	if (errcode == EXT2_ET_EXTENT_NO_NEXT)
+		errcode = 0;
+	return errcode;
+}
+
+/* Zero out the high bits of inodes. */
+static errcode_t zero_high_bits_in_inodes(ext2_resize_t rfs)
+{
+	ext2_filsys	fs = rfs->new_fs;
+	int length = EXT2_INODE_SIZE(fs->super);
+	struct ext2_inode *inode = NULL;
+	ext2_inode_scan	scan = NULL;
+	errcode_t	retval;
+	ext2_ino_t	ino;
+
+	if (!(rfs->flags & (RESIZE_DISABLE_64BIT | RESIZE_ENABLE_64BIT)))
+		return 0;
+
+	if (fs->super->s_creator_os != EXT2_OS_LINUX)
+		return 0;
+
+	retval = ext2fs_open_inode_scan(fs, 0, &scan);
+	if (retval)
+		return retval;
+
+	retval = ext2fs_get_mem(length, &inode);
+	if (retval)
+		goto out;
+
+	do {
+		retval = ext2fs_get_next_inode_full(scan, &ino, inode, length);
+		if (retval)
+			goto out;
+		if (!ino)
+			break;
+		if (!ext2fs_test_inode_bitmap2(fs->inode_map, ino))
+			continue;
+
+		/*
+		 * Here's how we deal with high block number fields:
+		 *
+		 *  - i_size_high has been written out with i_size_lo
+		 *    since the ext2 days, so no conversion is needed.
+		 *
+		 *  - i_blocks_hi is guarded by both the huge_file feature and
+		 *    inode flags and has always been written out with
+		 *    i_blocks_lo if the feature is set.  The field is only
+		 *    ever read if both feature and inode flag are set, so
+		 *    we don't need to zero it now.
+		 *
+		 *  - i_file_acl_high can be uninitialized, so zero it if
+		 *    it isn't already.
+		 */
+		if (inode->osd2.linux2.l_i_file_acl_high) {
+			inode->osd2.linux2.l_i_file_acl_high = 0;
+			retval = ext2fs_write_inode_full(fs, ino, inode,
+							 length);
+			if (retval)
+				goto out;
+		}
+
+		retval = zero_high_bits_in_extents(fs, ino, inode);
+		if (retval)
+			goto out;
+	} while (ino);
+
+out:
+	if (inode)
+		ext2fs_free_mem(&inode);
+	if (scan)
+		ext2fs_close_inode_scan(scan);
+	return retval;
+}
+
 /*
  * Clean up the bitmaps for unitialized bitmaps
  */
@@ -454,7 +769,8 @@ retry:
 	/*
 	 * Reallocate the group descriptors as necessary.
 	 */
-	if (old_fs->desc_blocks != fs->desc_blocks) {
+	if (EXT2_DESC_SIZE(old_fs->super) == EXT2_DESC_SIZE(fs->super) &&
+	    old_fs->desc_blocks != fs->desc_blocks) {
 		retval = ext2fs_resize_mem(old_fs->desc_blocks *
 					   fs->blocksize,
 					   fs->desc_blocks * fs->blocksize,
@@ -1014,7 +1330,9 @@ static errcode_t blocks_to_move(ext2_resize_t rfs)
 	if (retval)
 		goto errout;
 
-	if (old_blocks == new_blocks) {
+	if (EXT2_DESC_SIZE(rfs->old_fs->super) ==
+	    EXT2_DESC_SIZE(rfs->new_fs->super) &&
+	    old_blocks == new_blocks) {
 		retval = 0;
 		goto errout;
 	}
@@ -1544,7 +1862,7 @@ static errcode_t progress_callback(ext2_filsys fs,
 static errcode_t migrate_ea_block(ext2_resize_t rfs, ext2_ino_t ino,
 				  struct ext2_inode *inode, int *changed)
 {
-	char *buf;
+	char *buf = NULL;
 	blk64_t new_block;
 	errcode_t err = 0;
 
diff --git a/resize/resize2fs.h b/resize/resize2fs.h
index 7aeab91..829fcd8 100644
--- a/resize/resize2fs.h
+++ b/resize/resize2fs.h
@@ -82,6 +82,9 @@ typedef struct ext2_sim_progress *ext2_sim_progmeter;
 #define RESIZE_PERCENT_COMPLETE		0x0100
 #define RESIZE_VERBOSE			0x0200
 
+#define RESIZE_ENABLE_64BIT		0x0400
+#define RESIZE_DISABLE_64BIT		0x0800
+
 /*
  * This structure is used for keeping track of how much resources have
  * been used for a particular resize2fs pass.



* [PATCH 20/34] resize2fs: adjust reserved_gdt_blocks when changing group descriptor size
  2014-09-13 22:11 [PATCH 00/34] e2fsprogs Summer 2014 patchbomb, part 6 Darrick J. Wong
                   ` (18 preceding siblings ...)
  2014-09-13 22:13 ` [PATCH 19/34] resize2fs: convert fs to and from 64bit mode Darrick J. Wong
@ 2014-09-13 22:13 ` Darrick J. Wong
  2014-09-13 22:13 ` [PATCH 21/34] tests: test resize2fs 32->64 and 64->32bit conversion code Darrick J. Wong
                   ` (13 subsequent siblings)
  33 siblings, 0 replies; 67+ messages in thread
From: Darrick J. Wong @ 2014-09-13 22:13 UTC (permalink / raw)
  To: tytso, darrick.wong; +Cc: linux-ext4

Since we're constructing the fantasy that new_fs has always been a
64bit fs, we need to adjust reserved_gdt_blocks when we start resizing
the metadata so that the size of the gdt space in the new fs reflects
the fantasy throughout the resize process.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
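The arithmetic is small enough to show on its own.  The helper below is a
hypothetical standalone version written for this note (plain ints, no
ext2_filsys); it assumes the resize_inode feature is set and skips that
check.  The goal is to keep desc_blocks + s_reserved_gdt_blocks constant
where possible, clamped to the range [0, blocksize / 4].

```c
/*
 * Return the adjusted reserved GDT block count after the number of
 * descriptor blocks changes from old_desc_blocks to new_desc_blocks.
 */
static int toy_adjust_reserved_gdt(int reserved, int old_desc_blocks,
				   int new_desc_blocks, int blocksize)
{
	int adjusted = reserved + (old_desc_blocks - new_desc_blocks);

	if (adjusted < 0)
		adjusted = 0;
	if (adjusted > blocksize / 4)
		adjusted = blocksize / 4;
	return adjusted;
}
```

For example, with a 1024-byte block size, going from 2 to 4 descriptor
blocks with 256 reserved blocks gives 254, so the descriptor region stays
at 258 blocks in total.
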
 resize/resize2fs.c |   37 ++++++++++++++++++++++++-------------
 1 file changed, 24 insertions(+), 13 deletions(-)


diff --git a/resize/resize2fs.c b/resize/resize2fs.c
index 30cdfbd..7b98058 100644
--- a/resize/resize2fs.c
+++ b/resize/resize2fs.c
@@ -251,6 +251,24 @@ errout:
 	return retval;
 }
 
+/* Keep the size of the group descriptor region constant */
+static void adjust_reserved_gdt_blocks(ext2_filsys old_fs, ext2_filsys fs)
+{
+	if ((fs->super->s_feature_compat &
+	     EXT2_FEATURE_COMPAT_RESIZE_INODE) &&
+	    (old_fs->desc_blocks != fs->desc_blocks)) {
+		int new;
+
+		new = ((int) fs->super->s_reserved_gdt_blocks) +
+			(old_fs->desc_blocks - fs->desc_blocks);
+		if (new < 0)
+			new = 0;
+		if (new > (int) fs->blocksize/4)
+			new = fs->blocksize/4;
+		fs->super->s_reserved_gdt_blocks = new;
+	}
+}
+
 /* Toggle 64bit mode */
 static errcode_t resize_group_descriptors(ext2_resize_t rfs, blk64_t new_size)
 {
@@ -310,6 +328,8 @@ static errcode_t resize_group_descriptors(ext2_resize_t rfs, blk64_t new_size)
 	for (i = 0; i < rfs->old_fs->group_desc_count; i++)
 		ext2fs_group_desc_csum_set(rfs->new_fs, i);
 
+	adjust_reserved_gdt_blocks(rfs->old_fs, rfs->new_fs);
+
 	return 0;
 }
 
@@ -789,20 +809,11 @@ retry:
 	 * number of descriptor blocks, then adjust
 	 * s_reserved_gdt_blocks if possible to avoid needing to move
 	 * the inode table either now or in the future.
+	 *
+	 * Note: If we're converting to 64bit mode, we did this earlier.
 	 */
-	if ((fs->super->s_feature_compat &
-	     EXT2_FEATURE_COMPAT_RESIZE_INODE) &&
-	    (old_fs->desc_blocks != fs->desc_blocks)) {
-		int new;


* [PATCH 21/34] tests: test resize2fs 32->64 and 64->32bit conversion code
  2014-09-13 22:11 [PATCH 00/34] e2fsprogs Summer 2014 patchbomb, part 6 Darrick J. Wong
                   ` (19 preceding siblings ...)
  2014-09-13 22:13 ` [PATCH 20/34] resize2fs: adjust reserved_gdt_blocks when changing group descriptor size Darrick J. Wong
@ 2014-09-13 22:13 ` Darrick J. Wong
  2014-09-13 22:13 ` [PATCH 22/34] libext2fs: find inode goal when allocating blocks Darrick J. Wong
                   ` (12 subsequent siblings)
  33 siblings, 0 replies; 67+ messages in thread
From: Darrick J. Wong @ 2014-09-13 22:13 UTC (permalink / raw)
  To: tytso, darrick.wong; +Cc: linux-ext4

Add some simple tests to check that flex_bg and meta_bg filesystems
can be converted between 32 and 64bit layouts.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
 tests/r_32to64bit/expect.gz      |  Bin
 tests/r_32to64bit/name           |    1 +
 tests/r_32to64bit/script         |   75 ++++++++++++++++++++++++++++++++++++++
 tests/r_32to64bit_meta/expect.gz |  Bin
 tests/r_32to64bit_meta/name      |    1 +
 tests/r_32to64bit_meta/script    |   75 ++++++++++++++++++++++++++++++++++++++
 tests/r_64to32bit/expect.gz      |  Bin
 tests/r_64to32bit/name           |    1 +
 tests/r_64to32bit/script         |   75 ++++++++++++++++++++++++++++++++++++++
 tests/r_64to32bit_meta/expect.gz |  Bin
 tests/r_64to32bit_meta/name      |    1 +
 tests/r_64to32bit_meta/script    |   75 ++++++++++++++++++++++++++++++++++++++
 12 files changed, 304 insertions(+)
 create mode 100644 tests/r_32to64bit/expect.gz
 create mode 100644 tests/r_32to64bit/name
 create mode 100644 tests/r_32to64bit/script
 create mode 100644 tests/r_32to64bit_meta/expect.gz
 create mode 100644 tests/r_32to64bit_meta/name
 create mode 100644 tests/r_32to64bit_meta/script
 create mode 100644 tests/r_64to32bit/expect.gz
 create mode 100644 tests/r_64to32bit/name
 create mode 100644 tests/r_64to32bit/script
 create mode 100644 tests/r_64to32bit_meta/expect.gz
 create mode 100644 tests/r_64to32bit_meta/name
 create mode 100644 tests/r_64to32bit_meta/script


diff --git a/tests/r_32to64bit/expect.gz b/tests/r_32to64bit/expect.gz
new file mode 100644
index 0000000000000000000000000000000000000000..c0ba3b77f70ea2cf5d32c91f95edf1d40da5b35c
GIT binary patch
literal 3143
zcmaJ>c{CIbyN(~pUXcn}ho7<UDf>1*CcCkWKKls6kSxibM7Hd^kc?qWVeCfK$i5rR
zU>cI7!4O$uEZ6tl``10^-v8e7zVGuq=e+N89^uqW7sOo7Y#INem6&G;3|vX9XN~Q<
z(6NEn1jWYw<cs}e#f8n%7!<w4H?wKt(JXcyK*~Q}xBYB5-X-^#kj|9+&21lx4+}o3
zU5mh>laMD5XOC5D9FjIQ1Yg!E9P}Qw&vkFQAcPWolX`=~Nhj@hx~Ujt=>&^znv2fH
z=7Lb?-hOv55l%w1;^<2aN52Wn4VKR+BYPAwYQXTZ#^!!+bQ`%$XBTt-M18O(@U9o;
z;P$I4Q$r==5r_@*)fR`W{sh+eQ*>_b#~py_Z~MAM-AO}GCy7oWcs~AD4`9&S`RI`I
z?8{cG;o~LmJ<^G!wusW140oa6ZU%)AgNyn3u8Q!nl%p$#xXtqe+E3pcBdm)a{$6{i
zeb<&P^wj7!W^18TCp1yhtW1!?5!u*{BWoipCA;Qni&M%t+5jKMKDJPIC&j94Kc^^=
zm_$~g|2hrF=?EQAbc{dJe&*4~%1%RdyDZs-b+x#(f@Xzq4&1$Taskty_66ZY@~5Iv
z+)5FJ5Y|Kye8Q;fL2sDrUa{OHwF$%=uUYoG&z|WqY>zoCT3*SD$qlX3EvwM^hAgj>
zn+th=B#Gz@Cpt$f9f9_F&DJ75hc41?ZxQL-UpoS#H>48#C#=F3Xty(X(^*b&YiiNe
zdcDu89n`0PUH1e8zT6^aPz^0Y9LlF=_skEp4q5xVmX!jFRt^Q7d}YxK{1eO8W$Bqs
zS1?IBeUqUFZL131Cu{5Ep04oIZ-)o5-Y)=E8KeByAkArs9GZPZ+8b*Ui>KEVXwG?R
z|HLAsyz!&4t+v_rJGVx)*W3mZgu&ycCAP2(|5j~brgyNziAmndAu%UCRaZTB{I1Qc
z#)SB!N=ez=^!AJ&R-t)wm}(@$BAEUh2FC<)aI)y<#J3vL)#mF{_Wx<t<=rU2C%eFX
zSCcnhHnEIuFtlp#N9m+iQ<O|l_d@hVoMoqcsS&=~OrlL?)D5$O9Z+CpGdeWZqPDtS
zSF1X^v4HC$zxw+uaDE0$s*s{FcSUEs)RPYppGPY6JU23$SOml4{7Kq>TM(Lcxv2hG
zT+Dp1WO=C+k1HZqi}uV`y*qO!=@6mj&57EJQk&cv9WYG&7B@Fb@%IGzhzk>Pt0_Wy
z$;@_lM~y3vE7aK1kAwF03bjA9(#vFuC!2M{L+-d%876NR>!Jvthx>RF08z!clxl;_
z!=ijB<+efgUxfJy@#&wj-MU&B9~+MDpA5UeTQ~{s3in(EcA<vcFFr!&tV&4dtQBVo
zx>OZJPT>3Y?xQ|2k&1qUM!rk=aBg6fdeA$z3RyQr%0Pv2PWSdGfGFnD3I_)9K!3LM
zoAoyE2mt~wHsyj}C7aKYmzTb!&S_RK82}@Bim5F*mw{%Q+xm=_u<nBC_DO(V#{5Cw
zF2ACIA@DsMWmNwxNyF<4>*XCkvDEjF-KuJib?ejaLU-|gA1N<KkKkOqsuBCLMO>x_
zPtkV|{Xgo@EAB=sF@Vyh9l>#*K*^4t80Af-OE(p>tf?4r6xsEn;HvE{7aJDtozoae
z=MKdz!A+;N5cWwp<IW``fa*xNr?bqYuLE)!=ER`6wXQK#zyO@QF=@(@V_R5eB6>%{
z$R?>}Ji*>3b>eLn7{HmwZy6!UxvC;+-U=BgH5Ij%_@uFrVI%{L&?A`y1wm^;hA9dX
zc5y@cFR1F`JL_q$<$i&EK7-emI;uhFsqH>=))n!bwamVp3Jt9{1_7pNFz98lnhcAx
zn_o?PXSNrldv8j?8(TwsG@XmTC#Euetk;boLK0YT{BXW;blO_Dq1r9v4eRMnQwW<a
zmx>MSpH-u}8pfj!Br6r;b!GThe-yPfw=dKx2hA%f{2(nlVv2}H$4zsoK{$N8Qfi77
z<abA2WHN^pdNKR@MBSVXjvM%aTR7zOI8+{g23I}BhbpAoHY#AGy1BtR`rZS@mdYxK
zxxYg?VC5PGz!#^AhfK<@j}kWt^eOw3i?42RsM+VfltvlE!x(I_JVow7e2Rd+Z3+2C
z5Upe@e)V?G%E_|+ZKv3!yM_IKS*n0YrP;6Y^LAy7A9&w6fp?pkn6v1W#f(Kkm_pc|
z^b(Ax?~|e+U!O|YrQ1CixejH`o?v^#o~R(FRDLb<DGy9=k)T$7ZSF~}AEv!a8s!%M
zy*Iut9EMc(kMR3@hZF|62&{X982iYOA?pjrNC`Vz;63&0%~fyMLki5rh?xmIfaKYr
zxK?bUEca)}rj-$Bp$PPoz1K(c&;x)6Csac}`^Edo&TTKbvHIM3t--CL5u|zsR<#to
z#ENX~KWH*DXwq`7F_f7EN>LkEY&4$&OTUlA3Fhqjr3|Uw1S1fcxtC3KK0c9fGS${;
z1EI52b4~AvQAxa6zRJ1t{3>$d+2NJRh<c+!UhN9Ms`l{48H*RZt`?Ttg_4G~Z<WrJ
zDmPn9yzRw(6IP4)YmEm|#ekw<Nu;JwF*C9?akt<G!V(Q>w7bLiAJ0Ub!_zPTnit<^
zE4ZHtIC+W(uKL=}cJ(@f`te<2@$)9>HxWa{(^Kw7?zL~lE3bMA_LTcx%QF`K!w;|a
zaR#NAa=m(uv#tKnF{3VaCL}TF;wKPQhn59v&R8i`q$0ySV0JfojQD1QYY`&i5}@on
zC49^9a3!|~^JnimvND=gR%SVrZ`O)Iz{Dgu3+6TS;Q@5ctX|DN&Bt!sifO1*SqVMb
z(7~vlS$7pc9cJ|`4$XTd*8*EPKmO2gRY`o)@b@9OycuWA`amW?_Ijx=AK&nO;&dKq
zZpYX@9_9s=QA9T=s_{YpTAZLdT?uKzVhPCn1}*veVb)X;sMEiA#btP_FDl!|1ALif
zAWh9yU$OaEEl`U*3ezZ6Ez|R<0gs<?8}=}nwx0pF!exTvF)Om=a<1c0rlI*q((D1U
zW}H7{6L+K6F2kpY8a+0JI~BwheC|I`erwdP1pJKAu}5)aUjNpZG2MZ1{Ce#%VcI0i
zEO%$Rji5Td=x*!v)i1xivixQIBj!>qw};sZk@3%5f2X^AfYrpk9qg$?WhV!|vToV5
z{<KOok;_h01}1gCe86_1dQ;ksomEkEZDG=MVb|rkDDQuAWZQ$e^s;u2#nW<0LuP-`
zGYKPd$u9nN9t0n3o<8$t*|-Q2l&8mx8~(<6j;BtfZ3(1_Dl~&3ui5=MuQ@2#e@M=f
z@QWI(<Ub$JDJW%!13ODxLIDs??l83(*;2ckdGaX<;3Resul)x;0k^^|Bjal&h--NU
z!hhJ{tr3lk7HP85I|x3K6~p}vh&#w8FL+9|jtzMWUUOrzk4-}fi^jA_vu|`Twq_n(
z6(1VDUt8k!T<3+_jDP5TP)oj-*Rw%JfphVqb8$=!#%c_^Q{>`@xiN(<Wn;|_T}e<W
zat~(qR{|=-1j#M`V+b$ozYT#bG5DSunUDFwHk;g44-WE!?qZ>iQi0HW4*d^C3ZTF}
zrMkTvc8(01qYfOkkup)dZ-=Mns5bW;FBUkgB&^t^!oIHwKwmS4Wj0=|qRHi@G}<}w
zGt2t{N8BLmkB0f{wtk{+*DRmE=2jJkiAtZ@*}M<63xkvXod$Q#4R!y^TvCoL^D*V<
z;nyEa3X2?%15rBry<@ePyG&`{zW7DoKuH%y^vrQThS<gY<Qh03+&XZ~$O#FWrUc-S
zu>mo&g1TwHIY!Cc^@%49?*%$%GGd&@@oA05XTNsk)gBl&9l`hZ=wn$T!~v9H+no@P
zmWfjy{A$VLdfcf*TWVm;w|{Oea?JkKLmnr*#Qi=qYtIg>w+#3bwHJND)BH8;z;nr#
z(i=cO`prm*tu9!&$(A15_U3e1z;K?<v^v@s7Wvy1(bhBha`RMpo_5j~MsH{)$5w{X
zMHJe|bx6u7p_NP?q3s9www*Q8_P0H7@2}m%ouCkHqyjQ+Y&xK)nXEwDKX3i-G8P{e
zNoW6WWO+ba#`*di$9Y;o7#&H8@ojVce@a(Z-22n#VUbP$7x%ehPkb0XmQnWUJ)2$E
H3m5(isi#o3

literal 0
HcmV?d00001

diff --git a/tests/r_32to64bit/name b/tests/r_32to64bit/name
new file mode 100644
index 0000000..fb45fab
--- /dev/null
+++ b/tests/r_32to64bit/name
@@ -0,0 +1 @@
+convert flex_bg 32bit fs to 64bit fs
diff --git a/tests/r_32to64bit/script b/tests/r_32to64bit/script
new file mode 100644
index 0000000..73493a4
--- /dev/null
+++ b/tests/r_32to64bit/script
@@ -0,0 +1,75 @@
+if test -x $RESIZE2FS_EXE -a -x $DEBUGFS_EXE; then
+
+FSCK_OPT=-fn
+OUT=$test_name.log
+EXP=$test_dir/expect
+CONF=$TMPFILE.conf
+
+gzip -d < $EXP.gz > $EXP
+
+cat > $CONF << ENDL
+[fs_types]
+	ext4h = {
+		features = has_journal,extent,huge_file,flex_bg,uninit_bg,dir_nlink,extra_isize,sparse_super,filetype,dir_index,ext_attr,resize_inode
+		blocksize = 1024
+		inode_size = 256
+		make_hugefiles = true
+		hugefiles_dir = /
+		hugefiles_slack = 0
+		hugefiles_name = aaaaa
+		hugefiles_digits = 4
+		hugefiles_size = 1M
+		zero_hugefiles = false
+	}
+ENDL
+
+echo "resize2fs test" > $OUT
+
+MKE2FS_CONFIG=$CONF $MKE2FS -F -T ext4h $TMPFILE 524288 >> $OUT 2>&1
+rm -rf $CONF
+
+# dump and check
+$DUMPE2FS $TMPFILE 2>&1 | egrep "(GDT|bitmap|descriptor|^Group|Inode table|features)" >> $OUT
+$FSCK $FSCK_OPT -N test_filesys $TMPFILE >> $OUT 2>&1
+status=$?
+echo Exit status is $status >> $OUT
+
+# resize it
+echo "resize2fs test.img -b" >> $OUT
+$RESIZE2FS $TMPFILE -b -f >> $OUT 2>&1
+status=$?
+echo Exit status is $status >> $OUT
+
+# dump and check
+$DUMPE2FS $TMPFILE 2>&1 | egrep "(GDT|bitmap|descriptor|^Group|Inode table|features)" >> $OUT
+$FSCK $FSCK_OPT -N test_filesys $TMPFILE >> $OUT 2>&1
+status=$?
+echo Exit status is $status >> $OUT
+
+rm $TMPFILE
+
+#
+# Do the verification
+#
+
+sed -f $cmd_dir/filter.sed -e "s;$TMPFILE;test.img;" -e 's/test_filesys:.*//g' -i $OUT
+
+cmp -s $OUT $EXP
+status=$?
+
+if [ "$status" = 0 ] ; then
+	echo "$test_name: $test_description: ok"
+	touch $test_name.ok
+else
+	echo "$test_name: $test_description: failed"
+	diff $DIFF_OPTS $EXP $OUT > $test_name.failed
+fi
+
+rm $EXP
+
+unset IMAGE FSCK_OPT OUT EXP CONF
+
+else #if test -x $RESIZE2FS_EXE -a -x $DEBUGFS_EXE; then
+	echo "$test_name: $test_description: skipped"
+fi
+
diff --git a/tests/r_32to64bit_meta/expect.gz b/tests/r_32to64bit_meta/expect.gz
new file mode 100644
index 0000000000000000000000000000000000000000..dc1e284e195e56235b166faaedf2e0337018ab08
GIT binary patch
literal 2955
zcmb7^X*d)L`^IHwELp}nw$Vu;3S(=r3|`yFObkPbp_rm+vP+g|2*sqaZ^MvfG>m<h
zee4+{=SYfUnTC*xL|*5^|GNI~`|bU9e|Vnz_gvR~J-1RaKL^<3=o%;2Mh*-sPZz^e
z1)?pvTC8gZ);2{KXt}VDpgc>OnTVP4<g_*V6oJ!d^sis+YfR|4)5EGbD{-H??lqtE
z%Z#hL({or@t;iwn7nHi0ie5|Y{mh1TDK@t8;kq7uNi9r;9)UX-L{fV&Ro;@HZ6YH-
zU<;n3JK7{Z7%laOG<C64t4#7eenEdDBClf$9H{@ofqo?Xc32NxP5$tdsa;~bV&%s+
zZx%zIrrwJO47AFN*-r{|rt918Io&NS?llt}s77{0;d4`hL=z5K&O+=|j-L*HcisHB
zFdy|UIeNQkf5a;no0ff)FK;jZZTFB><8G3yu?(wU<?-l}2ulcmDQt78Kykml&}EQ~
z(<M%4YFKoAxKu!G?N)e|4{FN3l>Gg6oyo?yUGsvHhiA^=mk}TZ*_E%|!uOKl<+if4
zANJ5|(?qz5?PZwQw(%*Xt&MH6+=8E=c7(AKzHbcjff$>-xTE*cJx7W+-z9iQ=ZftN
z-`f_%{S^>8cY^O@b;!iryo5)%>%zSYoyabei9H^ngBu$hdlBE;4lI^0;=VJR?=#KF
zYi#1l1;s{fcCRn(9Qrapf7-TOV#k=*FF_Js>2CAKstf#0y(9ZT6Wztn&pJEBHg}+f
zCN2V<j$*4}CWFw<o?v5}y_s1+PR`r&@ofjQ)c%_b(R;nv&X&WSgk7Z%(YpqDmj-`J
z*LY6`wr@SVl~&#SxZ{8*iH-LfYRKzyd5Vg0s0^H$DDgDPB>K_lHT<sXiEjh#esT_v
zHPwM$f*Cfv!+%a%R;<EJ!(w%Govz3mm&BzCJNi6(9tUP%`b{kbCJ?QLqFj9dVtz%M
zT5y;rowxG*!g!z)-VuWrOhmj&0pnuQq_qqJmsRWQY>ei+Ei+^dwR8&gL7DK^*R&@0
zxP}3lYS&GDj4%Bez~us-o4^q!85v6t%!1YLhznbXe-hMn`H`b_(S<7$41j@Th*?YB
z#Ul9`gm@LSNB5!--kJmz^%r<p>gRLc=M*=tQ%OloC&xivqT<ey<UZ!~NM~N^QpTi9
zP(_kzi7xN+(X44qW|F7gwJi-4sNXHvMx7~sm^9>odtPJ%x-%{lBRD>?cIweOMqgl2
zoc~s<<THOUQ&P~EA%h^>A491Q!B#eM1P)^o!CQ%BIxsaieTuXAN`L+Vp1b6eLsa);
ziTN;bIcwuu0>k9m6Nj^G@AQ*`^uYW5WKiCAnXU(+#$sCFd6sF}<2V#l>phtVY~C*$
zt0IOmRONU*oh{>M{*g)HS>(>PCEXpGB=gcd`&(j3=@0I4Tz_4cgg;&EaKBGP2WVJr
z^OpfjB{e}P<>K7dNLISR#KV_8Tl&L$dA(z)I*!W9v2cCYRrlC&dCm)b?#L5%FGK_{
zDPMwef6H7Oi~F<<BlpSZfY0<8VZAjfN_MLW>b7#>2rB1`fg-x6?`nd53Rt!50wklF
z9K>h;?4`r_>2_JMCA+HRuD{>F$HS?b5;p;<%6^z24UF<`0|Wsz@Su@n66yH^yH+$!
zRv8yuLW!PIqhoCsT=!s2s##7wKY#3*uBy6#p1VZ@JiScsQNF2_jFJ~N7eX9*_DHya
z-*>DeBC^P8&_dPL$mFT*wOli+#Pz>B{DIEohbQlpKPmnRl9PcurP)d26uy(yVK6>f
z6_p;EbyX#p7(yXU#jG&y_*gj0YIuvy;xd4>MYTZ;ub5Xzlw~Vl1>SA+OPEs_(A;S=
zxf?Hf+6ir)Ehg@%snam5o?}jhAm!fXgr149xDm@oM|)vrfZLU|;=B}kJn{z7*RG%P
zQnPGzY}hlYT4<<LPvas$1A&Z*cTtqc!%9gJo2DR0XZr~n=PC6OH)x4qSr*aqKpk3E
z`}-}0vKVv$f;8?}5e6v3-Lh5@NNcyU;zZ3VaGIW<?~A(`Qov@~F`kKUqFmU3=%eJ2
zY5k)-AT&x=yneX>z3}_bo*SKIjcKkRR>KoX|8~+FLB}u$X@V$<ed-ca=p2<JUYhJR
z!n1Dv6*6l6Ra&tQ*_c)Y>{RUV(zQC!w|MWEj1|J3mp5Td^~j<he=`sA7mh6*dpGSj
zvCV~EtQ)cq7*f_x9-Wn9J+lAy`~qyGCP=nhv@+$29uTC1IkgLcAna@;u6|CpK=4cG
z%%Thkh&HIjqz|p4;&+-2OJr8=Y<}JtShtuDQKG$yRuuYik$`A|T7-=df1dGI?T|DY
zP$(xv*jBnCOsA~{)np!VOg#}GX5F|Qq_BoEa8}b%pXa;@Jy(p{DW4&Y+B;f0^#bPN
zZQHeUc}-!B6-WXc(t;}|P2InpJ~Wz`qeEvTK8_5PaxO0K&KjtUs454|$GJk!Nd|;Z
zOFB9s@4DmxBM?O5t<uYt<1*6Fs!9+sq_T)tZqi>oTNds-W94jp-hoEzThmb;8g^sl
z-f~`#rPN9>9u_dIrisF3iT;KYYRJV~vf@F@AJ6KslPcYrks6a}D;ll*PkU%*vlms!
zHm(_lC_J}UI3}R3CaJe{1xd;(2-S$OxMdVz*F;OFX&!p<tC*p#M9fhru6bI=2~S6k
zNh>SZju_eW3SOOj%7o7(V(D8_Vz9dUq@~go*MAH&V=Mx^6z%w+Qbx7nBg1J$)obK?
z%#6%_xU2W~e&jagO-W<BlOBa5ct+~GW;}PmA0^IJjY%*^2t%;{TRY~&sXHTVaI?y*
z{02)HKn0rFrtVe`c}{ggZ;#{=XhhcosZv5p@B2I_bU<;JCpvm?k5TE*V+`Q`dZkl5
z_l>54WSFViP9Pw2k{)OC#|-^z6SBfMlJ2jL^_jBRbXN_2srQw<x9=;yo}u>7fSC7i
zVp-k&ZhdEiljaPC^ALowQ7+WCFNVC2779OK(PxadlYJYk|4!P?2G~NaCh)dyzffbY
zeMn=o!3EaG1}-%3@HcL=Pms%gtvs*sw!XgUYom#_G!-d0OuhDb7UDKH%q-{^5_Tg;
z{OkV}MGpW}^mqkh@z$17A}ln-U;Nm9xFN}wahV@DjSkU&I-YVyeRdA~`uM`P1-w4o
z&f$grb(Q7y-oiv)y<7`lD<@*0gLLd=7qkM7JjzcKbjZGqdVAR#a=blQVw^yG0GEpW
z2QPDsN5#&uZ=;_4|CR`61=QBGh_lOGRD9&-0&;?#^Ugu)XvYzte&%(nMUu2e@4W7d
zK!tqt^B1-?g8-6;-I?pdt>}mXAVm~)qX=%PQCqMNG|w5Qr}<RT>~XErLL*Zw2-fwA
zLEazsNoO@G`XUxxWv7BgXk*jr=b${V?_}J<IPE->#Mmc&dSX0PQ%rQds;L3RYlUv&
z2VQ&(g9WIx8cy8>oShAW#sQ1AYRpnZ#)-4ocY~@S*P{++u?1yO^Gjnbm>;K``qADj
z9H^2$*|tGi=tf*gUb=4>P%|F{@|iWuJ-o+v^!~T^#Yn|Ll+KsQh24axCYI%ww#!#n
zhZ1XnKf@yEmn+F<LmT#27$F>q-|rhIZNP_^B)`U}wnyP7=LeZe+t=EeTeA$0g~NM_
z`s>;Xvo4=m`|ZM5mjxy@Mj-UpW_DHIPSoyJ2I&`dSLh)CzFin1a&*Rgn=!h)F@<ce
z2WkiO+qrN5hdhv|2L`eR!^VytPF%1npn6$;`KrBvrA%$BeYRPN-OedJen(dixr|Kw
z^8X@d@VB&f<fuQhgLUfw^mjGFT6-EE9Rt5iSRpG>Yf9RS(Vqu5UtZrQXI^2IUX5U-
Wd;GU73@H3&a^fbKU_x}`;P^N2d#Bt0

literal 0
HcmV?d00001

diff --git a/tests/r_32to64bit_meta/name b/tests/r_32to64bit_meta/name
new file mode 100644
index 0000000..d83492e
--- /dev/null
+++ b/tests/r_32to64bit_meta/name
@@ -0,0 +1 @@
+convert meta_bg 32bit fs to 64bit fs
diff --git a/tests/r_32to64bit_meta/script b/tests/r_32to64bit_meta/script
new file mode 100644
index 0000000..5d02114
--- /dev/null
+++ b/tests/r_32to64bit_meta/script
@@ -0,0 +1,75 @@
+if test -x $RESIZE2FS_EXE -a -x $DEBUGFS_EXE; then
+
+FSCK_OPT=-fn
+OUT=$test_name.log
+EXP=$test_dir/expect
+CONF=$TMPFILE.conf
+
+gzip -d < $EXP.gz > $EXP
+
+cat > $CONF << ENDL
+[fs_types]
+	ext4h = {
+		features = has_journal,extent,huge_file,flex_bg,uninit_bg,dir_nlink,extra_isize,sparse_super,filetype,dir_index,ext_attr,meta_bg,^resize_inode
+		blocksize = 1024
+		inode_size = 256
+		make_hugefiles = true
+		hugefiles_dir = /
+		hugefiles_slack = 0
+		hugefiles_name = aaaaa
+		hugefiles_digits = 4
+		hugefiles_size = 1M
+		zero_hugefiles = false
+	}
+ENDL
+
+echo "resize2fs test" > $OUT
+
+MKE2FS_CONFIG=$CONF $MKE2FS -F -T ext4h $TMPFILE 524288 >> $OUT 2>&1
+rm -rf $CONF
+
+# dump and check
+$DUMPE2FS $TMPFILE 2>&1 | egrep "(GDT|bitmap|descriptor|^Group|Inode table|features)" >> $OUT
+$FSCK $FSCK_OPT -N test_filesys $TMPFILE >> $OUT 2>&1
+status=$?
+echo Exit status is $status >> $OUT
+
+# resize it
+echo "resize2fs test.img -b" >> $OUT
+$RESIZE2FS $TMPFILE -b -f >> $OUT 2>&1
+status=$?
+echo Exit status is $status >> $OUT
+
+# dump and check
+$DUMPE2FS $TMPFILE 2>&1 | egrep "(GDT|bitmap|descriptor|^Group|Inode table|features)" >> $OUT
+$FSCK $FSCK_OPT -N test_filesys $TMPFILE >> $OUT 2>&1
+status=$?
+echo Exit status is $status >> $OUT
+
+rm $TMPFILE
+
+#
+# Do the verification
+#
+
+sed -f $cmd_dir/filter.sed -e "s;$TMPFILE;test.img;" -e 's/test_filesys:.*//g' -i $OUT
+
+cmp -s $OUT $EXP
+status=$?
+
+if [ "$status" = 0 ] ; then
+	echo "$test_name: $test_description: ok"
+	touch $test_name.ok
+else
+	echo "$test_name: $test_description: failed"
+	diff $DIFF_OPTS $EXP $OUT > $test_name.failed
+fi
+
+rm $EXP
+
+unset IMAGE FSCK_OPT OUT EXP CONF
+
+else #if test -x $RESIZE2FS_EXE -a -x $DEBUGFS_EXE; then
+	echo "$test_name: $test_description: skipped"
+fi
+
diff --git a/tests/r_64to32bit/expect.gz b/tests/r_64to32bit/expect.gz
new file mode 100644
index 0000000000000000000000000000000000000000..247458404cca7bca9b2be6422c0a5fb6c1b59e4b
GIT binary patch
literal 3175
zcmaKsXE+;-8is3>Q>v)dPiwSl&qkGerJ9IMwGAmnVkZ)6)<|l!r&g`nRch51YOf|n
zQ6-2~lo%~hD^`da<@mmHew^$4IRBpOdf(@He?0f~iben9%r*E4l$nK=Vd`<Ei8uU1
z&(SM3?*zNu7fxFv4Qu*>1MMFTt}O7dezHVQ@sU(N^`XXR5ZQ7J6aOLW<S5oC;@#Cs
z0OTIM#DR!+Frx))xCQ^wSEl2C<wEyY^+Wq0iW|cJ!!7@g?;YPY2&=&}S7v<&EcqOZ
zX8n{m{M0?Tr?I%Hz@KgY8MJjj4gB$er}i4%ZqGZ2&iH;jC=&Pax91eCzKC0k25XJ~
zv)uxT=lot;BNrI-wdu^5?Ng=~h8l%#=D!{}*2cw2xn?#z{=IcU3@}Wb*$`d-IDa4s
zzC+t4Z!G=vYY=JH4~B)zdTg@*=poyoJ)o0Gt+j^LkLzGuB*STrIue(qLax5*>mdr~
z{+2BiK^jVboKBZ+why8&?v$uf?6((|T<V2ZGcu+-7*3zcrjJFGYNmqS+R?j3`w|~`
zsP@|F`z}F^cWSsd;>TC#y=G}X{5J9Mrs=gUL_^0RVRNhF_g%gSS{0o^U7R2`U!Fm<
zLi!GOk5z1tlWq%Y{Trh78MH+Mz9rDq$9{%lP;hI@%;2@M4f>3ICuL22?~=I4Z?%rV
zj|U{rOAKxMzGo-2Xj#;DyjSq<Prf~Usqf@2!TuIaEo&{BQj6$x5iUmEr)i~GY%jxq
z^*8$><w?+*O5dGD;NYz3#-=JW7uUf|Kx_vhzP$z5Vm<qlJ8;BSv%QK`wr=God6_fe
zGgv$o>9a$va>60`Vx=`M=OQHx06`aONb6$ctdmWe|9A-Qcx}H%=;V4NzAnB`SXb(C
zxK3*3#TCYP)Y|)};*<ZFKD$m6Gv9l8UeMIk%_#Ck*Feb~IdwA~IywAuJq}f~x{TjU
zq_%=<u5ewiE!KE^aWHS22!oU9JUe}Cs@8Yfu`2D|W41sMtasaY%j?kh6&~lgE@$#z
zjbz+@(kehh`sJ#!9uE19U<U284AwPpZ3Cj>Q&2PgM)afzc{$Snsqrp#>25dNvBl2!
z3%1Zl)%Knu(R4C>>^IO^!Yc&T=j12`yo8$AgE=l<)Z%GmCoC~lRG_YNH`?JuxJEAR
z&#y6&HACxl%$mO5qM0rDg7W8?qs51G?-L&+=w7SJTK;5`<`bMPA9<RXa_-0R;2ZMi
z!h_>E6;aKAw=&Y$&}?fNOb?5VLx3n}MANG`Vr|~Jx!MDwku~gJ7=zFuK5K^b=6jNK
zLY#5T*pZrurRH6m9|@6oWb@4ts)GX3>}#`j(Sa#N>`KR%oD*<R2x{U-K7Y~@)A~pb
zgxj+tFHLh!r772^ad1!!I%%Ohqt<L@gc+#?9qn*9@bWZ$&a>o%AaR3o0=I#W&f*os
zRhIbVDhy|j`XDgg50(~2l$^R1rhvk$6X)V3J(zuxjoi__Gtb48Kz%b&#^OfbE)PB9
zo~bmrIvEu&xoIMZEiaFEXI_}(cEFlyf|HWr3H>`fEYZ+U@6K2S_dAtzyIpjIhI82*
zgZe6?x>1P$g=$?B`?kC=PwBg<4yx&e4xq4GVV+OfZZJuS78n|ZesRHZYQoO>GdumE
zl4QSpKB_m@j3tg<bs=p)P&i*V8>K8H_4bp<&48F;X*;Q$w&$-|k>Bjzd7S-o8H_V3
z6sW8Ld^QBJv1Y52Fk~s@yKDn1@7~Db3qJ|5<@s^i4SJ<O*JMND1&A<NADy*U83#<N
zhNSS01ZY1j&!@25m!L+knwRMLJp04_TE>->w@*|{2?Lgt7k2z_V8d-$7)*Yfh%#*V
z<(o);2VOId`OTPKZ(|n-DGyL+1*Dv<2=9HQ?j7Z^G#1fHP|OqF$(QX=j_Qo0OB~c&
zL8G)L#_vaa-U5^sah)~M)6A=-Cl#MJ19nCx#5S;gl`H)c#Dgvl%LD1fCezMrg3V<H
zO-ccp`#g>-^pA}&3tc6er4E(J9UsZz>(!2cbzQ5RSCCH)|GXEbbv6Lqvic4I=uPty
zzWIJD-ezxGL-4Ryb}Q&U<D9==>3+V-NY9;ilH`8pAsF;4qDNYR?u@*fjmWKEQ3rde
zj9NDN#$<BvCbGu#+!F{duxe9~joS!JPy?Y_II|EJNQe6W46Z#+SQU?z)*Zn=RV(5z
za+ds!wwRDDLgbKG$CUm|ULGIN*Il>=@?a$xu>H)4=LoiQg*txi+P)?a9=5Vq+}!Mv
zT&#KFJa<0*^PX=%Qr8O(j!jYaSm2L&?Q3~ofNWKDK4WIGWV=oz74#Y8T@KXPmN1?Y
zmQUmUVi=&X)GkHp;8?9T>}*T)42SUII0j*3mJ7MUm?}bM7;hqE&4iZ{AoK=uUV#&7
z!0&bcjJq<%HMn535QJrsPn*1HG`uMlAuV6`yfs_$Vx~YI2G<I{Ta?q;-CN+truyvR
zYbIH#t#SUOK!AK*yZ1u~WFI0HgO=nvYEpYRm;XSjAhHtyg)H{qN7=Gm&+&F8xuJ2A
z;QwF~UErtJ`D<IH%Q36Z$bD`<o+p<*g^br5?);0IxZCik{K(=c{*3^N0u0S!9Fjh?
zTWpl{IAy+((}pYMgHq~bdO|WQNa#Vok-MFttSTGC^j1B(*Og(1&{6#K?MpEzsZcyn
z^Z{Wsyz@cAjo4Xo?W<ldv6sT=T-56ad<8llTkGaBo=SD)a;^GN&Fr2xYD;^)T)Tzk
zxKUN312Oq>nD=tbIHfv8l0iDJvvc`&d9^o%z2t@PtFP~3&=NhpY%<Il7Hz4L`F)Yq
z-YB~oyiM-7uzWm90x9$Iy6v}Q6Z;7pChOEP!1a}0c}&Jri%i8jCxKT~s*^L_wEeat
z$w^J9Fg0XV|Fw*Q7|~r&Fs-4YpzO@w^(R(mLN!YR?9n2~SYPdKjP6SpXMrMPMf<ZY
zgakf8aAIw-%pxAIQm5ePY>*7|?Y(In&1p0K5H|OK%ez-N+JUf*5rq*{>NK5sucPIk
z50y{KW#(Xj^86$ckn^gc^$dS|KPQyrBM1t=qt080!ek<3O<Z5k#E`;dV-s&)T-bC9
zD=|Hac10tFTh$-Y5T+v$rL(4(GK*UeIvi*3%{Laf&J?|+7A<wZzgrPn!>*9c?__p=
zlVy5IAfr#tuBdlorOGMK8VM-JCI9j{Z8S+QJsB*P_ztXX<|0t(cuqwE?6GVOnKD2(
zvghda3JJ;2#axQ?<v7wzw`hx_w#F=BGY}fJiD-t6wFDPs%ScMiT+*)n$*TZ|Q~NSG
zZsRguSGm?`)KjQYSPo%}4^@>g91FLEsFj6^kb={ByP}7tcS<GRGvg{h<O+Uwb22B$
zT$2-3(##7uyB5H5{zhJLrGrm-9)QB0`{FTP3D%aL010bR{PFhmzJDo?L@5VZatbf^
z?=PbG*Vvqzsj6#7KD}R}teS^^F*2wizy42i9YvTwpw#EOypRR&1fJRN^i}933#1AZ
zTETJ$F|jH7wF%Ix;KT~TcD;&QMW~3`E7tVSe&sH<0!U*;t8Wg0K&$;{Py=5t<;1Tx
zOgIaM#UE%zV~{#$G|75=IOS>AHbB1cj>FUE`T`$9waAZH&k><lOJL3*ujbK<sZ`Qs
zp#47OT6*!pDH6<#zmOQetsnp`s}22&iB#&}Oq6Z#7pfP}JHeQ6snr0dC7(ynP|j*y
zCAN-=scAl^NYxkD#G4HWY{sv4JbxR}8mLMkyn8&&{iq1MDd(RICGFa0q9;bD`hy$W
zOSg6WHP*$R6nA4YS%2hbO#4NjP>yPW!Iyqo-SW^T2s1huE=%5|16N?TON{MNM1+6S
zjf~xz53pI{>=j;zpl2Gs89GI<!yhNmUugRJ({Zt4+h=r6t(9LcbMD(8X_~ag+b3jB
zp#vJVhYL{-AJu-&`NUtFGkay-Z)zNI&i5q4@9?C0=$N!z5VH0VJRRA(3&-xEl!--f
zEOBS}<OH{@gC(TFsc>w=&TvQt92-DLn|Y{Ayl1TZ`{zmlcRXWw>J4t$e!1YphPZRE
z1;-}gmOF-xTX%+)i94rD|0_dM;n-ggWnw>Z=f`1ALE16r(rL=xaJL3y`2YF-EA1p4
qySyunUGASS7E6U66r=?&7a*|9Y5yaSjn@lK)W?8sqC1E)XZ{76j8Z)S

literal 0
HcmV?d00001

diff --git a/tests/r_64to32bit/name b/tests/r_64to32bit/name
new file mode 100644
index 0000000..4c82371
--- /dev/null
+++ b/tests/r_64to32bit/name
@@ -0,0 +1 @@
+convert flex_bg 64bit fs to 32bit fs
diff --git a/tests/r_64to32bit/script b/tests/r_64to32bit/script
new file mode 100644
index 0000000..83a37bd
--- /dev/null
+++ b/tests/r_64to32bit/script
@@ -0,0 +1,75 @@
+if test -x $RESIZE2FS_EXE -a -x $DEBUGFS_EXE; then
+
+FSCK_OPT=-fn
+OUT=$test_name.log
+EXP=$test_dir/expect
+CONF=$TMPFILE.conf
+
+gzip -d < $EXP.gz > $EXP
+
+cat > $CONF << ENDL
+[fs_types]
+	ext4h = {
+		features = has_journal,extent,huge_file,flex_bg,uninit_bg,dir_nlink,extra_isize,sparse_super,filetype,dir_index,ext_attr,resize_inode,64bit
+		blocksize = 1024
+		inode_size = 256
+		make_hugefiles = true
+		hugefiles_dir = /
+		hugefiles_slack = 0
+		hugefiles_name = aaaaa
+		hugefiles_digits = 4
+		hugefiles_size = 1M
+		zero_hugefiles = false
+	}
+ENDL
+
+echo "resize2fs test" > $OUT
+
+MKE2FS_CONFIG=$CONF $MKE2FS -F -T ext4h $TMPFILE 524288 >> $OUT 2>&1
+rm -rf $CONF
+
+# dump and check
+$DUMPE2FS $TMPFILE 2>&1 | egrep "(GDT|bitmap|descriptor|^Group|Inode table|features)" >> $OUT
+$FSCK $FSCK_OPT -N test_filesys $TMPFILE >> $OUT 2>&1
+status=$?
+echo Exit status is $status >> $OUT
+
+# resize it
+echo "resize2fs test.img -s" >> $OUT
+$RESIZE2FS $TMPFILE -s -f >> $OUT 2>&1
+status=$?
+echo Exit status is $status >> $OUT
+
+# dump and check
+$DUMPE2FS $TMPFILE 2>&1 | egrep "(GDT|bitmap|descriptor|^Group|Inode table|features)" >> $OUT
+$FSCK $FSCK_OPT -N test_filesys $TMPFILE >> $OUT 2>&1
+status=$?
+echo Exit status is $status >> $OUT
+
+rm $TMPFILE
+
+#
+# Do the verification
+#
+
+sed -f $cmd_dir/filter.sed -e "s;$TMPFILE;test.img;" -e 's/test_filesys:.*//g' -i $OUT
+
+cmp -s $OUT $EXP
+status=$?
+
+if [ "$status" = 0 ] ; then
+	echo "$test_name: $test_description: ok"
+	touch $test_name.ok
+else
+	echo "$test_name: $test_description: failed"
+	diff $DIFF_OPTS $EXP $OUT > $test_name.failed
+fi
+
+rm $EXP
+
+unset IMAGE FSCK_OPT OUT EXP CONF
+
+else #if test -x $RESIZE2FS_EXE -a -x $DEBUGFS_EXE; then
+	echo "$test_name: $test_description: skipped"
+fi
+
diff --git a/tests/r_64to32bit_meta/expect.gz b/tests/r_64to32bit_meta/expect.gz
new file mode 100644
index 0000000000000000000000000000000000000000..178d7bf3e0f6db3deb759ee39bd37567a7945dd0
GIT binary patch
literal 2920
zcmb7^X*d*&7RQk-W`vr-NDNZhm$7DV2$5}wA=$>h8>RAEs&~8?S;BiUXeN8c%-9W4
zO_ni<?6TA-ds7o3OWeBme!0*6bie(d=bZmJ=Q+>s`AcW;bI5ufIddIj71>38Y7@bb
ztd9h^+Naz?pMTA2N9h)O$NWRNp`#zlk3Hkja<Qc&vjA5@C)tC>RX~9moH^Vtlpl9Z
z1?4PUs`Z*sx+%5qvoqxN9_afyHF|T9^8R7R&u48jMq9KhpU-7I_8s5M^upt?=0@o!
z>{E;T%))kBUtSFCoMN45{Yl<E+^R5H%2=RnFMZoW3De^Bwp$m-CUJL^bh;lZ7AoP1
zf=nB?!cMBFs`z?6T;1$kn?iUVn>b-(k(zm`X(lXjpIf(Z<x5uTq?ks83CAao9Fy*s
zLoA>78A+Sd2a9bC-Tv1{bl?)t-ad1y?Y(iD;G$hq)p4&bP<(1dKG52*bAIQxks5{(
ztdT#qe1`CT-Vk@Ego(L*whE;%<a;+ZHh!+GGU}F|F-<-7=d#vRSYkK2{aD`QN5jEX
zR@5H>UrioP&O{rg&&++h)7s54S)%mXt38YV(#P6Ly&E?A^)i!{I(Pm{>0Z{4MRtRc
z(9byD!Of1|#D%!-nXi^%$-1*9tnI9+qqb-BK6&AX+0v5SjKSy~E((-A^LmlBly$Cp
zD4rQhN!7nzdWX@vu&^E+j^KMyuBL0AaBJ2mGOMhAAdPii29u~Gn?S_VCY^C#dY4&(
ziGi<ruI+8ubTXnT2_#WZ2a&$`)(5s<m<A>Cp4!{kN-zw&p?bNu;>UW&PSW07>!e9n
z($3-Bi6q*KMnmKKCFmF5UV<1h9?_hFKP4}8iDICWG41uDTAU?_<+>Zl7w9)yZ~b$8
zMAcn}pDtH!NaT4$1nQr+z&rxvHR3az)QHJ!018~)`3KsPPg?SE|7j^c3(RPGO1fxi
z1v)yI_pd<34hN|HT9o}GAi}4;WVsrE9|3@bE3QgdEO0=FJR_I`9btiwVyGz@wpvl1
z{)XXts!u$tJRg7K%L`PsZ%gh+x9!gG*24?RKFZ>&RQeY0ts%zbA=w+RLEw7aLn>|-
z(srsaXyJMFtJ8eoxRoZXJV0a<02Wh+3rJPc3&_{fEqO0nnhW>s?}DLER+}_(YIWY=
z%fJmd>gLT^(Y^GmF46wzSI=`K9vQHlM6dxuiDxs2ZckOME0Ba6BGVUC<$^?0i>?4n
zdG=B0T;uH~`Qy1~g$K|VAk(tR=Bu*>kt^As3#3cdu(>c<PPC=mjIntNwxYGhnj8K{
z`phoyxFwBskj!(D`b{2|jSi6ONU9Vlahm8;fS$@-{*Z2_ulLa|uv6A2ty=4}HCLg4
zP5|&`yR{(vLHZSL47{8v${AJh=9MWh<yr>VN}(Si(1-!xAbR^6Tg-lX$az&4ahq2a
z0?9W^TwoE=XSDR`API>Pvh}+e*Nxd!kiI8*<+C1xPppyRoH7(4oY;1c+`Dr4+E((s
ziub*QF!CGSQQtBM%ld}!z~gC1vuvi@Xc0wwWIuM+?iMz<l$WksDE(jw*J@{e$;tjp
zXh?Zcu0rW4UvX<c^>RvMgu>a~b8?mRXojm3m8-?lhA&Vh)UwI6c3fZgQ!iWS;!Lpn
zUa+7j01271fvgIQS*e_LQLf=7LNB=8+2be|)KavzFBF0CMO>S4MNqlBFi3dglA|5G
zFy&oFHHM?qan(aW^WF#9I$|U|Gj~>R%pq9s+JUM%q0$r%Bjjy1R>KCp`uPbq^9)x<
z3o0)KgFOFkqgn%B{hUzjOB!yhg$<(m`Hcln_#<6FHBHXId|p^&dX9(->I17H+uPIM
zLURTkG=}GOep=^C;ce(e0>0jMed=(ncVD(4(;aOQ<Bs9>%Ut##QqvK0wYjZ%<xt}8
z0Z;aM<$j&oc>Z7!Fx`I?dYY_ovgl8jFjeZW0YiaG$#7JP*FYFb3}rDOkj+*^-P;@-
z*y#D<<X2%v+)empR|y3ImBf8>z}9XOns&>IgCmM9Vjg}fm}M=nDWPS!W~A3Uc!>sf
zFKvk*F3_TDhbn<Rd@{$uS9%)v9=vOykC3^>+&wU#>Hx>zZ6K#z%k;Kt>>ku-Z#1W!
z3QMrOFY%a-Q<aqld{s8Qq=_u$<X1Blh-l7smJX2Jy+;!F>immzy_4DVRCtPgj!#FU
z6LY2?s$iuC?<=(<H?`YmZl79bhxAI$+lE`oO|S#PGQP)F+u!m8kSJ)#^AojLW0;h`
z|A^IZCJKHtaZKv)bxNt^Nm^)7*CqI{e?!fxr`)3v5Lw&FKrb|2i@@ikbvY)&7C^#J
z3yCmU?)Ux?CbzoF_D7~bYO2c9&+@=C%gOiO@|gliP`auXRp@(N@l_*D#p_%*E~yaQ
zH>)S`uWcPH5WhB9)ibc14v)eOj?t@&{!K+AMpuhaX}bX<R5P{H2wZ+eSJyo(7FQsz
zAR+i5#*@lqLIbkHm*fT@<u4vjM3iW#Vq9JH6<G)~AndU_tysd*Wx~?MDoUfXK%V-V
zzoEf(FTfoXDid1Q!!8YM7KuaMzaU#I?seri5!dzO1qP{vHP_}&P>|K;xgP$smkk#R
z7of_R$z0n4sF|A=az?q%H?{`#)Zsrs=!Y}Hnt%C|qRg!v3bN*Pl$5|7S2FW?wUY_}
z*+LbIxN;v0ASo^nBx8n@Q6E(%`rfTU)%~_BC&R<>SrwUS%w~BM%4*mZ?bI5=VE8K4
zQTBXm;wL^hdUz>9T*uhRqDEoSrhKd`(Kvd|xtRYt)P)g*_x`8_dCl?Xa-7KS@0Pbw
zym-d=86r4ljrOxNOU$>2y<p)NiV-83d(+BJPrYKlgC{xn5b{?0yr%agkH*rD@})@^
z+=~aS`9Emi*2PtQvXbix;9UL`9nvUNc~LcDioS8nuf`x@Kf=J-?625qvo2Ci&K;Yg
zcqc@P^-Wi^2-{h*99><m=*`Q@c{Cx{dR1TwazP3gN0ZQyT|K?jt`SXpSpE{bVvy0t
z)0K}H1y$!*+SJB7Nh#+V*nLc+`drrZ3E(zE#(Pmf`Kl=-Jos(hZA75MeVtdbwP}(!
z<wXK49puiptisMdLB@P^Fl&b4+;wphj?Pz+5XUj#MI<%_E29_{S?`WO1AT$b4oxV>
z6cG!aA(K;gdloQK#355lhx{G7&eL?8i{lQ#ri~GLT7+6$Y-RR`UuF8eLJ@xuPIIt9
zal&-MErV$^w03%Jz``A_^XqbXD&RPkD+6B~XE9C;u^t@Bo=Av^b;Of9zt_j}XGq?J
z4MP2kLQ?;g5Y1tz4CZbaPj<C(e1=-%_yXpcG9wK(p<nSBnUC_Pbd0$LJA@kF;FPx!
zoRtJBtznGp-1vhH&RxnSj%fC5J2Bi^#7l?VIMUvkKe{4$Qunj<z(96=tar5Y(-y31
z%S%Y8azT1vCT}xn>HbvBug{kL$_NRnC-F4I?$cP6jtJGfw%~2pi{8K{nUex1LK#6H
zKV!OQgpL(tC2kJCH+-{2r;XQG{ZRay(~t65U3<KImGx!n{d0tInvY@Eyf^FfSU)*4
zLTKOUpku!;;sDm%SUyMTUeEe@SSi;2<_8QXr&2kx&fL-$o6VK|G`KXA)gdC&@^M|}
z=KbtqKS$5QfSm+2s?TQN0XS0I_<K{!F8M}LW@>ED?w@x$4OM8xgD1AmsDC(QpB%kn
zD7C#4DNa4Qo%ub<UFsRwl*BU}%K071@!#_3Nbqs@3S?QW9($XJ@KVa3v{`I+`hR3Q
o(P{T@1omKYT$1EM3ms=8wL4vb|0C5JLw?*y@;E#H2+6_mPoWc=QUCw|

literal 0
HcmV?d00001

diff --git a/tests/r_64to32bit_meta/name b/tests/r_64to32bit_meta/name
new file mode 100644
index 0000000..e99ed8b
--- /dev/null
+++ b/tests/r_64to32bit_meta/name
@@ -0,0 +1 @@
+convert meta_bg 64bit fs to 32bit fs
diff --git a/tests/r_64to32bit_meta/script b/tests/r_64to32bit_meta/script
new file mode 100644
index 0000000..e2190ce
--- /dev/null
+++ b/tests/r_64to32bit_meta/script
@@ -0,0 +1,75 @@
+if test -x $RESIZE2FS_EXE -a -x $DEBUGFS_EXE; then
+
+FSCK_OPT=-fn
+OUT=$test_name.log
+EXP=$test_dir/expect
+CONF=$TMPFILE.conf
+
+gzip -d < $EXP.gz > $EXP
+
+cat > $CONF << ENDL
+[fs_types]
+	ext4h = {
+		features = has_journal,extent,huge_file,flex_bg,uninit_bg,dir_nlink,extra_isize,sparse_super,filetype,dir_index,ext_attr,meta_bg,^resize_inode,64bit
+		blocksize = 1024
+		inode_size = 256
+		make_hugefiles = true
+		hugefiles_dir = /
+		hugefiles_slack = 0
+		hugefiles_name = aaaaa
+		hugefiles_digits = 4
+		hugefiles_size = 1M
+		zero_hugefiles = false
+	}
+ENDL
+
+echo "resize2fs test" > $OUT
+
+MKE2FS_CONFIG=$CONF $MKE2FS -F -T ext4h $TMPFILE 524288 >> $OUT 2>&1
+rm -rf $CONF
+
+# dump and check
+$DUMPE2FS $TMPFILE 2>&1 | egrep "(GDT|bitmap|descriptor|^Group|Inode table|features)" >> $OUT
+$FSCK $FSCK_OPT -N test_filesys $TMPFILE >> $OUT 2>&1
+status=$?
+echo Exit status is $status >> $OUT
+
+# resize it
+echo "resize2fs test.img -s" >> $OUT
+$RESIZE2FS $TMPFILE -s -f >> $OUT 2>&1
+status=$?
+echo Exit status is $status >> $OUT
+
+# dump and check
+$DUMPE2FS $TMPFILE 2>&1 | egrep "(GDT|bitmap|descriptor|^Group|Inode table|features)" >> $OUT
+$FSCK $FSCK_OPT -N test_filesys $TMPFILE >> $OUT 2>&1
+status=$?
+echo Exit status is $status >> $OUT
+
+rm $TMPFILE
+
+#
+# Do the verification
+#
+
+sed -f $cmd_dir/filter.sed -e "s;$TMPFILE;test.img;" -e 's/test_filesys:.*//g' -i $OUT
+
+cmp -s $OUT $EXP
+status=$?
+
+if [ "$status" = 0 ] ; then
+	echo "$test_name: $test_description: ok"
+	touch $test_name.ok
+else
+	echo "$test_name: $test_description: failed"
+	diff $DIFF_OPTS $EXP $OUT > $test_name.failed
+fi
+
+rm $EXP
+
+unset IMAGE FSCK_OPT OUT EXP CONF
+
+else #if test -x $RESIZE2FS_EXE -a -x $DEBUGFS_EXE; then
+	echo "$test_name: $test_description: skipped"
+fi
+



* [PATCH 22/34] libext2fs: find inode goal when allocating blocks
  2014-09-13 22:11 [PATCH 00/34] e2fsprogs Summer 2014 patchbomb, part 6 Darrick J. Wong
                   ` (20 preceding siblings ...)
  2014-09-13 22:13 ` [PATCH 21/34] tests: test resize2fs 32->64 and 64->32bit conversion code Darrick J. Wong
@ 2014-09-13 22:13 ` Darrick J. Wong
  2014-09-13 22:13 ` [PATCH 23/34] libext2fs: find/alloc a range of empty blocks Darrick J. Wong
                   ` (11 subsequent siblings)
  33 siblings, 0 replies; 67+ messages in thread
From: Darrick J. Wong @ 2014-09-13 22:13 UTC (permalink / raw)
  To: tytso, darrick.wong; +Cc: linux-ext4

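Try to be a little smarter about where we go to allocate blocks for an
inode.  Add ext2fs_find_inode_goal(), which returns the first block of
the inode's block group (rounded down to the start of its flex group
when s_log_groups_per_flex is set), and use it as the goal at the block
allocation call sites.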

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
 e2fsck/pass2.c         |    3 ++-
 lib/ext2fs/alloc.c     |   10 ++++++++++
 lib/ext2fs/bmap.c      |    5 +++--
 lib/ext2fs/expanddir.c |    2 +-
 lib/ext2fs/ext2fs.h    |    1 +
 lib/ext2fs/ext_attr.c  |    4 +---
 lib/ext2fs/extent.c    |   10 ++--------
 lib/ext2fs/mkdir.c     |    3 ++-
 lib/ext2fs/symlink.c   |    3 ++-
 9 files changed, 24 insertions(+), 17 deletions(-)
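
Reader's note (not part of the patch): the goal calculation can be tried
out without libext2fs.  The sketch below is a simplified model, not the
library code.  struct sb_model and model_inode_goal() are hypothetical
names, and the arithmetic standing in for ext2fs_group_of_ino() and
ext2fs_group_first_block2() is an assumption about what those helpers
compute.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the superblock fields the helper reads. */
struct sb_model {
	uint32_t first_data_block;    /* models s_first_data_block */
	uint32_t blocks_per_group;    /* models s_blocks_per_group */
	uint32_t inodes_per_group;    /* models s_inodes_per_group */
	uint8_t  log_groups_per_flex; /* models s_log_groups_per_flex */
};

/*
 * Model of the goal calculation: find the inode's block group, round it
 * down to the first group of its flex group (if any), and return the
 * first block of that group.
 */
static uint64_t model_inode_goal(const struct sb_model *sb, uint32_t ino)
{
	uint32_t group;

	assert(ino > 0 && sb->inodes_per_group > 0);

	/* Assumed equivalent of ext2fs_group_of_ino(): inodes count from 1. */
	group = (ino - 1) / sb->inodes_per_group;

	/* Same masking as the patch: clear the low log_flex bits. */
	if (sb->log_groups_per_flex)
		group &= ~((1U << sb->log_groups_per_flex) - 1);

	/* Assumed equivalent of ext2fs_group_first_block2(). */
	return sb->first_data_block + (uint64_t)group * sb->blocks_per_group;
}
```

With 8192 blocks and 2048 inodes per group, a first data block of 1 and
16 groups per flex group, inode 40961 lives in group 20, which rounds
down to group 16, so the goal is block 1 + 16 * 8192 = 131073.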


diff --git a/e2fsck/pass2.c b/e2fsck/pass2.c
index 2060ed2..fa17f20 100644
--- a/e2fsck/pass2.c
+++ b/e2fsck/pass2.c
@@ -1807,7 +1807,8 @@ static int allocate_dir_block(e2fsck_t ctx,
 	pctx->errcode = ext2fs_map_cluster_block(fs, db->ino, &inode,
 						 db->blockcnt, &blk);
 	if (pctx->errcode || blk == 0) {
-		pctx->errcode = ext2fs_new_block2(fs, 0,
+		blk = ext2fs_find_inode_goal(fs, db->ino);
+		pctx->errcode = ext2fs_new_block2(fs, blk,
 						  ctx->block_found_map, &blk);
 		if (pctx->errcode) {
 			pctx->str = "ext2fs_new_block";
diff --git a/lib/ext2fs/alloc.c b/lib/ext2fs/alloc.c
index 4e3bfdb..e58c01b 100644
--- a/lib/ext2fs/alloc.c
+++ b/lib/ext2fs/alloc.c
@@ -303,3 +303,13 @@ void ext2fs_set_alloc_block_callback(ext2_filsys fs,
 
 	fs->get_alloc_block = func;
 }
+
+blk64_t ext2fs_find_inode_goal(ext2_filsys fs, ext2_ino_t ino)
+{
+	dgrp_t	group = ext2fs_group_of_ino(fs, ino);
+	__u8	log_flex = fs->super->s_log_groups_per_flex;
+
+	if (log_flex)
+		group = group & ~((1 << (log_flex)) - 1);
+	return ext2fs_group_first_block2(fs, group);
+}
diff --git a/lib/ext2fs/bmap.c b/lib/ext2fs/bmap.c
index a4dc8ef..7623052 100644
--- a/lib/ext2fs/bmap.c
+++ b/lib/ext2fs/bmap.c
@@ -252,7 +252,7 @@ got_block:
 		retval = extent_bmap(fs, ino, inode, handle, block_buf,
 				     0, block-1, 0, blocks_alloc, &blk64);
 		if (retval)
-			blk64 = 0;
+			blk64 = ext2fs_find_inode_goal(fs, ino);
 		retval = ext2fs_alloc_block2(fs, blk64, block_buf,
 					     &blk64);
 		if (retval)
@@ -368,7 +368,8 @@ errcode_t ext2fs_bmap2(ext2_filsys fs, ext2_ino_t ino, struct ext2_inode *inode,
 		}
 
 		*phys_blk = inode_bmap(inode, block);
-		b = block ? inode_bmap(inode, block-1) : 0;
+		b = block ? inode_bmap(inode, block-1) :
+			    ext2fs_find_inode_goal(fs, ino);
 
 		if ((*phys_blk == 0) && (bmap_flags & BMAP_ALLOC)) {
 			retval = ext2fs_alloc_block(fs, b, block_buf, &b);
diff --git a/lib/ext2fs/expanddir.c b/lib/ext2fs/expanddir.c
index ecc13ae..e8dff30 100644
--- a/lib/ext2fs/expanddir.c
+++ b/lib/ext2fs/expanddir.c
@@ -104,7 +104,7 @@ errcode_t ext2fs_expand_dir(ext2_filsys fs, ext2_ino_t dir)
 
 	es.done = 0;
 	es.err = 0;
-	es.goal = 0;
+	es.goal = ext2fs_find_inode_goal(fs, dir);
 	es.newblocks = 0;
 	es.dir = dir;
 
diff --git a/lib/ext2fs/ext2fs.h b/lib/ext2fs/ext2fs.h
index 3419185..dcc3ec4 100644
--- a/lib/ext2fs/ext2fs.h
+++ b/lib/ext2fs/ext2fs.h
@@ -696,6 +696,7 @@ extern void ext2fs_set_alloc_block_callback(ext2_filsys fs,
 					    errcode_t (**old)(ext2_filsys fs,
 							      blk64_t goal,
 							      blk64_t *ret));
+blk64_t ext2fs_find_inode_goal(ext2_filsys fs, ext2_ino_t ino);
 
 /* alloc_sb.c */
 extern int ext2fs_reserve_super_and_bgd(ext2_filsys fs,
diff --git a/lib/ext2fs/ext_attr.c b/lib/ext2fs/ext_attr.c
index b52abb5..633835b 100644
--- a/lib/ext2fs/ext_attr.c
+++ b/lib/ext2fs/ext_attr.c
@@ -374,7 +374,6 @@ static errcode_t prep_ea_block_for_write(ext2_filsys fs, ext2_ino_t ino,
 {
 	struct ext2_ext_attr_header *header;
 	void *block_buf = NULL;
-	dgrp_t grp;
 	blk64_t blk, goal;
 	errcode_t err;
 
@@ -420,8 +419,7 @@ static errcode_t prep_ea_block_for_write(ext2_filsys fs, ext2_ino_t ino,
 	}
 
 	/* Allocate a block */
-	grp = ext2fs_group_of_ino(fs, ino);
-	goal = ext2fs_inode_table_loc(fs, grp);
+	goal = ext2fs_find_inode_goal(fs, ino);
 	err = ext2fs_alloc_block2(fs, goal, NULL, &blk);
 	if (err)
 		goto out2;
diff --git a/lib/ext2fs/extent.c b/lib/ext2fs/extent.c
index c9ef701..4c6fbbf 100644
--- a/lib/ext2fs/extent.c
+++ b/lib/ext2fs/extent.c
@@ -1012,14 +1012,8 @@ static errcode_t extent_node_split(ext2_extent_handle_t handle,
 		goto done;
 	}
 
-	if (!goal_blk) {
-		dgrp_t	group = ext2fs_group_of_ino(handle->fs, handle->ino);
-		__u8	log_flex = handle->fs->super->s_log_groups_per_flex;
-
-		if (log_flex)
-			group = group & ~((1 << (log_flex)) - 1);
-		goal_blk = ext2fs_group_first_block2(handle->fs, group);
-	}
+	if (!goal_blk)
+		goal_blk = ext2fs_find_inode_goal(handle->fs, handle->ino);
 	retval = ext2fs_alloc_block2(handle->fs, goal_blk, block_buf,
 				    &new_node_pblk);
 	if (retval)
diff --git a/lib/ext2fs/mkdir.c b/lib/ext2fs/mkdir.c
index c4c7967..c88ff9e 100644
--- a/lib/ext2fs/mkdir.c
+++ b/lib/ext2fs/mkdir.c
@@ -69,7 +69,8 @@ errcode_t ext2fs_mkdir(ext2_filsys fs, ext2_ino_t parent, ext2_ino_t inum,
 	 * Allocate a data block for the directory
 	 */
 	if (!inline_data) {
-		retval = ext2fs_new_block2(fs, 0, 0, &blk);
+		retval = ext2fs_new_block2(fs, ext2fs_find_inode_goal(fs, ino),
+					   NULL, &blk);
 		if (retval)
 			goto cleanup;
 	}
diff --git a/lib/ext2fs/symlink.c b/lib/ext2fs/symlink.c
index f6eb6b6..e268ed4 100644
--- a/lib/ext2fs/symlink.c
+++ b/lib/ext2fs/symlink.c
@@ -53,7 +53,8 @@ errcode_t ext2fs_symlink(ext2_filsys fs, ext2_ino_t parent, ext2_ino_t ino,
 	 */
 	fastlink = (target_len < sizeof(inode.i_block));
 	if (!fastlink) {
-		retval = ext2fs_new_block2(fs, 0, 0, &blk);
+		retval = ext2fs_new_block2(fs, ext2fs_find_inode_goal(fs, ino),
+					   NULL, &blk);
 		if (retval)
 			goto cleanup;
 		retval = ext2fs_get_mem(fs->blocksize, &block_buf);



* [PATCH 23/34] libext2fs: find/alloc a range of empty blocks
  2014-09-13 22:11 [PATCH 00/34] e2fsprogs Summer 2014 patchbomb, part 6 Darrick J. Wong
                   ` (21 preceding siblings ...)
  2014-09-13 22:13 ` [PATCH 22/34] libext2fs: find inode goal when allocating blocks Darrick J. Wong
@ 2014-09-13 22:13 ` Darrick J. Wong
  2014-09-13 22:13 ` [PATCH 24/34] libext2fs: add new hooks to support large allocations Darrick J. Wong
                   ` (10 subsequent siblings)
  33 siblings, 0 replies; 67+ messages in thread
From: Darrick J. Wong @ 2014-09-13 22:13 UTC (permalink / raw)
  To: tytso, darrick.wong; +Cc: linux-ext4

Provide a function that, given a goal pblk and a range, will try to
find a run of free blocks to satisfy the allocation.  By default the
function will look anywhere in the filesystem for the run, though this
can be constrained with optional flags.  One flag indicates that the
range must start at the goal block; the other flag indicates that we
should not return a range shorter than len.

v2: Add a second function to allocate a range of blocks.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
 lib/ext2fs/alloc.c  |  141 +++++++++++++++++++++++++++++++++++++++++++++++++++
 lib/ext2fs/ext2fs.h |   11 ++++
 2 files changed, 152 insertions(+)
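
Reader's note (not part of the patch): the search described above can be
sketched against a plain byte array instead of an ext2fs_block_bitmap.
This is a simplified model under stated assumptions, not the library
code: find_free_run(), scan() and the RANGE_* flags are hypothetical
stand-ins, block numbering starts at 0 rather than s_first_data_block,
and the wrap-around bookkeeping is reduced to a single flag.

```c
#include <assert.h>
#include <stddef.h>

#define RANGE_FIXED_GOAL 0x1 /* hypothetical: run must start at goal */
#define RANGE_MIN_LENGTH 0x2 /* hypothetical: run must be >= len blocks */

/* First index in [from, to) whose in-use state equals want, else to. */
static size_t scan(const unsigned char *used, size_t from, size_t to, int want)
{
	while (from < to && (used[from] != 0) != want)
		from++;
	return from;
}

/*
 * used[i] != 0 means block i is allocated.  Search forward from goal,
 * wrapping to block 0 at most once.  Returns 0 and fills *start_out and
 * *run_out on success, -1 if no acceptable run exists.
 */
static int find_free_run(const unsigned char *used, size_t nblocks,
			 size_t goal, size_t len, int flags,
			 size_t *start_out, size_t *run_out)
{
	size_t start, end, limit;
	int wrapped = 0;

	assert(len > 0 && nblocks > 0);
	if (goal >= nblocks)
		goal = 0;

	start = goal;
	while (!wrapped || start <= goal) {
		/* Find the next free block at or after start. */
		start = scan(used, start, nblocks, 0);
		if (start == nblocks) {
			/* Nothing free up to the end: wrap once, or give up. */
			if ((flags & RANGE_FIXED_GOAL) || wrapped)
				return -1;
			wrapped = 1;
			start = 0;
			continue;
		}
		if ((flags & RANGE_FIXED_GOAL) && start != goal)
			return -1;

		/* Measure the free run, looking at most len blocks ahead. */
		limit = (nblocks - start < len) ? nblocks : start + len;
		end = scan(used, start, limit, 1);

		if (!(flags & RANGE_MIN_LENGTH) || end - start >= len) {
			*start_out = start;
			*run_out = end - start;
			return 0;
		}
		if (flags & RANGE_FIXED_GOAL)
			return -1;
		start = end; /* run too short: resume after it */
	}
	return -1;
}
```

For the map {1,0,0,1,0,0,0,1}, asking for 3 blocks from goal 0 with
RANGE_MIN_LENGTH skips the two-block run at block 1 and returns the run
starting at block 4.  Without the flag, the shorter run at block 1 is
returned and the caller has to check the returned length.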


diff --git a/lib/ext2fs/alloc.c b/lib/ext2fs/alloc.c
index e58c01b..7a5245e 100644
--- a/lib/ext2fs/alloc.c
+++ b/lib/ext2fs/alloc.c
@@ -26,6 +26,16 @@
 #include "ext2_fs.h"
 #include "ext2fs.h"
 
+#define min(a, b) ((a) < (b) ? (a) : (b))
+
+#undef DEBUG
+
+#ifdef DEBUG
+# define dbg_printf(f, a...)  do {printf(f, ## a); fflush(stdout); } while (0)
+#else
+# define dbg_printf(f, a...)
+#endif
+
 /*
  * Clear the uninit block bitmap flag if necessary
  */
@@ -313,3 +323,134 @@ blk64_t ext2fs_find_inode_goal(ext2_filsys fs, ext2_ino_t ino)
 		group = group & ~((1 << (log_flex)) - 1);
 	return ext2fs_group_first_block2(fs, group);
 }
+
+/*
+ * Starting at _goal_, scan around the filesystem to find a run of free blocks
+ * that's at least _len_ blocks long.  Possible flags:
+ * - EXT2_NEWRANGE_FIXED_GOAL: The range of blocks must start at _goal_.
+ * - EXT2_NEWRANGE_MIN_LENGTH: do not return an allocation shorter than _len_.
+ * - EXT2_NEWRANGE_ZERO_BLOCKS: Zero blocks pblk to pblk+plen before returning.
+ *
+ * The starting block is returned in _pblk_ and the length is returned via
+ * _plen_.  The blocks are not marked in the bitmap; the caller must mark
+ * however much of the returned run they actually use, hopefully via
+ * ext2fs_block_alloc_stats_range().
+ *
+ * This function can return a range that is longer than what was requested.
+ */
+errcode_t ext2fs_new_range(ext2_filsys fs, int flags, blk64_t goal,
+			   blk64_t len, ext2fs_block_bitmap map, blk64_t *pblk,
+			   blk64_t *plen)
+{
+	errcode_t retval;
+	blk64_t start, end, b;
+	int looped = 0;
+	blk64_t max_blocks = ext2fs_blocks_count(fs->super);
+
+	dbg_printf("%s: flags=0x%x goal=%llu len=%llu\n", __func__, flags,
+		   goal, len);
+	EXT2_CHECK_MAGIC(fs, EXT2_ET_MAGIC_EXT2FS_FILSYS);
+	if (len == 0 || (flags & ~EXT2_NEWRANGE_ALL_FLAGS))
+		return EXT2_ET_INVALID_ARGUMENT;
+	if (!map)
+		map = fs->block_map;
+	if (!map)
+		return EXT2_ET_NO_BLOCK_BITMAP;
+	if (!goal || goal >= ext2fs_blocks_count(fs->super))
+		goal = fs->super->s_first_data_block;
+
+	start = goal;
+	while (!looped || start <= goal) {
+		retval = ext2fs_find_first_zero_block_bitmap2(map, start,
+							      max_blocks - 1,
+							      &start);
+		if (retval == ENOENT) {
+			/*
+			 * If there are no free blocks beyond the starting
+			 * point, try scanning the whole filesystem, unless the
+			 * user told us only to allocate from _goal_, or if
+			 * we're already scanning the whole filesystem.
+			 */
+			if (flags & EXT2_NEWRANGE_FIXED_GOAL ||
+			    start == fs->super->s_first_data_block)
+				goto fail;
+			start = fs->super->s_first_data_block;
+			continue;
+		} else if (retval)
+			goto errout;
+
+		if (flags & EXT2_NEWRANGE_FIXED_GOAL && start != goal)
+			goto fail;
+
+		b = min(start + len - 1, max_blocks - 1);
+		retval =  ext2fs_find_first_set_block_bitmap2(map, start, b,
+							      &end);
+		if (retval == ENOENT)
+			end = b + 1;
+		else if (retval)
+			goto errout;
+
+		if (!(flags & EXT2_NEWRANGE_MIN_LENGTH) ||
+		    (end - start) >= len) {
+			/* Success! */
+			*pblk = start;
+			*plen = end - start;
+			dbg_printf("%s: new_range goal=%llu--%llu "
+				   "blk=%llu--%llu %llu\n",
+				   __func__, goal, goal + len - 1,
+				   *pblk, *pblk + *plen - 1, *plen);
+
+			for (b = start; b < end;
+			     b += fs->super->s_blocks_per_group)
+				clear_block_uninit(fs,
+						ext2fs_group_of_blk2(fs, b));
+			return 0;
+		}
+
+		if (flags & EXT2_NEWRANGE_FIXED_GOAL)
+			goto fail;
+		start = end;
+		if (start >= max_blocks) {
+			if (looped)
+				goto fail;
+			looped = 1;
+			start = fs->super->s_first_data_block;
+		}
+	}
+
+fail:
+	retval = EXT2_ET_BLOCK_ALLOC_FAIL;
+errout:
+	return retval;
+}
+
+errcode_t ext2fs_alloc_range(ext2_filsys fs, int flags, blk64_t goal,
+			     blk_t len, blk64_t *ret)
+{
+	int newr_flags = EXT2_NEWRANGE_MIN_LENGTH;
+	errcode_t retval;
+	blk64_t plen;
+
+	EXT2_CHECK_MAGIC(fs, EXT2_ET_MAGIC_EXT2FS_FILSYS);
+	if (len == 0 || (flags & ~EXT2_ALLOCRANGE_ALL_FLAGS))
+		return EXT2_ET_INVALID_ARGUMENT;
+
+	if (flags & EXT2_ALLOCRANGE_FIXED_GOAL)
+		newr_flags |= EXT2_NEWRANGE_FIXED_GOAL;
+
+	retval = ext2fs_new_range(fs, newr_flags, goal, len, NULL, ret, &plen);
+	if (retval)
+		return retval;
+
+	if (plen < len)
+		return EXT2_ET_BLOCK_ALLOC_FAIL;
+
+	if (flags & EXT2_ALLOCRANGE_ZERO_BLOCKS) {
+		retval = ext2fs_zero_blocks2(fs, *ret, len, NULL, NULL);
+		if (retval)
+			return retval;
+	}
+
+	ext2fs_block_alloc_stats_range(fs, *ret, len, +1);
+	return retval;
+}
diff --git a/lib/ext2fs/ext2fs.h b/lib/ext2fs/ext2fs.h
index dcc3ec4..8d3ade8 100644
--- a/lib/ext2fs/ext2fs.h
+++ b/lib/ext2fs/ext2fs.h
@@ -697,6 +697,17 @@ extern void ext2fs_set_alloc_block_callback(ext2_filsys fs,
 							      blk64_t goal,
 							      blk64_t *ret));
 blk64_t ext2fs_find_inode_goal(ext2_filsys fs, ext2_ino_t ino);
+#define EXT2_NEWRANGE_FIXED_GOAL	(0x1)
+#define EXT2_NEWRANGE_MIN_LENGTH	(0x2)
+#define EXT2_NEWRANGE_ALL_FLAGS		(0x3)
+errcode_t ext2fs_new_range(ext2_filsys fs, int flags, blk64_t goal,
+			   blk64_t len, ext2fs_block_bitmap map, blk64_t *pblk,
+			   blk64_t *plen);
+#define EXT2_ALLOCRANGE_FIXED_GOAL	(0x1)
+#define EXT2_ALLOCRANGE_ZERO_BLOCKS	(0x2)
+#define EXT2_ALLOCRANGE_ALL_FLAGS	(0x3)
+errcode_t ext2fs_alloc_range(ext2_filsys fs, int flags, blk64_t goal,
+			     blk_t len, blk64_t *ret);
 
 /* alloc_sb.c */
 extern int ext2fs_reserve_super_and_bgd(ext2_filsys fs,
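
The scan that ext2fs_new_range() performs can be hard to follow inside the
diff, so here is a minimal standalone sketch of the same idea on a toy byte
array.  Everything named toy_* or TOY_* is hypothetical and exists only for
this illustration; it is not part of libext2fs, and it simplifies the real
function (no group descriptors, no hooks, block 0 is the first data block).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flags for this sketch; they only mirror EXT2_NEWRANGE_*. */
#define TOY_FIXED_GOAL	0x1	/* the run must start exactly at goal */
#define TOY_MIN_LENGTH	0x2	/* the run must be at least len blocks */

/*
 * Look for a free run whose first block lies in [from, stop).
 * used[i] != 0 means block i is in use.  Runs are capped at len blocks.
 * Returns 0 and fills *pblk / *plen on success, -1 otherwise.
 */
static int toy_scan(const unsigned char *used, uint64_t nblocks,
		    uint64_t from, uint64_t stop, int flags, uint64_t goal,
		    uint64_t len, uint64_t *pblk, uint64_t *plen)
{
	uint64_t start = from, end;

	while (start < stop) {
		/* Skip ahead to the next free block. */
		while (start < stop && used[start])
			start++;
		if (start == stop)
			break;
		if ((flags & TOY_FIXED_GOAL) && start != goal)
			return -1;

		/* Measure the free run, but no further than len blocks. */
		end = start;
		while (end < nblocks && end - start < len && !used[end])
			end++;

		if (!(flags & TOY_MIN_LENGTH) || end - start >= len) {
			*pblk = start;
			*plen = end - start;
			return 0;
		}
		if (flags & TOY_FIXED_GOAL)
			return -1;
		start = end;
	}
	return -1;
}

/* Scan from goal to the end, then wrap around once to cover [0, goal). */
static int toy_new_range(const unsigned char *used, uint64_t nblocks,
			 int flags, uint64_t goal, uint64_t len,
			 uint64_t *pblk, uint64_t *plen)
{
	if (len == 0 || nblocks == 0)
		return -1;
	if (goal >= nblocks)
		goal = 0;
	if (toy_scan(used, nblocks, goal, nblocks, flags, goal, len,
		     pblk, plen) == 0)
		return 0;
	if (flags & TOY_FIXED_GOAL)
		return -1;
	return toy_scan(used, nblocks, 0, goal, flags, goal, len, pblk, plen);
}
```

As in the patch, the sketch does not mark anything in the bitmap; a caller
would still have to record whatever part of the returned run it uses.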



* [PATCH 24/34] libext2fs: add new hooks to support large allocations
  2014-09-13 22:11 [PATCH 00/34] e2fsprogs Summer 2014 patchbomb, part 6 Darrick J. Wong
                   ` (22 preceding siblings ...)
  2014-09-13 22:13 ` [PATCH 23/34] libext2fs: find/alloc a range of empty blocks Darrick J. Wong
@ 2014-09-13 22:13 ` Darrick J. Wong
  2014-09-13 22:14 ` [PATCH 25/34] libext2fs: implement fallocate Darrick J. Wong
                   ` (9 subsequent siblings)
  33 siblings, 0 replies; 67+ messages in thread
From: Darrick J. Wong @ 2014-09-13 22:13 UTC (permalink / raw)
  To: tytso, darrick.wong; +Cc: linux-ext4

Add a new_range hook and a block_alloc_stats_range hook so that e2fsck
can route allocation requests spanning more than one block through its
block_found_map.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
 e2fsck/pass1.c           |   45 +++++++++++++++++++++++++++++++++++++++++++++
 lib/ext2fs/alloc.c       |   37 ++++++++++++++++++++++++++++++++++++-
 lib/ext2fs/alloc_stats.c |   16 ++++++++++++++++
 lib/ext2fs/ext2fs.h      |   16 ++++++++++++++++
 4 files changed, 113 insertions(+), 1 deletion(-)


diff --git a/e2fsck/pass1.c b/e2fsck/pass1.c
index a963849..2d59f63 100644
--- a/e2fsck/pass1.c
+++ b/e2fsck/pass1.c
@@ -3802,6 +3802,26 @@ static errcode_t e2fsck_get_alloc_block(ext2_filsys fs, blk64_t goal,
 	return (0);
 }
 
+static errcode_t e2fsck_new_range(ext2_filsys fs, int flags, blk64_t goal,
+				  blk64_t len, blk64_t *pblk, blk64_t *plen)
+{
+	e2fsck_t ctx = (e2fsck_t) fs->priv_data;
+	errcode_t	retval;
+
+	if (ctx->block_found_map)
+		return ext2fs_new_range(fs, flags, goal, len,
+					ctx->block_found_map, pblk, plen);
+
+	if (!fs->block_map) {
+		retval = ext2fs_read_block_bitmap(fs);
+		if (retval)
+			return retval;
+	}
+
+	return ext2fs_new_range(fs, flags, goal, len, fs->block_map,
+				pblk, plen);
+}
+
 static void e2fsck_block_alloc_stats(ext2_filsys fs, blk64_t blk, int inuse)
 {
 	e2fsck_t ctx = (e2fsck_t) fs->priv_data;
@@ -3821,6 +3841,28 @@ static void e2fsck_block_alloc_stats(ext2_filsys fs, blk64_t blk, int inuse)
 	}
 }
 
+static void e2fsck_block_alloc_stats_range(ext2_filsys fs, blk64_t blk,
+					   blk_t num, int inuse)
+{
+	e2fsck_t ctx = (e2fsck_t) fs->priv_data;
+
+	/* Never free a critical metadata block */
+	if (ctx->block_found_map &&
+	    ctx->block_metadata_map &&
+	    inuse < 0 &&
+	    ext2fs_test_block_bitmap_range2(ctx->block_metadata_map, blk, num))
+		return;
+
+	if (ctx->block_found_map) {
+		if (inuse > 0)
+			ext2fs_mark_block_bitmap_range2(ctx->block_found_map,
+							blk, num);
+		else
+			ext2fs_unmark_block_bitmap_range2(ctx->block_found_map,
+							blk, num);
+	}
+}
+
 void e2fsck_use_inode_shortcuts(e2fsck_t ctx, int use_shortcuts)
 {
 	ext2_filsys fs = ctx->fs;
@@ -3844,4 +3886,7 @@ void e2fsck_intercept_block_allocations(e2fsck_t ctx)
 	ext2fs_set_alloc_block_callback(ctx->fs, e2fsck_get_alloc_block, 0);
 	ext2fs_set_block_alloc_stats_callback(ctx->fs,
 						e2fsck_block_alloc_stats, 0);
+	ext2fs_set_new_range_callback(ctx->fs, e2fsck_new_range, NULL);
+	ext2fs_set_block_alloc_stats_range_callback(ctx->fs,
+					e2fsck_block_alloc_stats_range, NULL);
 }
diff --git a/lib/ext2fs/alloc.c b/lib/ext2fs/alloc.c
index 7a5245e..3723b78 100644
--- a/lib/ext2fs/alloc.c
+++ b/lib/ext2fs/alloc.c
@@ -346,12 +346,32 @@ errcode_t ext2fs_new_range(ext2_filsys fs, int flags, blk64_t goal,
 	blk64_t start, end, b;
 	int looped = 0;
 	blk64_t max_blocks = ext2fs_blocks_count(fs->super);
+	errcode_t (*nrf)(ext2_filsys fs, int flags, blk64_t goal,
+			 blk64_t len, blk64_t *pblk, blk64_t *plen);
 
 	dbg_printf("%s: flags=0x%x goal=%llu len=%llu\n", __func__, flags,
 		   goal, len);
 	EXT2_CHECK_MAGIC(fs, EXT2_ET_MAGIC_EXT2FS_FILSYS);
 	if (len == 0 || (flags & ~EXT2_NEWRANGE_ALL_FLAGS))
 		return EXT2_ET_INVALID_ARGUMENT;
+
+	if (!map && fs->new_range) {
+		/*
+		 * In case there are clients out there whose new_range
+		 * handlers call ext2fs_new_range with a NULL block map,
+		 * temporarily swap out the function pointer so that we don't
+		 * end up in an infinite loop.
+		 */
+		nrf = fs->new_range;
+		fs->new_range = NULL;
+		retval = nrf(fs, flags, goal, len, pblk, plen);
+		fs->new_range = nrf;
+		if (retval)
+			return retval;
+		start = *pblk;
+		end = *pblk + *plen;
+		goto allocated;
+	}
 	if (!map)
 		map = fs->block_map;
 	if (!map)
@@ -399,7 +419,7 @@ errcode_t ext2fs_new_range(ext2_filsys fs, int flags, blk64_t goal,
 				   "blk=%llu--%llu %llu\n",
 				   __func__, goal, goal + len - 1,
 				   *pblk, *pblk + *plen - 1, *plen);
-
+allocated:
 			for (b = start; b < end;
 			     b += fs->super->s_blocks_per_group)
 				clear_block_uninit(fs,
@@ -424,6 +444,21 @@ errout:
 	return retval;
 }
 
+void ext2fs_set_new_range_callback(ext2_filsys fs,
+	errcode_t (*func)(ext2_filsys fs, int flags, blk64_t goal,
+			       blk64_t len, blk64_t *pblk, blk64_t *plen),
+	errcode_t (**old)(ext2_filsys fs, int flags, blk64_t goal,
+			       blk64_t len, blk64_t *pblk, blk64_t *plen))
+{
+	if (!fs || fs->magic != EXT2_ET_MAGIC_EXT2FS_FILSYS)
+		return;
+
+	if (old)
+		*old = fs->new_range;
+
+	fs->new_range = func;
+}
+
 errcode_t ext2fs_alloc_range(ext2_filsys fs, int flags, blk64_t goal,
 			     blk_t len, blk64_t *ret)
 {
diff --git a/lib/ext2fs/alloc_stats.c b/lib/ext2fs/alloc_stats.c
index 3d3697c..5999082 100644
--- a/lib/ext2fs/alloc_stats.c
+++ b/lib/ext2fs/alloc_stats.c
@@ -145,4 +145,20 @@ void ext2fs_block_alloc_stats_range(ext2_filsys fs, blk64_t blk,
 	}
 	ext2fs_mark_super_dirty(fs);
 	ext2fs_mark_bb_dirty(fs);
+	if (fs->block_alloc_stats_range)
+		(fs->block_alloc_stats_range)(fs, blk, num, inuse);
+}
+
+void ext2fs_set_block_alloc_stats_range_callback(ext2_filsys fs,
+	void (*func)(ext2_filsys fs, blk64_t blk,
+				    blk_t num, int inuse),
+	void (**old)(ext2_filsys fs, blk64_t blk,
+				    blk_t num, int inuse))
+{
+	if (!fs || fs->magic != EXT2_ET_MAGIC_EXT2FS_FILSYS)
+		return;
+	if (old)
+		*old = fs->block_alloc_stats_range;
+
+	fs->block_alloc_stats_range = func;
 }
diff --git a/lib/ext2fs/ext2fs.h b/lib/ext2fs/ext2fs.h
index 8d3ade8..402a19b 100644
--- a/lib/ext2fs/ext2fs.h
+++ b/lib/ext2fs/ext2fs.h
@@ -285,6 +285,12 @@ struct struct_ext2_filsys {
 
 	io_channel			journal_io;
 	char				*journal_name;
+
+	/* New block range allocation hooks */
+	errcode_t (*new_range)(ext2_filsys fs, int flags, blk64_t goal,
+			       blk64_t len, blk64_t *pblk, blk64_t *plen);
+	void (*block_alloc_stats_range)(ext2_filsys fs, blk64_t blk, blk_t num,
+					int inuse);
 };
 
 #if EXT2_FLAT_INCLUDES
@@ -696,6 +702,16 @@ extern void ext2fs_set_alloc_block_callback(ext2_filsys fs,
 					    errcode_t (**old)(ext2_filsys fs,
 							      blk64_t goal,
 							      blk64_t *ret));
+extern void ext2fs_set_new_range_callback(ext2_filsys fs,
+	errcode_t (*func)(ext2_filsys fs, int flags, blk64_t goal,
+			       blk64_t len, blk64_t *pblk, blk64_t *plen),
+	errcode_t (**old)(ext2_filsys fs, int flags, blk64_t goal,
+			       blk64_t len, blk64_t *pblk, blk64_t *plen));
+extern void ext2fs_set_block_alloc_stats_range_callback(ext2_filsys fs,
+	void (*func)(ext2_filsys fs, blk64_t blk,
+				    blk_t num, int inuse),
+	void (**old)(ext2_filsys fs, blk64_t blk,
+				    blk_t num, int inuse));
 blk64_t ext2fs_find_inode_goal(ext2_filsys fs, ext2_ino_t ino);
 #define EXT2_NEWRANGE_FIXED_GOAL	(0x1)
 #define EXT2_NEWRANGE_MIN_LENGTH	(0x2)



* [PATCH 25/34] libext2fs: implement fallocate
  2014-09-13 22:11 [PATCH 00/34] e2fsprogs Summer 2014 patchbomb, part 6 Darrick J. Wong
                   ` (23 preceding siblings ...)
  2014-09-13 22:13 ` [PATCH 24/34] libext2fs: add new hooks to support large allocations Darrick J. Wong
@ 2014-09-13 22:14 ` Darrick J. Wong
  2014-09-13 22:14 ` [PATCH 26/34] libext2fs: use fallocate for creating journals and hugefiles Darrick J. Wong
                   ` (8 subsequent siblings)
  33 siblings, 0 replies; 67+ messages in thread
From: Darrick J. Wong @ 2014-09-13 22:14 UTC (permalink / raw)
  To: tytso, darrick.wong; +Cc: linux-ext4

Create a library function to perform fallocation on arbitrary files,
and wire up a few users for this function.  This is a bit more intense
than Ted's original mk_hugefiles implementation since we have to honor
any blocks that may already be allocated to the file.
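
To illustrate what "honoring already-allocated blocks" involves, here is a
rough standalone sketch of walking a sorted extent list and collecting the
unmapped runs inside a requested range.  The toy_extent and toy_hole types
and toy_find_holes() are hypothetical names for this example only; the
patch itself walks the real extent tree through an extent handle.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct toy_extent { uint64_t lblk; uint64_t len; };	/* a mapped run */
struct toy_hole   { uint64_t start; uint64_t len; };	/* an unmapped run */

/*
 * ext[] must be sorted by lblk and non-overlapping.  Record every unmapped
 * run inside [start, start + len) and return how many were written
 * (at most max_holes).
 */
static size_t toy_find_holes(const struct toy_extent *ext, size_t n_ext,
			     uint64_t start, uint64_t len,
			     struct toy_hole *holes, size_t max_holes)
{
	uint64_t pos = start, end = start + len;
	size_t i, n = 0;

	for (i = 0; i < n_ext && pos < end && n < max_holes; i++) {
		uint64_t ext_end = ext[i].lblk + ext[i].len;

		if (ext_end <= pos)
			continue;	/* extent lies wholly before pos */
		if (ext[i].lblk > pos) {
			/* Gap between pos and this extent (or range end). */
			uint64_t hole_end = ext[i].lblk < end ?
					    ext[i].lblk : end;

			holes[n].start = pos;
			holes[n].len = hole_end - pos;
			n++;
		}
		pos = ext_end;
	}

	/* Anything left after the last extent is also unmapped. */
	if (pos < end && n < max_holes) {
		holes[n].start = pos;
		holes[n].len = end - pos;
		n++;
	}
	return n;
}
```

Each hole found this way corresponds to one run that the fallocate code
would then try to fill by growing a neighboring extent or by allocating a
new one.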

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
 lib/ext2fs/Makefile.in |    8 
 lib/ext2fs/bmap.c      |    2 
 lib/ext2fs/ext2fs.h    |   10 +
 lib/ext2fs/fallocate.c |  852 ++++++++++++++++++++++++++++++++++++++++++++++++
 4 files changed, 871 insertions(+), 1 deletion(-)
 create mode 100644 lib/ext2fs/fallocate.c


diff --git a/lib/ext2fs/Makefile.in b/lib/ext2fs/Makefile.in
index 343d5d0..bc0dc8a 100644
--- a/lib/ext2fs/Makefile.in
+++ b/lib/ext2fs/Makefile.in
@@ -79,6 +79,7 @@ OBJS= $(DEBUGFS_LIB_OBJS) $(RESIZE_LIB_OBJS) $(E2IMAGE_LIB_OBJS) \
 	expanddir.o \
 	ext_attr.o \
 	extent.o \
+	fallocate.o \
 	fileio.o \
 	finddev.o \
 	flushb.o \
@@ -763,6 +764,13 @@ extent.o: $(srcdir)/extent.c $(top_builddir)/lib/config.h \
  $(top_srcdir)/lib/et/com_err.h $(srcdir)/ext2_io.h \
  $(top_builddir)/lib/ext2fs/ext2_err.h $(srcdir)/ext2_ext_attr.h \
  $(srcdir)/bitops.h $(srcdir)/e2image.h
+fallocate.o: $(srcdir)/fallocate.c $(top_builddir)/lib/config.h \
+ $(top_builddir)/lib/dirpaths.h $(srcdir)/ext2_fs.h \
+ $(top_builddir)/lib/ext2fs/ext2_types.h $(srcdir)/ext2fsP.h \
+ $(srcdir)/ext2fs.h $(srcdir)/ext2_fs.h $(srcdir)/ext3_extents.h \
+ $(top_srcdir)/lib/et/com_err.h $(srcdir)/ext2_io.h \
+ $(top_builddir)/lib/ext2fs/ext2_err.h $(srcdir)/ext2_ext_attr.h \
+ $(srcdir)/bitops.h $(srcdir)/e2image.h
 fileio.o: $(srcdir)/fileio.c $(top_builddir)/lib/config.h \
  $(top_builddir)/lib/dirpaths.h $(srcdir)/ext2_fs.h \
  $(top_builddir)/lib/ext2fs/ext2_types.h $(srcdir)/ext2fs.h \
diff --git a/lib/ext2fs/bmap.c b/lib/ext2fs/bmap.c
index 7623052..c1a8931 100644
--- a/lib/ext2fs/bmap.c
+++ b/lib/ext2fs/bmap.c
@@ -67,7 +67,7 @@ static _BMAP_INLINE_ errcode_t block_ind_bmap(ext2_filsys fs, int flags,
 #endif
 
 	if (!b && (flags & BMAP_ALLOC)) {
-		b = nr ? ((blk_t *) block_buf)[nr-1] : 0;
+		b = nr ? ((blk_t *) block_buf)[nr-1] : ind;
 		retval = ext2fs_alloc_block(fs, b,
 					    block_buf + fs->blocksize, &b);
 		if (retval)
diff --git a/lib/ext2fs/ext2fs.h b/lib/ext2fs/ext2fs.h
index 402a19b..a279c9b 100644
--- a/lib/ext2fs/ext2fs.h
+++ b/lib/ext2fs/ext2fs.h
@@ -1258,6 +1258,16 @@ extern errcode_t ext2fs_extent_goto2(ext2_extent_handle_t handle,
 				     int leaf_level, blk64_t blk);
 extern errcode_t ext2fs_extent_fix_parents(ext2_extent_handle_t handle);
 
+/* fallocate.c */
+#define EXT2_FALLOCATE_ZERO_BLOCKS	(0x1)
+#define EXT2_FALLOCATE_FORCE_INIT	(0x2)
+#define EXT2_FALLOCATE_FORCE_UNINIT	(0x4)
+#define EXT2_FALLOCATE_INIT_BEYOND_EOF	(0x8)
+#define EXT2_FALLOCATE_ALL_FLAGS	(0xF)
+errcode_t ext2fs_fallocate(ext2_filsys fs, int flags, ext2_ino_t ino,
+			   struct ext2_inode *inode, blk64_t goal,
+			   blk64_t start, blk64_t len);
+
 /* fileio.c */
 extern errcode_t ext2fs_file_open2(ext2_filsys fs, ext2_ino_t ino,
 				   struct ext2_inode *inode,
diff --git a/lib/ext2fs/fallocate.c b/lib/ext2fs/fallocate.c
new file mode 100644
index 0000000..af0c1b6
--- /dev/null
+++ b/lib/ext2fs/fallocate.c
@@ -0,0 +1,852 @@
+/*
+ * fallocate.c -- Allocate large chunks of file.
+ *
+ * Copyright (C) 2014 Oracle.
+ *
+ * %Begin-Header%
+ * This file may be redistributed under the terms of the GNU Library
+ * General Public License, version 2.
+ * %End-Header%
+ */
+
+#include "config.h"
+
+#include "ext2_fs.h"
+#include "ext2fs.h"
+#define min(a, b) ((a) < (b) ? (a) : (b))
+
+#undef DEBUG
+
+#ifdef DEBUG
+# define dbg_printf(f, a...)  do {printf(f, ## a); fflush(stdout); } while (0)
+#else
+# define dbg_printf(f, a...)
+#endif
+
+/*
+ * Extent-based fallocate code.
+ *
+ * Find runs of unmapped logical blocks by starting at start and walking the
+ * extents until we reach the end of the range we want.
+ *
+ * For each run of unmapped blocks, try to find the extents on either side of
+ * the range.  If there's a left extent that can grow by at least a cluster and
+ * there are lblocks between start and the next lcluster after start, see if
+ * there's an implied cluster allocation; if so, zero the blocks (if the left
+ * extent is initialized) and adjust the extent.  Ditto for the blocks between
+ * the end of the last full lcluster and end, if there's a right extent.
+ *
+ * Try to attach as much as we can to the left extent, then try to attach as
+ * much as we can to the right extent.  For the remainder, try to allocate the
+ * whole range; map in whatever we get; and repeat until we're done.
+ *
+ * To attach to a left extent, figure out the maximum amount we can add to the
+ * extent and try to allocate that much, and append if successful.  To attach
+ * to a right extent, figure out the max we can add to the extent, try to
+ * allocate that much, and prepend if successful.
+ *
+ * We need an alloc_range function that tells us how much we can allocate given
+ * a maximum length and one of a suggested start, a fixed start, or a fixed end
+ * point.
+ *
+ * Every time we modify the extent tree we also need to update the block stats.
+ *
+ * At the end, update i_blocks and i_size appropriately.
+ */
+
+static void dbg_print_extent(const char *desc, struct ext2fs_extent *extent)
+{
+#ifdef DEBUG
+	if (desc)
+		printf("%s: ", desc);
+	printf("extent: lblk %llu--%llu, len %u, pblk %llu, flags: ",
+	       extent->e_lblk, extent->e_lblk + extent->e_len - 1,
+	       extent->e_len, extent->e_pblk);
+	if (extent->e_flags & EXT2_EXTENT_FLAGS_LEAF)
+		fputs("LEAF ", stdout);
+	if (extent->e_flags & EXT2_EXTENT_FLAGS_UNINIT)
+		fputs("UNINIT ", stdout);
+	if (extent->e_flags & EXT2_EXTENT_FLAGS_SECOND_VISIT)
+		fputs("2ND_VISIT ", stdout);
+	if (!extent->e_flags)
+		fputs("(none)", stdout);
+	fputc('\n', stdout);
+	fflush(stdout);
+#endif
+}
+
+static errcode_t claim_range(ext2_filsys fs, struct ext2_inode *inode,
+			     blk64_t blk, blk64_t len)
+{
+	blk64_t	clusters;
+
+	clusters = (len + EXT2FS_CLUSTER_RATIO(fs) - 1) /
+		   EXT2FS_CLUSTER_RATIO(fs);
+	ext2fs_block_alloc_stats_range(fs, blk,
+			clusters * EXT2FS_CLUSTER_RATIO(fs), +1);
+	return ext2fs_iblk_add_blocks(fs, inode, clusters);
+}
+
+static errcode_t ext_falloc_helper(ext2_filsys fs,
+				   int flags,
+				   ext2_ino_t ino,
+				   struct ext2_inode *inode,
+				   ext2_extent_handle_t handle,
+				   struct ext2fs_extent *left_ext,
+				   struct ext2fs_extent *right_ext,
+				   blk64_t range_start, blk64_t range_len,
+				   blk64_t alloc_goal)
+{
+	struct ext2fs_extent	newex, ex;
+	int			op;
+	blk64_t			fillable, pblk, plen, x, y;
+	blk64_t			eof_blk = 0, cluster_fill = 0;
+	errcode_t		err;
+	blk_t			max_extent_len, max_uninit_len, max_init_len;
+
+#ifdef DEBUG
+	printf("%s: ", __func__);
+	if (left_ext)
+		printf("left_ext=%llu--%llu, ", left_ext->e_lblk,
+		       left_ext->e_lblk + left_ext->e_len - 1);
+	if (right_ext)
+		printf("right_ext=%llu--%llu, ", right_ext->e_lblk,
+		       right_ext->e_lblk + right_ext->e_len - 1);
+	printf("start=%llu len=%llu, goal=%llu\n", range_start, range_len,
+	       alloc_goal);
+	fflush(stdout);
+#endif
+	/* Can't create initialized extents past EOF? */
+	if (!(flags & EXT2_FALLOCATE_INIT_BEYOND_EOF))
+		eof_blk = EXT2_I_SIZE(inode) / fs->blocksize;
+
+	/* The allocation goal must be as far into a cluster as range_start. */
+	alloc_goal = (alloc_goal & ~EXT2FS_CLUSTER_MASK(fs)) |
+		     (range_start & EXT2FS_CLUSTER_MASK(fs));
+
+	max_uninit_len = EXT_UNINIT_MAX_LEN & ~EXT2FS_CLUSTER_MASK(fs);
+	max_init_len = EXT_INIT_MAX_LEN & ~EXT2FS_CLUSTER_MASK(fs);
+
+	/* We must lengthen the left extent to the end of the cluster */
+	if (left_ext && EXT2FS_CLUSTER_RATIO(fs) > 1) {
+		/* How many more blocks can be attached to left_ext? */
+		if (left_ext->e_flags & EXT2_EXTENT_FLAGS_UNINIT)
+			fillable = max_uninit_len - left_ext->e_len;
+		else
+			fillable = max_init_len - left_ext->e_len;
+
+		if (fillable > range_len)
+			fillable = range_len;
+		if (fillable == 0)
+			goto expand_right;
+
+		/*
+		 * If range_start isn't on a cluster boundary, try an
+		 * implied cluster allocation for left_ext.
+		 */
+		cluster_fill = EXT2FS_CLUSTER_RATIO(fs) -
+			       (range_start & EXT2FS_CLUSTER_MASK(fs));
+		cluster_fill &= EXT2FS_CLUSTER_MASK(fs);
+		if (cluster_fill == 0)
+			goto expand_right;
+
+		if (cluster_fill > fillable)
+			cluster_fill = fillable;
+
+		/* Don't expand an initialized left_ext beyond EOF */
+		if (!(flags & EXT2_FALLOCATE_INIT_BEYOND_EOF)) {
+			x = left_ext->e_lblk + left_ext->e_len - 1;
+			dbg_printf("%s: lend=%llu newlend=%llu eofblk=%llu\n",
+				   __func__, x, x + cluster_fill, eof_blk);
+			if (eof_blk >= x && eof_blk <= x + cluster_fill)
+				cluster_fill = eof_blk - x;
+			if (cluster_fill == 0)
+				goto expand_right;
+		}
+
+		err = ext2fs_extent_goto(handle, left_ext->e_lblk);
+		if (err)
+			goto expand_right;
+		left_ext->e_len += cluster_fill;
+		range_start += cluster_fill;
+		range_len -= cluster_fill;
+		alloc_goal += cluster_fill;
+
+		dbg_print_extent("ext_falloc clus left+", left_ext);
+		err = ext2fs_extent_replace(handle, 0, left_ext);
+		if (err)
+			goto out;
+		err = ext2fs_extent_fix_parents(handle);
+		if (err)
+			goto out;
+
+		/* Zero blocks */
+		if (!(left_ext->e_flags & EXT2_EXTENT_FLAGS_UNINIT)) {
+			err = ext2fs_zero_blocks2(fs, left_ext->e_pblk +
+						  left_ext->e_len -
+						  cluster_fill, cluster_fill,
+						  NULL, NULL);
+			if (err)
+				goto out;
+		}
+	}
+
+expand_right:
+	/* We must lengthen the right extent to the beginning of the cluster */
+	if (right_ext && EXT2FS_CLUSTER_RATIO(fs) > 1) {
+		/* How much can we attach to right_ext? */
+		if (right_ext->e_flags & EXT2_EXTENT_FLAGS_UNINIT)
+			fillable = max_uninit_len - right_ext->e_len;
+		else
+			fillable = max_init_len - right_ext->e_len;
+
+		if (fillable > range_len)
+			fillable = range_len;
+		if (fillable == 0)
+			goto try_merge;
+
+		/*
+		 * If range_end isn't on a cluster boundary, try an implied
+		 * cluster allocation for right_ext.
+		 */
+		cluster_fill = right_ext->e_lblk & EXT2FS_CLUSTER_MASK(fs);
+		if (cluster_fill == 0)
+			goto try_merge;
+
+		err = ext2fs_extent_goto(handle, right_ext->e_lblk);
+		if (err)
+			goto out;
+
+		if (cluster_fill > fillable)
+			cluster_fill = fillable;
+		right_ext->e_lblk -= cluster_fill;
+		right_ext->e_pblk -= cluster_fill;
+		right_ext->e_len += cluster_fill;
+		range_len -= cluster_fill;
+
+		dbg_print_extent("ext_falloc clus right+", right_ext);
+		err = ext2fs_extent_replace(handle, 0, right_ext);
+		if (err)
+			goto out;
+		err = ext2fs_extent_fix_parents(handle);
+		if (err)
+			goto out;
+
+		/* Zero blocks if necessary */
+		if (!(right_ext->e_flags & EXT2_EXTENT_FLAGS_UNINIT)) {
+			err = ext2fs_zero_blocks2(fs, right_ext->e_pblk,
+						  cluster_fill, NULL, NULL);
+			if (err)
+				goto out;
+		}
+	}
+
+try_merge:
+	/* Merge both extents together, perhaps? */
+	if (left_ext && right_ext) {
+		/* Are the two extents mergeable? */
+		if ((left_ext->e_flags & EXT2_EXTENT_FLAGS_UNINIT) !=
+		    (right_ext->e_flags & EXT2_EXTENT_FLAGS_UNINIT))
+			goto try_left;
+
+		/* User requires init/uninit but extent is uninit/init. */
+		if (((flags & EXT2_FALLOCATE_FORCE_INIT) &&
+		     (left_ext->e_flags & EXT2_EXTENT_FLAGS_UNINIT)) ||
+		    ((flags & EXT2_FALLOCATE_FORCE_UNINIT) &&
+		     !(left_ext->e_flags & EXT2_EXTENT_FLAGS_UNINIT)))
+			goto try_left;
+
+		/*
+		 * Skip initialized extent unless user wants to zero blocks
+		 * and requires init extent.
+		 */
+		if (!(left_ext->e_flags & EXT2_EXTENT_FLAGS_UNINIT) &&
+		    (!(flags & EXT2_FALLOCATE_ZERO_BLOCKS) ||
+		     !(flags & EXT2_FALLOCATE_FORCE_INIT)))
+			goto try_left;
+
+		/* Will it even fit? */
+		x = left_ext->e_len + range_len + right_ext->e_len;
+		if (x > (left_ext->e_flags & EXT2_EXTENT_FLAGS_UNINIT ?
+				max_uninit_len : max_init_len))
+			goto try_left;
+
+		err = ext2fs_extent_goto(handle, left_ext->e_lblk);
+		if (err)
+			goto try_left;
+
+		/* Allocate blocks */
+		y = left_ext->e_pblk + left_ext->e_len;
+		err = ext2fs_new_range(fs, EXT2_NEWRANGE_FIXED_GOAL |
+				       EXT2_NEWRANGE_MIN_LENGTH, y,
+				       right_ext->e_pblk - y + 1, NULL,
+				       &pblk, &plen);
+		if (err)
+			goto try_left;
+		if (pblk + plen != right_ext->e_pblk)
+			goto try_left;
+		err = claim_range(fs, inode, pblk, plen);
+		if (err)
+			goto out;
+
+		/* Modify extents */
+		left_ext->e_len = x;
+		dbg_print_extent("ext_falloc merge", left_ext);
+		err = ext2fs_extent_replace(handle, 0, left_ext);
+		if (err)
+			goto out;
+		err = ext2fs_extent_fix_parents(handle);
+		if (err)
+			goto out;
+		err = ext2fs_extent_get(handle, EXT2_EXTENT_NEXT_LEAF, &newex);
+		if (err)
+			goto out;
+		err = ext2fs_extent_delete(handle, 0);
+		if (err)
+			goto out;
+		err = ext2fs_extent_fix_parents(handle);
+		if (err)
+			goto out;
+		*right_ext = *left_ext;
+
+		/* Zero blocks */
+		if (!(left_ext->e_flags & EXT2_EXTENT_FLAGS_UNINIT) &&
+		    (flags & EXT2_FALLOCATE_ZERO_BLOCKS)) {
+			err = ext2fs_zero_blocks2(fs, pblk, plen,
+						  NULL, NULL);
+			if (err)
+				goto out;
+		}
+
+		return 0;
+	}
+
+try_left:
+	/* Extend the left extent */
+	if (left_ext) {
+		/* How many more blocks can be attached to left_ext? */
+		if (left_ext->e_flags & EXT2_EXTENT_FLAGS_UNINIT)
+			fillable = max_uninit_len - left_ext->e_len;
+		else if (flags & EXT2_FALLOCATE_ZERO_BLOCKS)
+			fillable = max_init_len - left_ext->e_len;
+		else
+			fillable = 0;
+
+		/* User requires init/uninit but extent is uninit/init. */
+		if (((flags & EXT2_FALLOCATE_FORCE_INIT) &&
+		     (left_ext->e_flags & EXT2_EXTENT_FLAGS_UNINIT)) ||
+		    ((flags & EXT2_FALLOCATE_FORCE_UNINIT) &&
+		     !(left_ext->e_flags & EXT2_EXTENT_FLAGS_UNINIT)))
+			goto try_right;
+
+		if (fillable > range_len)
+			fillable = range_len;
+
+		/* Don't expand an initialized left_ext beyond EOF */
+		x = left_ext->e_lblk + left_ext->e_len - 1;
+		if (!(flags & EXT2_FALLOCATE_INIT_BEYOND_EOF)) {
+			dbg_printf("%s: lend=%llu newlend=%llu eofblk=%llu\n",
+				   __func__, x, x + fillable, eof_blk);
+			if (eof_blk >= x && eof_blk <= x + fillable)
+				fillable = eof_blk - x;
+		}
+
+		if (fillable == 0)
+			goto try_right;
+
+		/* Test if the right edge of the range is already mapped? */
+		if (EXT2FS_CLUSTER_RATIO(fs) > 1) {
+			err = ext2fs_map_cluster_block(fs, ino, inode,
+					x + fillable, &pblk);
+			if (err)
+				goto out;
+			if (pblk)
+				fillable -= 1 + ((x + fillable)
+						 & EXT2FS_CLUSTER_MASK(fs));
+			if (fillable == 0)
+				goto try_right;
+		}
+
+		/* Allocate range of blocks */
+		x = left_ext->e_pblk + left_ext->e_len;
+		err = ext2fs_new_range(fs, EXT2_NEWRANGE_FIXED_GOAL |
+				EXT2_NEWRANGE_MIN_LENGTH,
+				x, fillable, NULL, &pblk, &plen);
+		if (err)
+			goto try_right;
+		err = claim_range(fs, inode, pblk, plen);
+		if (err)
+			goto out;
+
+		/* Modify left_ext */
+		err = ext2fs_extent_goto(handle, left_ext->e_lblk);
+		if (err)
+			goto out;
+		range_start += plen;
+		range_len -= plen;
+		left_ext->e_len += plen;
+		dbg_print_extent("ext_falloc left+", left_ext);
+		err = ext2fs_extent_replace(handle, 0, left_ext);
+		if (err)
+			goto out;
+		err = ext2fs_extent_fix_parents(handle);
+		if (err)
+			goto out;
+
+		/* Zero blocks if necessary */
+		if (!(left_ext->e_flags & EXT2_EXTENT_FLAGS_UNINIT) &&
+		    (flags & EXT2_FALLOCATE_ZERO_BLOCKS)) {
+			err = ext2fs_zero_blocks2(fs, pblk, plen, NULL, NULL);
+			if (err)
+				goto out;
+		}
+	}
+
+try_right:
+	/* Extend the right extent */
+	if (right_ext) {
+		/* How much can we attach to right_ext? */
+		if (right_ext->e_flags & EXT2_EXTENT_FLAGS_UNINIT)
+			fillable = max_uninit_len - right_ext->e_len;
+		else if (flags & EXT2_FALLOCATE_ZERO_BLOCKS)
+			fillable = max_init_len - right_ext->e_len;
+		else
+			fillable = 0;
+
+		/* User requires init/uninit but extent is uninit/init. */
+		if (((flags & EXT2_FALLOCATE_FORCE_INIT) &&
+		     (right_ext->e_flags & EXT2_EXTENT_FLAGS_UNINIT)) ||
+		    ((flags & EXT2_FALLOCATE_FORCE_UNINIT) &&
+		     !(right_ext->e_flags & EXT2_EXTENT_FLAGS_UNINIT)))
+			goto try_anywhere;
+
+		if (fillable > range_len)
+			fillable = range_len;
+		if (fillable == 0)
+			goto try_anywhere;
+
+		/* Test if the left edge of the range is already mapped? */
+		if (EXT2FS_CLUSTER_RATIO(fs) > 1) {
+			err = ext2fs_map_cluster_block(fs, ino, inode,
+					right_ext->e_lblk - fillable, &pblk);
+			if (err)
+				goto out;
+			if (pblk)
+				fillable -= EXT2FS_CLUSTER_RATIO(fs) -
+						((right_ext->e_lblk - fillable)
+						 & EXT2FS_CLUSTER_MASK(fs));
+			if (fillable == 0)
+				goto try_anywhere;
+		}
+
+		/*
+		 * FIXME: It would be nice if we could handle allocating a
+		 * variable range from a fixed end point instead of just
+		 * skipping to the general allocator if the whole range is
+		 * unavailable.
+		 */
+		err = ext2fs_new_range(fs, EXT2_NEWRANGE_FIXED_GOAL |
+				EXT2_NEWRANGE_MIN_LENGTH,
+				right_ext->e_pblk - fillable,
+				fillable, NULL, &pblk, &plen);
+		if (err)
+			goto try_anywhere;
+		err = claim_range(fs, inode,
+			      pblk & ~EXT2FS_CLUSTER_MASK(fs),
+			      plen + (pblk & EXT2FS_CLUSTER_MASK(fs)));
+		if (err)
+			goto out;
+
+		/* Modify right_ext */
+		err = ext2fs_extent_goto(handle, right_ext->e_lblk);
+		if (err)
+			goto out;
+		range_len -= plen;
+		right_ext->e_lblk -= plen;
+		right_ext->e_pblk -= plen;
+		right_ext->e_len += plen;
+		dbg_print_extent("ext_falloc right+", right_ext);
+		err = ext2fs_extent_replace(handle, 0, right_ext);
+		if (err)
+			goto out;
+		err = ext2fs_extent_fix_parents(handle);
+		if (err)
+			goto out;
+
+		/* Zero blocks if necessary */
+		if (!(right_ext->e_flags & EXT2_EXTENT_FLAGS_UNINIT) &&
+		    (flags & EXT2_FALLOCATE_ZERO_BLOCKS)) {
+			err = ext2fs_zero_blocks2(fs, pblk,
+					plen + cluster_fill, NULL, NULL);
+			if (err)
+				goto out;
+		}
+	}
+
+try_anywhere:
+	/* Try implied cluster alloc on the left and right ends */
+	if (range_len > 0 && (range_start & EXT2FS_CLUSTER_MASK(fs))) {
+		cluster_fill = EXT2FS_CLUSTER_RATIO(fs) -
+			       (range_start & EXT2FS_CLUSTER_MASK(fs));
+		cluster_fill &= EXT2FS_CLUSTER_MASK(fs);
+		if (cluster_fill > range_len)
+			cluster_fill = range_len;
+		newex.e_lblk = range_start;
+		err = ext2fs_map_cluster_block(fs, ino, inode, newex.e_lblk,
+					       &pblk);
+		if (err)
+			goto out;
+		if (pblk == 0)
+			goto try_right_implied;
+		newex.e_pblk = pblk;
+		newex.e_len = cluster_fill;
+		newex.e_flags = (flags & EXT2_FALLOCATE_FORCE_INIT ? 0 :
+				 EXT2_EXTENT_FLAGS_UNINIT);
+		dbg_print_extent("ext_falloc iclus left+", &newex);
+		ext2fs_extent_goto(handle, newex.e_lblk);
+		err = ext2fs_extent_get(handle, EXT2_EXTENT_CURRENT,
+					&ex);
+		if (err == EXT2_ET_NO_CURRENT_NODE)
+			ex.e_lblk = 0;
+		else if (err)
+			goto out;
+
+		if (ex.e_lblk > newex.e_lblk)
+			op = 0; /* insert before */
+		else
+			op = EXT2_EXTENT_INSERT_AFTER;
+		dbg_printf("%s: inserting %s lblk %llu newex=%llu\n",
+			   __func__, op ? "after" : "before", ex.e_lblk,
+			   newex.e_lblk);
+		err = ext2fs_extent_insert(handle, op, &newex);
+		if (err)
+			goto out;
+		err = ext2fs_extent_fix_parents(handle);
+		if (err)
+			goto out;
+
+		if (!(newex.e_flags & EXT2_EXTENT_FLAGS_UNINIT) &&
+		    (flags & EXT2_FALLOCATE_ZERO_BLOCKS)) {
+			err = ext2fs_zero_blocks2(fs, newex.e_pblk,
+						  newex.e_len, NULL, NULL);
+			if (err)
+				goto out;
+		}
+
+		range_start += cluster_fill;
+		range_len -= cluster_fill;
+	}
+
+try_right_implied:
+	y = range_start + range_len;
+	if (range_len > 0 && (y & EXT2FS_CLUSTER_MASK(fs))) {
+		cluster_fill = y & EXT2FS_CLUSTER_MASK(fs);
+		if (cluster_fill > range_len)
+			cluster_fill = range_len;
+		newex.e_lblk = y & ~EXT2FS_CLUSTER_MASK(fs);
+		err = ext2fs_map_cluster_block(fs, ino, inode, newex.e_lblk,
+					       &pblk);
+		if (err)
+			goto out;
+		if (pblk == 0)
+			goto no_implied;
+		newex.e_pblk = pblk;
+		newex.e_len = cluster_fill;
+		newex.e_flags = (flags & EXT2_FALLOCATE_FORCE_INIT ? 0 :
+				 EXT2_EXTENT_FLAGS_UNINIT);
+		dbg_print_extent("ext_falloc iclus right+", &newex);
+		ext2fs_extent_goto(handle, newex.e_lblk);
+		err = ext2fs_extent_get(handle, EXT2_EXTENT_CURRENT,
+					&ex);
+		if (err == EXT2_ET_NO_CURRENT_NODE)
+			ex.e_lblk = 0;
+		else if (err)
+			goto out;
+
+		if (ex.e_lblk > newex.e_lblk)
+			op = 0; /* insert before */
+		else
+			op = EXT2_EXTENT_INSERT_AFTER;
+		dbg_printf("%s: inserting %s lblk %llu newex=%llu\n",
+			   __func__, op ? "after" : "before", ex.e_lblk,
+			   newex.e_lblk);
+		err = ext2fs_extent_insert(handle, op, &newex);
+		if (err)
+			goto out;
+		err = ext2fs_extent_fix_parents(handle);
+		if (err)
+			goto out;
+
+		if (!(newex.e_flags & EXT2_EXTENT_FLAGS_UNINIT) &&
+		    (flags & EXT2_FALLOCATE_ZERO_BLOCKS)) {
+			err = ext2fs_zero_blocks2(fs, newex.e_pblk,
+						  newex.e_len, NULL, NULL);
+			if (err)
+				goto out;
+		}
+
+		range_len -= cluster_fill;
+	}
+
+no_implied:
+	if (range_len == 0)
+		return 0;
+
+	newex.e_lblk = range_start;
+	if (flags & EXT2_FALLOCATE_FORCE_INIT) {
+		max_extent_len = max_init_len;
+		newex.e_flags = 0;
+	} else {
+		max_extent_len = max_uninit_len;
+		newex.e_flags = EXT2_EXTENT_FLAGS_UNINIT;
+	}
+	pblk = alloc_goal;
+	y = range_len;
+	for (x = 0; x < y;) {
+		cluster_fill = newex.e_lblk & EXT2FS_CLUSTER_MASK(fs);
+		fillable = min(range_len + cluster_fill, max_extent_len);
+		err = ext2fs_new_range(fs, 0, pblk & ~EXT2FS_CLUSTER_MASK(fs),
+				       fillable,
+				       NULL, &pblk, &plen);
+		if (err)
+			goto out;
+		err = claim_range(fs, inode, pblk, plen);
+		if (err)
+			goto out;
+
+		/* Create extent */
+		newex.e_pblk = pblk + cluster_fill;
+		newex.e_len = plen - cluster_fill;
+		dbg_print_extent("ext_falloc create", &newex);
+		ext2fs_extent_goto(handle, newex.e_lblk);
+		err = ext2fs_extent_get(handle, EXT2_EXTENT_CURRENT,
+					&ex);
+		if (err == EXT2_ET_NO_CURRENT_NODE)
+			ex.e_lblk = 0;
+		else if (err)
+			goto out;
+
+		if (ex.e_lblk > newex.e_lblk)
+			op = 0; /* insert before */
+		else
+			op = EXT2_EXTENT_INSERT_AFTER;
+		dbg_printf("%s: inserting %s lblk %llu newex=%llu\n",
+			   __func__, op ? "after" : "before", ex.e_lblk,
+			   newex.e_lblk);
+		err = ext2fs_extent_insert(handle, op, &newex);
+		if (err)
+			goto out;
+		err = ext2fs_extent_fix_parents(handle);
+		if (err)
+			goto out;
+
+		if (!(newex.e_flags & EXT2_EXTENT_FLAGS_UNINIT) &&
+		    (flags & EXT2_FALLOCATE_ZERO_BLOCKS)) {
+			err = ext2fs_zero_blocks2(fs, pblk, plen, NULL, NULL);
+			if (err)
+				goto out;
+		}
+
+		/* Update variables at end of loop */
+		x += plen - cluster_fill;
+		range_len -= plen - cluster_fill;
+		newex.e_lblk += plen - cluster_fill;
+		pblk += plen - cluster_fill;
+		if (pblk >= ext2fs_blocks_count(fs->super))
+			pblk = fs->super->s_first_data_block;
+	}
+
+out:
+	return err;
+}
+
+static errcode_t extent_fallocate(ext2_filsys fs, int flags, ext2_ino_t ino,
+				      struct ext2_inode *inode, blk64_t goal,
+				      blk64_t start, blk64_t len)
+{
+	ext2_extent_handle_t	handle;
+	struct ext2fs_extent	left_extent, right_extent;
+	struct ext2fs_extent	*left_adjacent, *right_adjacent;
+	errcode_t		err;
+	blk64_t			range_start, range_end = 0, end, next;
+	blk64_t			count, goal_distance;
+
+	end = start + len - 1;
+	err = ext2fs_extent_open2(fs, ino, inode, &handle);
+	if (err)
+		return err;
+
+	/*
+	 * Find the extent closest to the start of the alloc range.  We don't
+	 * check the return value because _goto() sets the current node to the
+	 * next-lowest extent if 'start' is in a hole; or the next-highest
+	 * extent if there aren't any lower ones; or doesn't set a current node
+	 * if there was a real error reading the extent tree.  In that case,
+	 * _get() will error out.
+	 */
+start_again:
+	ext2fs_extent_goto(handle, start);
+	err = ext2fs_extent_get(handle, EXT2_EXTENT_CURRENT, &left_extent);
+	if (err == EXT2_ET_NO_CURRENT_NODE) {
+		blk64_t max_blocks = ext2fs_blocks_count(fs->super);
+
+		if (goal == ~0ULL)
+			goal = ext2fs_find_inode_goal(fs, ino);
+		err = ext2fs_find_first_zero_block_bitmap2(fs->block_map,
+						goal, max_blocks - 1, &goal);
+		goal += start;
+		err = ext_falloc_helper(fs, flags, ino, inode, handle, NULL,
+					NULL, start, len, goal);
+		goto errout;
+	} else if (err)
+		goto errout;
+
+	dbg_print_extent("ext_falloc initial", &left_extent);
+	next = left_extent.e_lblk + left_extent.e_len;
+	if (left_extent.e_lblk > start) {
+		/* The nearest extent we found was beyond start??? */
+		goal = left_extent.e_pblk - (left_extent.e_lblk - start);
+		err = ext_falloc_helper(fs, flags, ino, inode, handle, NULL,
+					&left_extent, start,
+					left_extent.e_lblk - start, goal);
+		if (err)
+			goto errout;
+
+		goto start_again;
+	} else if (next >= start) {
+		range_start = next;
+		left_adjacent = &left_extent;
+	} else {
+		range_start = start;
+		left_adjacent = NULL;
+	}
+	goal = left_extent.e_pblk + (range_start - left_extent.e_lblk);
+	goal_distance = range_start - next;
+
+	do {
+		err = ext2fs_extent_get(handle, EXT2_EXTENT_NEXT_LEAF,
+					   &right_extent);
+		dbg_printf("%s: ino=%d get next =%d\n", __func__, ino,
+			   (int)err);
+		dbg_print_extent("ext_falloc next", &right_extent);
+		/* Stop if we've seen this extent before */
+		if (!err && right_extent.e_lblk <= left_extent.e_lblk)
+			err = EXT2_ET_EXTENT_NO_NEXT;
+
+		if (err && err != EXT2_ET_EXTENT_NO_NEXT)
+			goto errout;
+		if (err == EXT2_ET_EXTENT_NO_NEXT ||
+		    right_extent.e_lblk > end + 1) {
+			range_end = end;
+			right_adjacent = NULL;
+		} else {
+			/* Handle right_extent.e_lblk <= end */
+			range_end = right_extent.e_lblk - 1;
+			right_adjacent = &right_extent;
+		}
+		if (err != EXT2_ET_EXTENT_NO_NEXT &&
+		    goal_distance > (range_end - right_extent.e_lblk)) {
+			goal = right_extent.e_pblk -
+					(right_extent.e_lblk - range_start);
+			goal_distance = range_end - right_extent.e_lblk;
+		}
+
+		dbg_printf("%s: ino=%d rstart=%llu rend=%llu\n", __func__, ino,
+			   range_start, range_end);
+		err = 0;
+		if (range_start <= range_end) {
+			count = range_end - range_start + 1;
+			err = ext_falloc_helper(fs, flags, ino, inode, handle,
+						left_adjacent, right_adjacent,
+						range_start, count, goal);
+			if (err)
+				goto errout;
+		}
+
+		if (range_end == end)
+			break;
+
+		err = ext2fs_extent_goto(handle, right_extent.e_lblk);
+		if (err)
+			goto errout;
+		next = right_extent.e_lblk + right_extent.e_len;
+		left_extent = right_extent;
+		left_adjacent = &left_extent;
+		range_start = next;
+		goal = left_extent.e_pblk + (range_start - left_extent.e_lblk);
+		goal_distance = range_start - next;
+	} while (range_end < end);
+
+errout:
+	ext2fs_extent_free(handle);
+	return err;
+}
+
+/*
+ * Map physical blocks to a range of logical blocks within a file.  The range
+ * of logical blocks is [start, start + len).  If there are already extents,
+ * this function will try to extend them; otherwise, it will try to map
+ * start as if logical block 0 points to goal.  If goal is ~0ULL, then the goal
+ * is calculated based on the inode group.
+ *
+ * Flags:
+ * - EXT2_FALLOCATE_ZERO_BLOCKS: Zero the blocks that are allocated.
+ * - EXT2_FALLOCATE_FORCE_INIT: Create only initialized extents.
+ * - EXT2_FALLOCATE_FORCE_UNINIT: Create only uninitialized extents.
+ * - EXT2_FALLOCATE_INIT_BEYOND_EOF: Create extents beyond EOF.
+ *
+ * If neither FORCE_INIT nor FORCE_UNINIT is specified, this function will
+ * try to expand any extents it finds, zeroing blocks as necessary.
+ */
+errcode_t ext2fs_fallocate(ext2_filsys fs, int flags, ext2_ino_t ino,
+			   struct ext2_inode *inode, blk64_t goal,
+			   blk64_t start, blk64_t len)
+{
+	struct ext2_inode	inode_buf;
+	blk64_t			blk, x;
+	errcode_t		err;
+
+	if (((flags & EXT2_FALLOCATE_FORCE_INIT) &&
+	    (flags & EXT2_FALLOCATE_FORCE_UNINIT)) ||
+	   (flags & ~EXT2_FALLOCATE_ALL_FLAGS))
+		return EXT2_ET_INVALID_ARGUMENT;
+
+	if (len > ext2fs_blocks_count(fs->super))
+		return EXT2_ET_BLOCK_ALLOC_FAIL;
+	else if (len == 0)
+		return 0;
+
+	/* Read inode structure if necessary */
+	if (!inode) {
+		err = ext2fs_read_inode(fs, ino, &inode_buf);
+		if (err)
+			return err;
+		inode = &inode_buf;
+	}
+	dbg_printf("%s: ino=%d start=%llu len=%llu goal=%llu\n", __func__, ino,
+		   start, len, goal);
+
+	if (inode->i_flags & EXT4_EXTENTS_FL) {
+		err = extent_fallocate(fs, flags, ino, inode, goal, start, len);
+		goto out;
+	}
+
+	/* XXX: Allocate a bunch of blocks the slow way */
+	for (blk = start; blk < start + len; blk++) {
+		err = ext2fs_bmap2(fs, ino, inode, NULL, 0, blk, 0, &x);
+		if (err)
+			return err;
+		if (x)
+			continue;
+
+		err = ext2fs_bmap2(fs, ino, inode, NULL,
+				   BMAP_ALLOC | BMAP_UNINIT, blk, 0, &x);
+		if (err)
+			return err;
+	}
+
+out:
+	if (inode == &inode_buf)
+		ext2fs_write_inode(fs, ino, inode);
+	return err;
+}
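
The bigalloc arithmetic in the main allocation loop of ext_falloc_helper()
can be looked at in isolation.  What follows is only a minimal sketch,
assuming a power-of-two cluster ratio; cluster_mask(), align_goal(),
trim_alloc() and struct span are hypothetical names made up for this
illustration and are not part of libext2fs.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for EXT2FS_CLUSTER_MASK(fs): cluster ratio minus 1. */
static uint64_t cluster_mask(uint64_t ratio)
{
	/* Masking only works when the ratio is a power of two. */
	assert(ratio != 0 && (ratio & (ratio - 1)) == 0);
	return ratio - 1;
}

/* Round a physical goal block down to the start of its cluster. */
static uint64_t align_goal(uint64_t pblk, uint64_t ratio)
{
	return pblk & ~cluster_mask(ratio);
}

/* Illustrative result type: where the new extent starts and how long it is. */
struct span {
	uint64_t pblk;	/* first physical block mapped by the extent */
	uint64_t len;	/* number of blocks mapped by the extent */
};

/*
 * Given a cluster-aligned allocation of plen blocks at pblk, skip the
 * blocks that precede lblk inside its cluster (cluster_fill), as the
 * "Create extent" step in the loop does.
 */
static struct span trim_alloc(uint64_t lblk, uint64_t pblk, uint64_t plen,
			      uint64_t ratio)
{
	uint64_t cluster_fill = lblk & cluster_mask(ratio);
	struct span s;

	s.pblk = pblk + cluster_fill;
	s.len = plen - cluster_fill;
	return s;
}
```

For example, with a cluster ratio of 16, logical block 21 sits 5 blocks into
its cluster, so a 32-block allocation at physical block 4096 gives an extent
that starts at physical block 4101 and is 27 blocks long.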



* [PATCH 26/34] libext2fs: use fallocate for creating journals and hugefiles
  2014-09-13 22:11 [PATCH 00/34] e2fsprogs Summer 2014 patchbomb, part 6 Darrick J. Wong
                   ` (24 preceding siblings ...)
  2014-09-13 22:14 ` [PATCH 25/34] libext2fs: implement fallocate Darrick J. Wong
@ 2014-09-13 22:14 ` Darrick J. Wong
  2014-09-13 22:14 ` [PATCH 27/34] debugfs: implement fallocate Darrick J. Wong
                   ` (7 subsequent siblings)
  33 siblings, 0 replies; 67+ messages in thread
From: Darrick J. Wong @ 2014-09-13 22:14 UTC (permalink / raw)
  To: tytso, darrick.wong; +Cc: linux-ext4

Use the new fallocate API for creating the journal and the mk_hugefile
feature.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
 lib/ext2fs/mkjournal.c |  134 +++++++-----------------------------------------
 misc/mk_hugefiles.c    |   96 ++++------------------------------
 2 files changed, 30 insertions(+), 200 deletions(-)
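
The way write_journal_inode() now picks its fallocate flags can be sketched
on its own.  This is an illustration only: the SKETCH_* constants below are
assumed values chosen for the example (the real definitions live in
ext2fs.h), and journal_falloc_flags() is a hypothetical helper, not a
function added by this patch.

```c
/* Assumed flag values, for illustration only; see ext2fs.h for the real ones. */
#define SKETCH_FALLOCATE_ZERO_BLOCKS	0x1
#define SKETCH_FALLOCATE_FORCE_INIT	0x2
#define SKETCH_MKJOURNAL_LAZYINIT	0x2

/*
 * The journal always gets initialized extents; the blocks are zeroed
 * unless the caller asked for lazy journal initialization.
 */
static int journal_falloc_flags(int mkjournal_flags)
{
	int falloc_flags = SKETCH_FALLOCATE_FORCE_INIT;

	if (!(mkjournal_flags & SKETCH_MKJOURNAL_LAZYINIT))
		falloc_flags |= SKETCH_FALLOCATE_ZERO_BLOCKS;
	return falloc_flags;
}
```

mk_hugefile() in this patch passes both FORCE_INIT and ZERO_BLOCKS
unconditionally, which corresponds to the non-lazy case here.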


diff --git a/lib/ext2fs/mkjournal.c b/lib/ext2fs/mkjournal.c
index 3cc15a9..7443652 100644
--- a/lib/ext2fs/mkjournal.c
+++ b/lib/ext2fs/mkjournal.c
@@ -215,89 +215,6 @@ errcode_t ext2fs_zero_blocks(ext2_filsys fs, blk_t blk, int num,
 }
 
 /*
- * Helper function for creating the journal using direct I/O routines
- */
-struct mkjournal_struct {
-	int		num_blocks;
-	int		newblocks;
-	blk64_t		goal;
-	blk64_t		blk_to_zero;
-	int		zero_count;
-	int		flags;
-	char		*buf;
-	errcode_t	err;
-};
-
-static int mkjournal_proc(ext2_filsys	fs,
-			  blk64_t	*blocknr,
-			  e2_blkcnt_t	blockcnt,
-			  blk64_t	ref_block EXT2FS_ATTR((unused)),
-			  int		ref_offset EXT2FS_ATTR((unused)),
-			  void		*priv_data)
-{
-	struct mkjournal_struct *es = (struct mkjournal_struct *) priv_data;
-	blk64_t	new_blk;
-	errcode_t	retval;
-
-	if (*blocknr) {
-		es->goal = *blocknr;
-		return 0;
-	}
-	if (blockcnt &&
-	    (EXT2FS_B2C(fs, es->goal) == EXT2FS_B2C(fs, es->goal+1)))
-		new_blk = es->goal+1;
-	else {
-		es->goal &= ~EXT2FS_CLUSTER_MASK(fs);
-		retval = ext2fs_new_block2(fs, es->goal, 0, &new_blk);
-		if (retval) {
-			es->err = retval;
-			return BLOCK_ABORT;
-		}
-		ext2fs_block_alloc_stats2(fs, new_blk, +1);
-		es->newblocks++;
-	}
-	if (blockcnt >= 0)
-		es->num_blocks--;
-
-	retval = 0;
-	if (blockcnt <= 0)
-		retval = io_channel_write_blk64(fs->io, new_blk, 1, es->buf);
-	else if (!(es->flags & EXT2_MKJOURNAL_LAZYINIT)) {
-		if (es->zero_count) {
-			if ((es->blk_to_zero + es->zero_count == new_blk) &&
-			    (es->zero_count < 1024))
-				es->zero_count++;
-			else {
-				retval = ext2fs_zero_blocks2(fs,
-							     es->blk_to_zero,
-							     es->zero_count,
-							     0, 0);
-				es->zero_count = 0;
-			}
-		}
-		if (es->zero_count == 0) {
-			es->blk_to_zero = new_blk;
-			es->zero_count = 1;
-		}
-	}
-
-	if (blockcnt == 0)
-		memset(es->buf, 0, fs->blocksize);
-
-	if (retval) {
-		es->err = retval;
-		return BLOCK_ABORT;
-	}
-	*blocknr = es->goal = new_blk;
-
-	if (es->num_blocks == 0)
-		return (BLOCK_CHANGED | BLOCK_ABORT);
-	else
-		return BLOCK_CHANGED;
-
-}
-
-/*
  * Calculate the initial goal block to be roughly at the middle of the
  * filesystem.  Pick a group that has the largest number of free
  * blocks.
@@ -338,7 +255,8 @@ static errcode_t write_journal_inode(ext2_filsys fs, ext2_ino_t journal_ino,
 	errcode_t		retval;
 	struct ext2_inode	inode;
 	unsigned long long	inode_size;
-	struct mkjournal_struct	es;
+	int			falloc_flags = EXT2_FALLOCATE_FORCE_INIT;
+	blk64_t			zblk;
 
 	if ((retval = ext2fs_create_journal_superblock(fs, num_blocks, flags,
 						       &buf)))
@@ -355,40 +273,16 @@ static errcode_t write_journal_inode(ext2_filsys fs, ext2_ino_t journal_ino,
 		goto out2;
 	}
 
-	es.num_blocks = num_blocks;
-	es.newblocks = 0;
-	es.buf = buf;
-	es.err = 0;
-	es.flags = flags;
-	es.zero_count = 0;
-	es.goal = (goal != ~0ULL) ? goal : get_midpoint_journal_block(fs);
+	if (goal == ~0ULL)
+		goal = get_midpoint_journal_block(fs);
 
-	if (fs->super->s_feature_incompat & EXT3_FEATURE_INCOMPAT_EXTENTS) {
+	if (fs->super->s_feature_incompat & EXT3_FEATURE_INCOMPAT_EXTENTS)
 		inode.i_flags |= EXT4_EXTENTS_FL;
-		if ((retval = ext2fs_write_inode(fs, journal_ino, &inode)))
-			goto out2;
-	}
 
-	retval = ext2fs_block_iterate3(fs, journal_ino, BLOCK_FLAG_APPEND,
-				       0, mkjournal_proc, &es);
-	if (retval)
-		goto out2;
-	if (es.err) {
-		retval = es.err;
-		goto out2;
-	}
-	if (es.zero_count) {
-		retval = ext2fs_zero_blocks2(fs, es.blk_to_zero,
-					    es.zero_count, 0, 0);
-		if (retval)
-			goto out2;
-	}
-
-	if ((retval = ext2fs_read_inode(fs, journal_ino, &inode)))
-		goto out2;
+	if (!(flags & EXT2_MKJOURNAL_LAZYINIT))
+		falloc_flags |= EXT2_FALLOCATE_ZERO_BLOCKS;
 
 	inode_size = (unsigned long long)fs->blocksize * num_blocks;
-	ext2fs_iblk_add_blocks(fs, &inode, es.newblocks);
 	inode.i_mtime = inode.i_ctime = fs->now ? fs->now : time(0);
 	inode.i_links_count = 1;
 	inode.i_mode = LINUX_S_IFREG | 0600;
@@ -396,9 +290,21 @@ static errcode_t write_journal_inode(ext2_filsys fs, ext2_ino_t journal_ino,
 	if (retval)
 		goto out2;
 
+	retval = ext2fs_fallocate(fs, falloc_flags, journal_ino,
+				  &inode, goal, 0, num_blocks);
+	if (retval)
+		goto out2;
+
 	if ((retval = ext2fs_write_new_inode(fs, journal_ino, &inode)))
 		goto out2;
-	retval = 0;
+
+	retval = ext2fs_bmap2(fs, journal_ino, &inode, NULL, 0, 0, NULL, &zblk);
+	if (retval)
+		goto out2;
+
+	retval = io_channel_write_blk64(fs->io, zblk, 1, buf);
+	if (retval)
+		goto out2;
 
 	memcpy(fs->super->s_jnl_blocks, inode.i_block, EXT2_N_BLOCKS*4);
 	fs->super->s_jnl_blocks[15] = inode.i_size_high;
diff --git a/misc/mk_hugefiles.c b/misc/mk_hugefiles.c
index 3e4274c..5ac1114 100644
--- a/misc/mk_hugefiles.c
+++ b/misc/mk_hugefiles.c
@@ -258,12 +258,7 @@ static errcode_t mk_hugefile(ext2_filsys fs, blk64_t num,
 
 {
 	errcode_t		retval;
-	blk64_t			lblk, bend = 0;
-	__u64			size;
-	blk64_t			left;
-	blk64_t			count = 0;
 	struct ext2_inode	inode;
-	ext2_extent_handle_t	handle;
 
 	retval = ext2fs_new_inode(fs, 0, LINUX_S_IFREG, NULL, ino);
 	if (retval)
@@ -283,85 +278,20 @@ static errcode_t mk_hugefile(ext2_filsys fs, blk64_t num,
 
 	ext2fs_inode_alloc_stats2(fs, *ino, +1, 0);
 
-	retval = ext2fs_extent_open2(fs, *ino, &inode, &handle);
+	if (EXT2_HAS_INCOMPAT_FEATURE(fs->super,
+				      EXT3_FEATURE_INCOMPAT_EXTENTS))
+		inode.i_flags |= EXT4_EXTENTS_FL;
+	retval = ext2fs_fallocate(fs,
+				  EXT2_FALLOCATE_FORCE_INIT |
+				  EXT2_FALLOCATE_ZERO_BLOCKS,
+				  *ino, &inode, ~0ULL, 0, num);
 	if (retval)
 		return retval;
-
-	lblk = 0;
-	left = num ? num : 1;
-	while (left) {
-		blk64_t pblk, end;
-		blk64_t n = left;
-
-		retval =  ext2fs_find_first_zero_block_bitmap2(fs->block_map,
-			goal, ext2fs_blocks_count(fs->super) - 1, &end);
-		if (retval)
-			goto errout;
-		goal = end;
-
-		retval =  ext2fs_find_first_set_block_bitmap2(fs->block_map, goal,
-			       ext2fs_blocks_count(fs->super) - 1, &bend);
-		if (retval == ENOENT) {
-			bend = ext2fs_blocks_count(fs->super);
-			if (num == 0)
-				left = 0;
-		}
-		if (!num || bend - goal < left)
-			n = bend - goal;
-		pblk = goal;
-		if (num)
-			left -= n;
-		goal += n;
-		count += n;
-		ext2fs_block_alloc_stats_range(fs, pblk, n, +1);
-
-		if (zero_hugefile) {
-			blk64_t ret_blk;
-			retval = ext2fs_zero_blocks2(fs, pblk, n,
-						     &ret_blk, NULL);
-
-			if (retval)
-				com_err(program_name, retval,
-					_("while zeroing block %llu "
-					  "for hugefile"), ret_blk);
-		}
-
-		while (n) {
-			blk64_t l = n;
-			struct ext2fs_extent newextent;
-
-			if (l > EXT_INIT_MAX_LEN)
-				l = EXT_INIT_MAX_LEN;
-
-			newextent.e_len = l;
-			newextent.e_pblk = pblk;
-			newextent.e_lblk = lblk;
-			newextent.e_flags = 0;
-
-			retval = ext2fs_extent_insert(handle,
-					EXT2_EXTENT_INSERT_AFTER, &newextent);
-			if (retval)
-				return retval;
-			pblk += l;
-			lblk += l;
-			n -= l;
-		}
-	}
-
-	retval = ext2fs_read_inode(fs, *ino, &inode);
-	if (retval)
-		goto errout;
-
-	retval = ext2fs_iblk_add_blocks(fs, &inode,
-					count / EXT2FS_CLUSTER_RATIO(fs));
-	if (retval)
-		goto errout;
-	size = (__u64) count * fs->blocksize;
-	retval = ext2fs_inode_size_set(fs, &inode, size);
+	retval = ext2fs_inode_size_set(fs, &inode, num * fs->blocksize);
 	if (retval)
-		goto errout;
+		return retval;
 
-	retval = ext2fs_write_new_inode(fs, *ino, &inode);
+	retval = ext2fs_write_inode(fs, *ino, &inode);
 	if (retval)
 		goto errout;
 
@@ -379,13 +309,7 @@ retry:
 		goto retry;
 	}
 
-	if (retval)
-		goto errout;
-
 errout:
-	if (handle)
-		ext2fs_extent_free(handle);


* [PATCH 27/34] debugfs: implement fallocate
  2014-09-13 22:11 [PATCH 00/34] e2fsprogs Summer 2014 patchbomb, part 6 Darrick J. Wong
                   ` (25 preceding siblings ...)
  2014-09-13 22:14 ` [PATCH 26/34] libext2fs: use fallocate for creating journals and hugefiles Darrick J. Wong
@ 2014-09-13 22:14 ` Darrick J. Wong
  2014-09-13 22:14 ` [PATCH 28/34] tests: test debugfs punch command Darrick J. Wong
                   ` (6 subsequent siblings)
  33 siblings, 0 replies; 67+ messages in thread
From: Darrick J. Wong @ 2014-09-13 22:14 UTC (permalink / raw)
  To: tytso, darrick.wong; +Cc: linux-ext4

Implement a fallocate function for debugfs, and add some tests to
demonstrate that it works (more or less).

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
 debugfs/debug_cmds.ct             |    3 
 debugfs/debugfs.c                 |   36 +++
 debugfs/debugfs.h                 |    1 
 tests/f_fallocate/expect          |  423 +++++++++++++++++++++++++++++++++++++
 tests/f_fallocate/name            |    1 
 tests/f_fallocate/script          |  175 +++++++++++++++
 tests/f_fallocate_bigalloc/expect |  364 ++++++++++++++++++++++++++++++++
 tests/f_fallocate_bigalloc/name   |    1 
 tests/f_fallocate_bigalloc/script |  176 +++++++++++++++
 tests/f_fallocate_blkmap/expect   |   58 +++++
 tests/f_fallocate_blkmap/name     |    1 
 tests/f_fallocate_blkmap/script   |   85 +++++++
 12 files changed, 1324 insertions(+)
 create mode 100644 tests/f_fallocate/expect
 create mode 100644 tests/f_fallocate/name
 create mode 100644 tests/f_fallocate/script
 create mode 100644 tests/f_fallocate_bigalloc/expect
 create mode 100644 tests/f_fallocate_bigalloc/name
 create mode 100644 tests/f_fallocate_bigalloc/script
 create mode 100644 tests/f_fallocate_blkmap/expect
 create mode 100644 tests/f_fallocate_blkmap/name
 create mode 100644 tests/f_fallocate_blkmap/script
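
The debugfs command takes an inclusive block range, while ext2fs_fallocate()
takes a start and a length.  The conversion in do_fallocate() can be sketched
as below; falloc_len() is a hypothetical helper written for this note, and
uint64_t stands in for blk64_t.

```c
#include <stdint.h>

/*
 * Convert the inclusive range [start, end] into a block count the way
 * do_fallocate() does.  When no end block is given, end defaults to
 * all ones, matching "end = ~0" for a 64-bit block number.
 */
static uint64_t falloc_len(uint64_t start, int have_end, uint64_t end)
{
	if (!have_end)
		end = ~0ULL;
	/* Unsigned arithmetic: this wraps modulo 2^64. */
	return end - start + 1;
}
```

With both arguments given, "fallocate /a 10 39" works out to 30 blocks.  When
end_blk is omitted, the arithmetic appears to wrap: a start of 0 gives a
length of 0, which ext2fs_fallocate() in patch 25 treats as nothing to do, and
any other start gives a length larger than the filesystem, which patch 25
rejects with EXT2_ET_BLOCK_ALLOC_FAIL.  So it is probably safest to pass
end_blk explicitly.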


diff --git a/debugfs/debug_cmds.ct b/debugfs/debug_cmds.ct
index c6f6d6c..34dad9e 100644
--- a/debugfs/debug_cmds.ct
+++ b/debugfs/debug_cmds.ct
@@ -157,6 +157,9 @@ request do_dirsearch, "Search a directory for a particular filename",
 request do_bmap, "Calculate the logical->physical block mapping for an inode",
 	bmap;
 
+request do_fallocate, "Allocate uninitialized blocks to an inode",
+	fallocate;
+
 request do_punch, "Punch (or truncate) blocks from an inode by deallocating them",
 	punch, truncate;
 
diff --git a/debugfs/debugfs.c b/debugfs/debugfs.c
index db85028..b30a5ab 100644
--- a/debugfs/debugfs.c
+++ b/debugfs/debugfs.c
@@ -2073,6 +2073,42 @@ void do_punch(int argc, char *argv[])
 		return;
 	}
 }
+
+void do_fallocate(int argc, char *argv[])
+{
+	ext2_ino_t	ino;
+	blk64_t		start, end;
+	int		err;
+	errcode_t	errcode;
+
+	if (common_args_process(argc, argv, 3, 4, argv[0],
+				"<file> start_blk [end_blk]",
+				CHECK_FS_RW | CHECK_FS_BITMAPS))
+		return;
+
+	ino = string_to_inode(argv[1]);
+	if (!ino)
+		return;
+	err = strtoblk(argv[0], argv[2], "logical block", &start);
+	if (err)
+		return;
+	if (argc == 4) {
+		err = strtoblk(argv[0], argv[3], "logical block", &end);
+		if (err)
+			return;
+	} else
+		end = ~0;
+
+	errcode = ext2fs_fallocate(current_fs, EXT2_FALLOCATE_INIT_BEYOND_EOF,
+				   ino, NULL, ~0ULL, start, end - start + 1);
+
+	if (errcode) {
+		com_err(argv[0], errcode,
+			"while fallocating inode %u from %llu to %llu\n", ino,
+			(unsigned long long) start, (unsigned long long) end);
+		return;
+	}
+}
 #endif /* READ_ONLY */
 
 void do_symlink(int argc, char *argv[])
diff --git a/debugfs/debugfs.h b/debugfs/debugfs.h
index e163d0a..76bb22c 100644
--- a/debugfs/debugfs.h
+++ b/debugfs/debugfs.h
@@ -166,6 +166,7 @@ extern void do_imap(int argc, char **argv);
 extern void do_set_current_time(int argc, char **argv);
 extern void do_supported_features(int argc, char **argv);
 extern void do_punch(int argc, char **argv);
+extern void do_fallocate(int argc, char **argv);
 extern void do_symlink(int argc, char **argv);
 
 extern void do_dump_mmp(int argc, char **argv);
diff --git a/tests/f_fallocate/expect b/tests/f_fallocate/expect
new file mode 100644
index 0000000..e5aeb49
--- /dev/null
+++ b/tests/f_fallocate/expect
@@ -0,0 +1,423 @@
+Creating filesystem with 65536 1k blocks and 4096 inodes
+Superblock backups stored on blocks: 
+	8193, 24577, 40961, 57345
+
+Allocating group tables:    \b\b\bdone                            
+Writing inode tables:    \b\b\bdone                            
+Writing superblocks and filesystem accounting information:    \b\b\bdone
+
+Pass 1: Checking inodes, blocks, and sizes
+Pass 2: Checking directory structure
+Pass 3: Checking directory connectivity
+Pass 4: Checking reference counts
+Pass 5: Checking group summary information
+test_filesys: 11/4096 files (0.0% non-contiguous), 2345/65536 blocks
+Exit status is 0
+debugfs write files
+debugfs: ex /a
+Level Entries       Logical      Physical Length Flags
+ 0/ 0   1/  1     0 -    39  1313 -  1352     40 Uninit
+debugfs: ex /sample
+Level Entries       Logical      Physical Length Flags
+ 0/ 0   1/  4    10 -    10  5010 -  5010      1 Uninit
+ 0/ 0   2/  4    13 -    13  5013 -  5013      1 Uninit
+ 0/ 0   3/  4    26 -    26  5026 -  5026      1 Uninit
+ 0/ 0   4/  4    29 -    29  5029 -  5029      1 Uninit
+debugfs: ex /b8
+Level Entries       Logical      Physical Length Flags
+ 0/ 0   1/  4     8 -    12 10008 - 10012      5 Uninit
+ 0/ 0   2/  4    13 -    25 10013 - 10025     13 Uninit
+ 0/ 0   3/  4    26 -    28 10026 - 10028      3 Uninit
+ 0/ 0   4/  4    29 -    39 10029 - 10039     11 Uninit
+debugfs: ex /b9
+Level Entries       Logical      Physical Length Flags
+ 0/ 0   1/  4     9 -    12 10049 - 10052      4 Uninit
+ 0/ 0   2/  4    13 -    25 10053 - 10065     13 Uninit
+ 0/ 0   3/  4    26 -    28 10066 - 10068      3 Uninit
+ 0/ 0   4/  4    29 -    39 10069 - 10079     11 Uninit
+debugfs: ex /b10
+Level Entries       Logical      Physical Length Flags
+ 0/ 0   1/  4    10 -    12 10090 - 10092      3 Uninit
+ 0/ 0   2/  4    13 -    25 10093 - 10105     13 Uninit
+ 0/ 0   3/  4    26 -    28 10106 - 10108      3 Uninit
+ 0/ 0   4/  4    29 -    39 10109 - 10119     11 Uninit
+debugfs: ex /b11
+Level Entries       Logical      Physical Length Flags
+ 0/ 0   1/  4    10 -    12 10130 - 10132      3 Uninit
+ 0/ 0   2/  4    13 -    25 10133 - 10145     13 Uninit
+ 0/ 0   3/  4    26 -    28 10146 - 10148      3 Uninit
+ 0/ 0   4/  4    29 -    39 10149 - 10159     11 Uninit
+debugfs: ex /b12
+Level Entries       Logical      Physical Length Flags
+ 0/ 0   1/  4    10 -    10 10170 - 10170      1 Uninit
+ 0/ 0   2/  4    12 -    25 10172 - 10185     14 Uninit
+ 0/ 0   3/  4    26 -    28 10186 - 10188      3 Uninit
+ 0/ 0   4/  4    29 -    39 10189 - 10199     11 Uninit
+debugfs: ex /b13
+Level Entries       Logical      Physical Length Flags
+ 0/ 0   1/  4    10 -    10 10210 - 10210      1 Uninit
+ 0/ 0   2/  4    13 -    25 10213 - 10225     13 Uninit
+ 0/ 0   3/  4    26 -    28 10226 - 10228      3 Uninit
+ 0/ 0   4/  4    29 -    39 10229 - 10239     11 Uninit
+debugfs: ex /b14
+Level Entries       Logical      Physical Length Flags
+ 0/ 0   1/  4    10 -    10 10250 - 10250      1 Uninit
+ 0/ 0   2/  4    13 -    25 10253 - 10265     13 Uninit
+ 0/ 0   3/  4    26 -    28 10266 - 10268      3 Uninit
+ 0/ 0   4/  4    29 -    39 10269 - 10279     11 Uninit
+debugfs: ex /b15
+Level Entries       Logical      Physical Length Flags
+ 0/ 0   1/  4    10 -    10 10290 - 10290      1 Uninit
+ 0/ 0   2/  4    13 -    13 10293 - 10293      1 Uninit
+ 0/ 0   3/  4    15 -    28 10295 - 10308     14 Uninit
+ 0/ 0   4/  4    29 -    39 10309 - 10319     11 Uninit
+debugfs: ex /c24
+Level Entries       Logical      Physical Length Flags
+ 0/ 0   1/  4     0 -    12 10320 - 10332     13 Uninit
+ 0/ 0   2/  4    13 -    24 10333 - 10344     12 Uninit
+ 0/ 0   3/  4    26 -    26 10346 - 10346      1 Uninit
+ 0/ 0   4/  4    29 -    29 10349 - 10349      1 Uninit
+debugfs: ex /c25
+Level Entries       Logical      Physical Length Flags
+ 0/ 0   1/  4     0 -    12 10360 - 10372     13 Uninit
+ 0/ 0   2/  4    13 -    25 10373 - 10385     13 Uninit
+ 0/ 0   3/  4    26 -    26 10386 - 10386      1 Uninit
+ 0/ 0   4/  4    29 -    29 10389 - 10389      1 Uninit
+debugfs: ex /c26
+Level Entries       Logical      Physical Length Flags
+ 0/ 0   1/  4     0 -    12 10400 - 10412     13 Uninit
+ 0/ 0   2/  4    13 -    25 10413 - 10425     13 Uninit
+ 0/ 0   3/  4    26 -    26 10426 - 10426      1 Uninit
+ 0/ 0   4/  4    29 -    29 10429 - 10429      1 Uninit
+debugfs: ex /c27
+Level Entries       Logical      Physical Length Flags
+ 0/ 0   1/  4     0 -    12 10440 - 10452     13 Uninit
+ 0/ 0   2/  4    13 -    25 10453 - 10465     13 Uninit
+ 0/ 0   3/  4    26 -    27 10466 - 10467      2 Uninit
+ 0/ 0   4/  4    29 -    29 10469 - 10469      1 Uninit
+debugfs: ex /c28
+Level Entries       Logical      Physical Length Flags
+ 0/ 0   1/  4     0 -    12 10480 - 10492     13 Uninit
+ 0/ 0   2/  4    13 -    25 10493 - 10505     13 Uninit
+ 0/ 0   3/  4    26 -    28 10506 - 10508      3 Uninit
+ 0/ 0   4/  4    29 -    29 10509 - 10509      1 Uninit
+debugfs: ex /c29
+Level Entries       Logical      Physical Length Flags
+ 0/ 0   1/  4     0 -    12 10520 - 10532     13 Uninit
+ 0/ 0   2/  4    13 -    25 10533 - 10545     13 Uninit
+ 0/ 0   3/  4    26 -    28 10546 - 10548      3 Uninit
+ 0/ 0   4/  4    29 -    29 10549 - 10549      1 Uninit
+debugfs: ex /c30
+Level Entries       Logical      Physical Length Flags
+ 0/ 0   1/  4     0 -    12 10560 - 10572     13 Uninit
+ 0/ 0   2/  4    13 -    25 10573 - 10585     13 Uninit
+ 0/ 0   3/  4    26 -    28 10586 - 10588      3 Uninit
+ 0/ 0   4/  4    29 -    30 10589 - 10590      2 Uninit
+debugfs: ex /c31
+Level Entries       Logical      Physical Length Flags
+ 0/ 0   1/  4     0 -    12 10600 - 10612     13 Uninit
+ 0/ 0   2/  4    13 -    25 10613 - 10625     13 Uninit
+ 0/ 0   3/  4    26 -    28 10626 - 10628      3 Uninit
+ 0/ 0   4/  4    29 -    31 10629 - 10631      3 Uninit
+debugfs: ex /d
+Level Entries       Logical      Physical Length Flags
+ 0/ 0   1/  4     4 -    12 10644 - 10652      9 Uninit
+ 0/ 0   2/  4    13 -    25 10653 - 10665     13 Uninit
+ 0/ 0   3/  4    26 -    28 10666 - 10668      3 Uninit
+ 0/ 0   4/  4    29 -    35 10669 - 10675      7 Uninit
+debugfs: ex /e
+Level Entries       Logical      Physical Length Flags
+ 0/ 1   1/  1    10 -    39  1353             30
+ 1/ 1   1/  5    10 -    10 10690 - 10690      1 Uninit
+ 1/ 1   2/  5    13 -    13 10693 - 10693      1 Uninit
+ 1/ 1   3/  5    19 -    20 10699 - 10700      2 Uninit
+ 1/ 1   4/  5    26 -    26 10706 - 10706      1 Uninit
+ 1/ 1   5/  5    29 -    29 10709 - 10709      1 Uninit
+debugfs: ex /f
+Level Entries       Logical      Physical Length Flags
+ 0/ 1   1/  1     0 -     0  1354              1
+ 1/ 1   1/ 33     0 -     0  9000 -  9000      1 Uninit
+ 1/ 1   2/ 33     1 -  1007  9001 - 10007   1007 Uninit
+ 1/ 1   3/ 33  1008 -  1016 10040 - 10048      9 Uninit
+ 1/ 1   4/ 33  1017 -  1026 10080 - 10089     10 Uninit
+ 1/ 1   5/ 33  1027 -  1036 10120 - 10129     10 Uninit
+ 1/ 1   6/ 33  1037 -  1046 10160 - 10169     10 Uninit
+ 1/ 1   7/ 33  1047 -  1047 10171 - 10171      1 Uninit
+ 1/ 1   8/ 33  1048 -  1057 10200 - 10209     10 Uninit
+ 1/ 1   9/ 33  1058 -  1059 10211 - 10212      2 Uninit
+ 1/ 1  10/ 33  1060 -  1069 10240 - 10249     10 Uninit
+ 1/ 1  11/ 33  1070 -  1071 10251 - 10252      2 Uninit
+ 1/ 1  12/ 33  1072 -  1081 10280 - 10289     10 Uninit
+ 1/ 1  13/ 33  1082 -  1083 10291 - 10292      2 Uninit
+ 1/ 1  14/ 33  1084 -  1084 10294 - 10294      1 Uninit
+ 1/ 1  15/ 33  1085 -  1085 10345 - 10345      1 Uninit
+ 1/ 1  16/ 33  1086 -  1087 10347 - 10348      2 Uninit
+ 1/ 1  17/ 33  1088 -  1097 10350 - 10359     10 Uninit
+ 1/ 1  18/ 33  1098 -  1099 10387 - 10388      2 Uninit
+ 1/ 1  19/ 33  1100 -  1109 10390 - 10399     10 Uninit
+ 1/ 1  20/ 33  1110 -  1111 10427 - 10428      2 Uninit
+ 1/ 1  21/ 33  1112 -  1121 10430 - 10439     10 Uninit
+ 1/ 1  22/ 33  1122 -  1122 10468 - 10468      1 Uninit
+ 1/ 1  23/ 33  1123 -  1132 10470 - 10479     10 Uninit
+ 1/ 1  24/ 33  1133 -  1142 10510 - 10519     10 Uninit
+ 1/ 1  25/ 33  1143 -  1152 10550 - 10559     10 Uninit
+ 1/ 1  26/ 33  1153 -  1161 10591 - 10599      9 Uninit
+ 1/ 1  27/ 33  1162 -  1173 10632 - 10643     12 Uninit
+ 1/ 1  28/ 33  1174 -  1187 10676 - 10689     14 Uninit
+ 1/ 1  29/ 33  1188 -  1189 10691 - 10692      2 Uninit
+ 1/ 1  30/ 33  1190 -  1194 10694 - 10698      5 Uninit
+ 1/ 1  31/ 33  1195 -  1199 10701 - 10705      5 Uninit
+ 1/ 1  32/ 33  1200 -  1201 10707 - 10708      2 Uninit
+ 1/ 1  33/ 33  1202 -  8999 10710 - 18507   7798 Uninit
+debugfs: ex /g8
+Level Entries       Logical      Physical Length Flags
+ 0/ 1   1/  1     8 -    39  1355             32
+ 1/ 1   1/  9     8 -     9 20008 - 20009      2 Uninit
+ 1/ 1   2/  9    10 -    10 20010 - 20010      1 
+ 1/ 1   3/  9    11 -    12 20011 - 20012      2 Uninit
+ 1/ 1   4/  9    13 -    13 20013 - 20013      1 
+ 1/ 1   5/  9    14 -    25 20014 - 20025     12 Uninit
+ 1/ 1   6/  9    26 -    26 20026 - 20026      1 
+ 1/ 1   7/  9    27 -    28 20027 - 20028      2 Uninit
+ 1/ 1   8/  9    29 -    29 20029 - 20029      1 
+ 1/ 1   9/  9    30 -    39 20030 - 20039     10 Uninit
+debugfs: ex /g9
+Level Entries       Logical      Physical Length Flags
+ 0/ 1   1/  1     9 -    39  1356             31
+ 1/ 1   1/  9     9 -     9 20049 - 20049      1 Uninit
+ 1/ 1   2/  9    10 -    10 20050 - 20050      1 
+ 1/ 1   3/  9    11 -    12 20051 - 20052      2 Uninit
+ 1/ 1   4/  9    13 -    13 20053 - 20053      1 
+ 1/ 1   5/  9    14 -    25 20054 - 20065     12 Uninit
+ 1/ 1   6/  9    26 -    26 20066 - 20066      1 
+ 1/ 1   7/  9    27 -    28 20067 - 20068      2 Uninit
+ 1/ 1   8/  9    29 -    29 20069 - 20069      1 
+ 1/ 1   9/  9    30 -    39 20070 - 20079     10 Uninit
+debugfs: ex /g10
+Level Entries       Logical      Physical Length Flags
+ 0/ 1   1/  1    10 -    39  1357             30
+ 1/ 1   1/  8    10 -    10 20090 - 20090      1 
+ 1/ 1   2/  8    11 -    12 20091 - 20092      2 Uninit
+ 1/ 1   3/  8    13 -    13 20093 - 20093      1 
+ 1/ 1   4/  8    14 -    25 20094 - 20105     12 Uninit
+ 1/ 1   5/  8    26 -    26 20106 - 20106      1 
+ 1/ 1   6/  8    27 -    28 20107 - 20108      2 Uninit
+ 1/ 1   7/  8    29 -    29 20109 - 20109      1 
+ 1/ 1   8/  8    30 -    39 20110 - 20119     10 Uninit
+debugfs: ex /g11
+Level Entries       Logical      Physical Length Flags
+ 0/ 1   1/  1    10 -    39  1358             30
+ 1/ 1   1/  8    10 -    10 20130 - 20130      1 
+ 1/ 1   2/  8    11 -    12 20131 - 20132      2 Uninit
+ 1/ 1   3/  8    13 -    13 20133 - 20133      1 
+ 1/ 1   4/  8    14 -    25 20134 - 20145     12 Uninit
+ 1/ 1   5/  8    26 -    26 20146 - 20146      1 
+ 1/ 1   6/  8    27 -    28 20147 - 20148      2 Uninit
+ 1/ 1   7/  8    29 -    29 20149 - 20149      1 
+ 1/ 1   8/  8    30 -    39 20150 - 20159     10 Uninit
+debugfs: ex /g12
+Level Entries       Logical      Physical Length Flags
+ 0/ 1   1/  1    10 -    39  1359             30
+ 1/ 1   1/  8    10 -    10 20170 - 20170      1 
+ 1/ 1   2/  8    12 -    12 20172 - 20172      1 Uninit
+ 1/ 1   3/  8    13 -    13 20173 - 20173      1 
+ 1/ 1   4/  8    14 -    25 20174 - 20185     12 Uninit
+ 1/ 1   5/  8    26 -    26 20186 - 20186      1 
+ 1/ 1   6/  8    27 -    28 20187 - 20188      2 Uninit
+ 1/ 1   7/  8    29 -    29 20189 - 20189      1 
+ 1/ 1   8/  8    30 -    39 20190 - 20199     10 Uninit
+debugfs: ex /g13
+Level Entries       Logical      Physical Length Flags
+ 0/ 1   1/  1    10 -    39  1360             30
+ 1/ 1   1/  7    10 -    10 20210 - 20210      1 
+ 1/ 1   2/  7    13 -    13 20213 - 20213      1 
+ 1/ 1   3/  7    14 -    25 20214 - 20225     12 Uninit
+ 1/ 1   4/  7    26 -    26 20226 - 20226      1 
+ 1/ 1   5/  7    27 -    28 20227 - 20228      2 Uninit
+ 1/ 1   6/  7    29 -    29 20229 - 20229      1 
+ 1/ 1   7/  7    30 -    39 20230 - 20239     10 Uninit
+debugfs: ex /g14
+Level Entries       Logical      Physical Length Flags
+ 0/ 1   1/  1    10 -    39  1361             30
+ 1/ 1   1/  7    10 -    10 20250 - 20250      1 
+ 1/ 1   2/  7    13 -    13 20253 - 20253      1 
+ 1/ 1   3/  7    14 -    25 20254 - 20265     12 Uninit
+ 1/ 1   4/  7    26 -    26 20266 - 20266      1 
+ 1/ 1   5/  7    27 -    28 20267 - 20268      2 Uninit
+ 1/ 1   6/  7    29 -    29 20269 - 20269      1 
+ 1/ 1   7/  7    30 -    39 20270 - 20279     10 Uninit
+debugfs: ex /g15
+Level Entries       Logical      Physical Length Flags
+ 0/ 1   1/  1    10 -    39  1362             30
+ 1/ 1   1/  7    10 -    10 20290 - 20290      1 
+ 1/ 1   2/  7    13 -    13 20293 - 20293      1 
+ 1/ 1   3/  7    15 -    25 20295 - 20305     11 Uninit
+ 1/ 1   4/  7    26 -    26 20306 - 20306      1 
+ 1/ 1   5/  7    27 -    28 20307 - 20308      2 Uninit
+ 1/ 1   6/  7    29 -    29 20309 - 20309      1 
+ 1/ 1   7/  7    30 -    39 20310 - 20319     10 Uninit
+debugfs: ex /h24
+Level Entries       Logical      Physical Length Flags
+ 0/ 1   1/  1     0 -    39  1363             40
+ 1/ 1   1/  7     0 -     9 20320 - 20329     10 Uninit
+ 1/ 1   2/  7    10 -    10 20330 - 20330      1 
+ 1/ 1   3/  7    11 -    12 20331 - 20332      2 Uninit
+ 1/ 1   4/  7    13 -    13 20333 - 20333      1 
+ 1/ 1   5/  7    14 -    24 20334 - 20344     11 Uninit
+ 1/ 1   6/  7    26 -    26 20346 - 20346      1 
+ 1/ 1   7/  7    29 -    29 20349 - 20349      1 
+debugfs: ex /h25
+Level Entries       Logical      Physical Length Flags
+ 0/ 1   1/  1     0 -    39  1364             40
+ 1/ 1   1/  7     0 -     9 20360 - 20369     10 Uninit
+ 1/ 1   2/  7    10 -    10 20370 - 20370      1 
+ 1/ 1   3/  7    11 -    12 20371 - 20372      2 Uninit
+ 1/ 1   4/  7    13 -    13 20373 - 20373      1 
+ 1/ 1   5/  7    14 -    25 20374 - 20385     12 Uninit
+ 1/ 1   6/  7    26 -    26 20386 - 20386      1 
+ 1/ 1   7/  7    29 -    29 20389 - 20389      1 
+debugfs: ex /h26
+Level Entries       Logical      Physical Length Flags
+ 0/ 1   1/  1     0 -    39  1365             40
+ 1/ 1   1/  7     0 -     9 20400 - 20409     10 Uninit
+ 1/ 1   2/  7    10 -    10 20410 - 20410      1 
+ 1/ 1   3/  7    11 -    12 20411 - 20412      2 Uninit
+ 1/ 1   4/  7    13 -    13 20413 - 20413      1 
+ 1/ 1   5/  7    14 -    25 20414 - 20425     12 Uninit
+ 1/ 1   6/  7    26 -    26 20426 - 20426      1 
+ 1/ 1   7/  7    29 -    29 20429 - 20429      1 
+debugfs: ex /h27
+Level Entries       Logical      Physical Length Flags
+ 0/ 1   1/  1     0 -    39  1366             40
+ 1/ 1   1/  8     0 -     9 20440 - 20449     10 Uninit
+ 1/ 1   2/  8    10 -    10 20450 - 20450      1 
+ 1/ 1   3/  8    11 -    12 20451 - 20452      2 Uninit
+ 1/ 1   4/  8    13 -    13 20453 - 20453      1 
+ 1/ 1   5/  8    14 -    25 20454 - 20465     12 Uninit
+ 1/ 1   6/  8    26 -    26 20466 - 20466      1 
+ 1/ 1   7/  8    27 -    27 20467 - 20467      1 Uninit
+ 1/ 1   8/  8    29 -    29 20469 - 20469      1 
+debugfs: ex /h28
+Level Entries       Logical      Physical Length Flags
+ 0/ 1   1/  1     0 -    39  1367             40
+ 1/ 1   1/  8     0 -     9 20480 - 20489     10 Uninit
+ 1/ 1   2/  8    10 -    10 20490 - 20490      1 
+ 1/ 1   3/  8    11 -    12 20491 - 20492      2 Uninit
+ 1/ 1   4/  8    13 -    13 20493 - 20493      1 
+ 1/ 1   5/  8    14 -    25 20494 - 20505     12 Uninit
+ 1/ 1   6/  8    26 -    26 20506 - 20506      1 
+ 1/ 1   7/  8    27 -    28 20507 - 20508      2 Uninit
+ 1/ 1   8/  8    29 -    29 20509 - 20509      1 
+debugfs: ex /h29
+Level Entries       Logical      Physical Length Flags
+ 0/ 1   1/  1     0 -    39  1368             40
+ 1/ 1   1/  8     0 -     9 20520 - 20529     10 Uninit
+ 1/ 1   2/  8    10 -    10 20530 - 20530      1 
+ 1/ 1   3/  8    11 -    12 20531 - 20532      2 Uninit
+ 1/ 1   4/  8    13 -    13 20533 - 20533      1 
+ 1/ 1   5/  8    14 -    25 20534 - 20545     12 Uninit
+ 1/ 1   6/  8    26 -    26 20546 - 20546      1 
+ 1/ 1   7/  8    27 -    28 20547 - 20548      2 Uninit
+ 1/ 1   8/  8    29 -    29 20549 - 20549      1 
+debugfs: ex /h30
+Level Entries       Logical      Physical Length Flags
+ 0/ 1   1/  1     0 -    39  1369             40
+ 1/ 1   1/  9     0 -     9 20560 - 20569     10 Uninit
+ 1/ 1   2/  9    10 -    10 20570 - 20570      1 
+ 1/ 1   3/  9    11 -    12 20571 - 20572      2 Uninit
+ 1/ 1   4/  9    13 -    13 20573 - 20573      1 
+ 1/ 1   5/  9    14 -    25 20574 - 20585     12 Uninit
+ 1/ 1   6/  9    26 -    26 20586 - 20586      1 
+ 1/ 1   7/  9    27 -    28 20587 - 20588      2 Uninit
+ 1/ 1   8/  9    29 -    29 20589 - 20589      1 
+ 1/ 1   9/  9    30 -    30 20590 - 20590      1 Uninit
+debugfs: ex /h31
+Level Entries       Logical      Physical Length Flags
+ 0/ 1   1/  1     0 -    39  1370             40
+ 1/ 1   1/  9     0 -     9 20600 - 20609     10 Uninit
+ 1/ 1   2/  9    10 -    10 20610 - 20610      1 
+ 1/ 1   3/  9    11 -    12 20611 - 20612      2 Uninit
+ 1/ 1   4/  9    13 -    13 20613 - 20613      1 
+ 1/ 1   5/  9    14 -    25 20614 - 20625     12 Uninit
+ 1/ 1   6/  9    26 -    26 20626 - 20626      1 
+ 1/ 1   7/  9    27 -    28 20627 - 20628      2 Uninit
+ 1/ 1   8/  9    29 -    29 20629 - 20629      1 
+ 1/ 1   9/  9    30 -    31 20630 - 20631      2 Uninit
+debugfs: ex /i
+Level Entries       Logical      Physical Length Flags
+ 0/ 1   1/  1     4 -    39  1371             36
+ 1/ 1   1/  9     4 -     9 20644 - 20649      6 Uninit
+ 1/ 1   2/  9    10 -    10 20650 - 20650      1 
+ 1/ 1   3/  9    11 -    12 20651 - 20652      2 Uninit
+ 1/ 1   4/  9    13 -    13 20653 - 20653      1 
+ 1/ 1   5/  9    14 -    25 20654 - 20665     12 Uninit
+ 1/ 1   6/  9    26 -    26 20666 - 20666      1 
+ 1/ 1   7/  9    27 -    28 20667 - 20668      2 Uninit
+ 1/ 1   8/  9    29 -    29 20669 - 20669      1 
+ 1/ 1   9/  9    30 -    35 20670 - 20675      6 Uninit
+debugfs: ex /j
+Level Entries       Logical      Physical Length Flags
+ 0/ 1   1/  1    10 -    39  1372             30
+ 1/ 1   1/  5    10 -    10 20690 - 20690      1 
+ 1/ 1   2/  5    13 -    13 20693 - 20693      1 
+ 1/ 1   3/  5    19 -    20 20699 - 20700      2 Uninit
+ 1/ 1   4/  5    26 -    26 20706 - 20706      1 
+ 1/ 1   5/  5    29 -    29 20709 - 20709      1 
+debugfs: ex /k
+Level Entries       Logical      Physical Length Flags
+ 0/ 1   1/  1     0 -  8999  1373           9000
+ 1/ 1   1/ 34     0 -     0 19000 - 19000      1 
+ 1/ 1   2/ 34     1 -  1007 19001 - 20007   1007 Uninit
+ 1/ 1   3/ 34  1008 -  1016 20040 - 20048      9 Uninit
+ 1/ 1   4/ 34  1017 -  1026 20080 - 20089     10 Uninit
+ 1/ 1   5/ 34  1027 -  1036 20120 - 20129     10 Uninit
+ 1/ 1   6/ 34  1037 -  1046 20160 - 20169     10 Uninit
+ 1/ 1   7/ 34  1047 -  1047 20171 - 20171      1 Uninit
+ 1/ 1   8/ 34  1048 -  1057 20200 - 20209     10 Uninit
+ 1/ 1   9/ 34  1058 -  1059 20211 - 20212      2 Uninit
+ 1/ 1  10/ 34  1060 -  1069 20240 - 20249     10 Uninit
+ 1/ 1  11/ 34  1070 -  1071 20251 - 20252      2 Uninit
+ 1/ 1  12/ 34  1072 -  1081 20280 - 20289     10 Uninit
+ 1/ 1  13/ 34  1082 -  1083 20291 - 20292      2 Uninit
+ 1/ 1  14/ 34  1084 -  1084 20294 - 20294      1 Uninit
+ 1/ 1  15/ 34  1085 -  1085 20345 - 20345      1 Uninit
+ 1/ 1  16/ 34  1086 -  1087 20347 - 20348      2 Uninit
+ 1/ 1  17/ 34  1088 -  1097 20350 - 20359     10 Uninit
+ 1/ 1  18/ 34  1098 -  1099 20387 - 20388      2 Uninit
+ 1/ 1  19/ 34  1100 -  1109 20390 - 20399     10 Uninit
+ 1/ 1  20/ 34  1110 -  1111 20427 - 20428      2 Uninit
+ 1/ 1  21/ 34  1112 -  1121 20430 - 20439     10 Uninit
+ 1/ 1  22/ 34  1122 -  1122 20468 - 20468      1 Uninit
+ 1/ 1  23/ 34  1123 -  1132 20470 - 20479     10 Uninit
+ 1/ 1  24/ 34  1133 -  1142 20510 - 20519     10 Uninit
+ 1/ 1  25/ 34  1143 -  1152 20550 - 20559     10 Uninit
+ 1/ 1  26/ 34  1153 -  1161 20591 - 20599      9 Uninit
+ 1/ 1  27/ 34  1162 -  1173 20632 - 20643     12 Uninit
+ 1/ 1  28/ 34  1174 -  1187 20676 - 20689     14 Uninit
+ 1/ 1  29/ 34  1188 -  1189 20691 - 20692      2 Uninit
+ 1/ 1  30/ 34  1190 -  1194 20694 - 20698      5 Uninit
+ 1/ 1  31/ 34  1195 -  1199 20701 - 20705      5 Uninit
+ 1/ 1  32/ 34  1200 -  1201 20707 - 20708      2 Uninit
+ 1/ 1  33/ 34  1202 -  5068 20710 - 24576   3867 Uninit
+ 1/ 1  34/ 34  5069 -  8999 24835 - 28765   3931 Uninit
+Pass 1: Checking inodes, blocks, and sizes
+Pass 2: Checking directory structure
+Pass 3: Checking directory connectivity
+Pass 4: Checking reference counts
+Pass 5: Checking group summary information
+Free blocks count wrong for group #0 (6819, counted=6815).
+Fix? yes
+
+Free blocks count wrong for group #1 (622, counted=549).
+Fix? yes
+
+Free blocks count wrong for group #2 (565, counted=492).
+Fix? yes
+
+Free blocks count wrong (44260, counted=44110).
+Fix? yes
+
+
+test_filesys: ***** FILE SYSTEM WAS MODIFIED *****
+test_filesys: 51/4096 files (37.3% non-contiguous), 21426/65536 blocks
+Exit status is 1
diff --git a/tests/f_fallocate/name b/tests/f_fallocate/name
new file mode 100644
index 0000000..72d0ed3
--- /dev/null
+++ b/tests/f_fallocate/name
@@ -0,0 +1 @@
+fallocate sparse files and big files
diff --git a/tests/f_fallocate/script b/tests/f_fallocate/script
new file mode 100644
index 0000000..7ea2b62
--- /dev/null
+++ b/tests/f_fallocate/script
@@ -0,0 +1,175 @@
+if test -x $DEBUGFS_EXE; then
+
+FSCK_OPT=-fy
+OUT=$test_name.log
+if [ -f $test_dir/expect.gz ]; then
+	EXP=$test_name.tmp
+	gunzip < $test_dir/expect.gz > $EXP
+else
+	EXP=$test_dir/expect
+fi
+
+cp /dev/null $OUT
+
+cat > $TMPFILE.conf << ENDL
+[fs_types]
+ext4 = {
+        base_features = sparse_super,filetype,resize_inode,dir_index,ext_attr,^has_journal,extent,huge_file,flex_bg,uninit_bg,dir_nlink,extra_isize,64bit
+        blocksize = 1024
+        inode_size = 256
+        inode_ratio = 16384
+}
+ENDL
+MKE2FS_CONFIG=$TMPFILE.conf $MKE2FS -F -o Linux -b 1024 -O ^bigalloc -T ext4 $TMPFILE 65536 2>&1 | sed -f $cmd_dir/filter.sed >> $OUT 2>&1
+rm -rf $TMPFILE.conf
+
+$FSCK -fy -N test_filesys $TMPFILE > $OUT.new 2>&1
+status=$?
+echo Exit status is $status >> $OUT.new
+sed -f $cmd_dir/filter.sed -e "s;$TMPFILE;test.img;" $OUT.new >> $OUT
+rm -f $OUT.new
+
+echo "debugfs write files" >> $OUT
+make_file() {
+	name="$1"
+	start="$2"
+	flag="$3"
+
+	cat << ENDL
+write /dev/null $name
+sif /$name size 40960
+eo /$name
+set_bmap $flag 10 $((start + 10))
+set_bmap $flag 13 $((start + 13))
+set_bmap $flag 26 $((start + 26))
+set_bmap $flag 29 $((start + 29))
+ec
+sif /$name blocks 8
+setb $((start + 10))
+setb $((start + 13))
+setb $((start + 26))
+setb $((start + 29))
+ENDL
+}
+
+#Files we create:
+# a: fallocate a 40k file
+# b*: falloc sparse file starting at b*
+# c*: falloc sparse file ending at c*
+# d: midcluster to midcluster, surrounding sparse
+# e: partial middle cluster alloc
+# f: one big file
+# g*: falloc sparse init file starting at g*
+# h*: falloc sparse init file ending at h*
+# i: midcluster to midcluster, surrounding sparse init
+# j: partial middle cluster alloc
+# k: one big init file
+base=5000
+cat > $TMPFILE.cmd << ENDL
+write /dev/null a
+sif /a size 40960
+fallocate /a 0 39
+ENDL
+echo "ex /a" >> $TMPFILE.cmd2
+
+make_file sample $base --uninit >> $TMPFILE.cmd
+echo "ex /sample" >> $TMPFILE.cmd2
+base=10000
+
+for i in 8 9 10 11 12 13 14 15; do
+	make_file b$i $(($base + (40 * ($i - 8)))) --uninit >> $TMPFILE.cmd
+	echo "fallocate /b$i $i 39" >> $TMPFILE.cmd
+	echo "ex /b$i" >> $TMPFILE.cmd2
+done
+
+for i in 24 25 26 27 28 29 30 31; do
+	make_file c$i $(($base + 320 + (40 * ($i - 24)))) --uninit >> $TMPFILE.cmd
+	echo "fallocate /c$i 0 $i" >> $TMPFILE.cmd
+	echo "ex /c$i" >> $TMPFILE.cmd2
+done
+
+make_file d $(($base + 640)) --uninit >> $TMPFILE.cmd
+echo "fallocate /d 4 35" >> $TMPFILE.cmd
+echo "ex /d" >> $TMPFILE.cmd2
+
+make_file e $(($base + 680)) --uninit >> $TMPFILE.cmd
+echo "fallocate /e 19 20" >> $TMPFILE.cmd
+echo "ex /e" >> $TMPFILE.cmd2
+
+cat >> $TMPFILE.cmd << ENDL
+write /dev/null f
+sif /f size 1024
+eo /f
+set_bmap --uninit 0 9000
+ec
+sif /f blocks 2
+setb 9000
+fallocate /f 0 8999
+ENDL
+echo "ex /f" >> $TMPFILE.cmd2
+
+# Now do it again, but with initialized blocks
+base=20000
+for i in 8 9 10 11 12 13 14 15; do
+	make_file g$i $(($base + (40 * ($i - 8)))) >> $TMPFILE.cmd
+	echo "fallocate /g$i $i 39" >> $TMPFILE.cmd
+	echo "ex /g$i" >> $TMPFILE.cmd2
+done
+
+for i in 24 25 26 27 28 29 30 31; do
+	make_file h$i $(($base + 320 + (40 * ($i - 24)))) >> $TMPFILE.cmd
+	echo "fallocate /h$i 0 $i" >> $TMPFILE.cmd
+	echo "ex /h$i" >> $TMPFILE.cmd2
+done
+
+make_file i $(($base + 640)) >> $TMPFILE.cmd
+echo "fallocate /i 4 35" >> $TMPFILE.cmd
+echo "ex /i" >> $TMPFILE.cmd2
+
+make_file j $(($base + 680)) >> $TMPFILE.cmd
+echo "fallocate /j 19 20" >> $TMPFILE.cmd
+echo "ex /j" >> $TMPFILE.cmd2
+
+cat >> $TMPFILE.cmd << ENDL
+write /dev/null k
+sif /k size 1024
+eo /k
+set_bmap 0 19000
+ec
+sif /k blocks 2
+setb 19000
+fallocate /k 0 8999
+sif /k size 9216000
+ENDL
+echo "ex /k" >> $TMPFILE.cmd2
+
+$DEBUGFS_EXE -w $TMPFILE -f $TMPFILE.cmd > /dev/null 2>&1
+$DEBUGFS_EXE $TMPFILE -f $TMPFILE.cmd2 >> $OUT.new 2>&1
+sed -f $cmd_dir/filter.sed < $OUT.new >> $OUT
+rm -rf $OUT.new $TMPFILE.cmd $TMPFILE.cmd2
+
+$FSCK -fy -N test_filesys $TMPFILE > $OUT.new 2>&1
+status=$?
+echo Exit status is $status >> $OUT.new
+sed -f $cmd_dir/filter.sed -e "s;$TMPFILE;test.img;" $OUT.new >> $OUT
+rm -f $OUT.new
+
+rm -f $TMPFILE
+
+cmp -s $OUT $EXP
+status=$?
+
+if [ "$status" = 0 ] ; then
+	echo "$test_name: $test_description: ok"
+	touch $test_name.ok
+else
+	echo "$test_name: $test_description: failed"
+	diff $DIFF_OPTS $EXP $OUT > $test_name.failed
+	rm -f $test_name.tmp
+fi
+
+unset IMAGE FSCK_OPT OUT EXP
+
+else #if test -x $DEBUGFS_EXE; then
+	echo "$test_name: $test_description: skipped"
+fi
diff --git a/tests/f_fallocate_bigalloc/expect b/tests/f_fallocate_bigalloc/expect
new file mode 100644
index 0000000..30d577a
--- /dev/null
+++ b/tests/f_fallocate_bigalloc/expect
@@ -0,0 +1,364 @@
+
+Warning: the bigalloc feature is still under development
+See https://ext4.wiki.kernel.org/index.php/Bigalloc for more information
+
+Creating filesystem with 65536 1k blocks and 4096 inodes
+
+Allocating group tables:    \b\b\bdone                            
+Writing inode tables:    \b\b\bdone                            
+Writing superblocks and filesystem accounting information:    \b\b\bdone
+
+Pass 1: Checking inodes, blocks, and sizes
+Pass 2: Checking directory structure
+Pass 3: Checking directory connectivity
+Pass 4: Checking reference counts
+Pass 5: Checking group summary information
+test_filesys: 11/4096 files (9.1% non-contiguous), 1144/65536 blocks
+Exit status is 0
+debugfs write files
+debugfs: ex /a
+Level Entries       Logical      Physical Length Flags
+ 0/ 0   1/  1     0 -    39  1144 -  1183     40 Uninit
+debugfs: ex /sample
+Level Entries       Logical      Physical Length Flags
+ 0/ 0   1/  4    10 -    10  5010 -  5010      1 Uninit
+ 0/ 0   2/  4    13 -    13  5013 -  5013      1 Uninit
+ 0/ 0   3/  4    26 -    26  5026 -  5026      1 Uninit
+ 0/ 0   4/  4    29 -    29  5029 -  5029      1 Uninit
+debugfs: ex /b8
+Level Entries       Logical      Physical Length Flags
+ 0/ 0   1/  4     8 -    12 10008 - 10012      5 Uninit
+ 0/ 0   2/  4    13 -    23 10013 - 10023     11 Uninit
+ 0/ 0   3/  4    24 -    28 10024 - 10028      5 Uninit
+ 0/ 0   4/  4    29 -    39 10029 - 10039     11 Uninit
+debugfs: ex /b9
+Level Entries       Logical      Physical Length Flags
+ 0/ 0   1/  4     9 -    12 10049 - 10052      4 Uninit
+ 0/ 0   2/  4    13 -    23 10053 - 10063     11 Uninit
+ 0/ 0   3/  4    24 -    28 10064 - 10068      5 Uninit
+ 0/ 0   4/  4    29 -    39 10069 - 10079     11 Uninit
+debugfs: ex /b10
+Level Entries       Logical      Physical Length Flags
+ 0/ 0   1/  4    10 -    12 10090 - 10092      3 Uninit
+ 0/ 0   2/  4    13 -    23 10093 - 10103     11 Uninit
+ 0/ 0   3/  4    24 -    28 10104 - 10108      5 Uninit
+ 0/ 0   4/  4    29 -    39 10109 - 10119     11 Uninit
+debugfs: ex /b11
+Level Entries       Logical      Physical Length Flags
+ 0/ 0   1/  4    10 -    12 10130 - 10132      3 Uninit
+ 0/ 0   2/  4    13 -    23 10133 - 10143     11 Uninit
+ 0/ 0   3/  4    24 -    28 10144 - 10148      5 Uninit
+ 0/ 0   4/  4    29 -    39 10149 - 10159     11 Uninit
+debugfs: ex /b12
+Level Entries       Logical      Physical Length Flags
+ 0/ 0   1/  4    10 -    10 10170 - 10170      1 Uninit
+ 0/ 0   2/  4    12 -    23 10172 - 10183     12 Uninit
+ 0/ 0   3/  4    24 -    28 10184 - 10188      5 Uninit
+ 0/ 0   4/  4    29 -    39 10189 - 10199     11 Uninit
+debugfs: ex /b13
+Level Entries       Logical      Physical Length Flags
+ 0/ 0   1/  4    10 -    10 10210 - 10210      1 Uninit
+ 0/ 0   2/  4    13 -    23 10213 - 10223     11 Uninit
+ 0/ 0   3/  4    24 -    28 10224 - 10228      5 Uninit
+ 0/ 0   4/  4    29 -    39 10229 - 10239     11 Uninit
+debugfs: ex /b14
+Level Entries       Logical      Physical Length Flags
+ 0/ 0   1/  4    10 -    10 10250 - 10250      1 Uninit
+ 0/ 0   2/  4    13 -    23 10253 - 10263     11 Uninit
+ 0/ 0   3/  4    24 -    28 10264 - 10268      5 Uninit
+ 0/ 0   4/  4    29 -    39 10269 - 10279     11 Uninit
+debugfs: ex /b15
+Level Entries       Logical      Physical Length Flags
+ 0/ 1   1/  1    10 -    39  1184             30
+ 1/ 1   1/  5    10 -    10 10290 - 10290      1 Uninit
+ 1/ 1   2/  5    13 -    13 10293 - 10293      1 Uninit
+ 1/ 1   3/  5    15 -    15 10295 - 10295      1 Uninit
+ 1/ 1   4/  5    16 -    28 10296 - 10308     13 Uninit
+ 1/ 1   5/  5    29 -    39 10309 - 10319     11 Uninit
+debugfs: ex /c24
+Level Entries       Logical      Physical Length Flags
+ 0/ 1   1/  1     0 -    39  1192             40
+ 1/ 1   1/  5     0 -    12 10320 - 10332     13 Uninit
+ 1/ 1   2/  5    13 -    23 10333 - 10343     11 Uninit
+ 1/ 1   3/  5    24 -    24 10344 - 10344      1 Uninit
+ 1/ 1   4/  5    26 -    26 10346 - 10346      1 Uninit
+ 1/ 1   5/  5    29 -    29 10349 - 10349      1 Uninit
+debugfs: ex /c25
+Level Entries       Logical      Physical Length Flags
+ 0/ 0   1/  4     0 -    12 10360 - 10372     13 Uninit
+ 0/ 0   2/  4    13 -    23 10373 - 10383     11 Uninit
+ 0/ 0   3/  4    24 -    26 10384 - 10386      3 Uninit
+ 0/ 0   4/  4    29 -    29 10389 - 10389      1 Uninit
+debugfs: ex /c26
+Level Entries       Logical      Physical Length Flags
+ 0/ 0   1/  4     0 -    12 10400 - 10412     13 Uninit
+ 0/ 0   2/  4    13 -    23 10413 - 10423     11 Uninit
+ 0/ 0   3/  4    24 -    26 10424 - 10426      3 Uninit
+ 0/ 0   4/  4    29 -    29 10429 - 10429      1 Uninit
+debugfs: ex /c27
+Level Entries       Logical      Physical Length Flags
+ 0/ 0   1/  4     0 -    12 10440 - 10452     13 Uninit
+ 0/ 0   2/  4    13 -    23 10453 - 10463     11 Uninit
+ 0/ 0   3/  4    24 -    27 10464 - 10467      4 Uninit
+ 0/ 0   4/  4    29 -    29 10469 - 10469      1 Uninit
+debugfs: ex /c28
+Level Entries       Logical      Physical Length Flags
+ 0/ 0   1/  4     0 -    12 10480 - 10492     13 Uninit
+ 0/ 0   2/  4    13 -    23 10493 - 10503     11 Uninit
+ 0/ 0   3/  4    24 -    28 10504 - 10508      5 Uninit
+ 0/ 0   4/  4    29 -    29 10509 - 10509      1 Uninit
+debugfs: ex /c29
+Level Entries       Logical      Physical Length Flags
+ 0/ 0   1/  4     0 -    12 10520 - 10532     13 Uninit
+ 0/ 0   2/  4    13 -    23 10533 - 10543     11 Uninit
+ 0/ 0   3/  4    24 -    28 10544 - 10548      5 Uninit
+ 0/ 0   4/  4    29 -    29 10549 - 10549      1 Uninit
+debugfs: ex /c30
+Level Entries       Logical      Physical Length Flags
+ 0/ 0   1/  4     0 -    12 10560 - 10572     13 Uninit
+ 0/ 0   2/  4    13 -    23 10573 - 10583     11 Uninit
+ 0/ 0   3/  4    24 -    28 10584 - 10588      5 Uninit
+ 0/ 0   4/  4    29 -    30 10589 - 10590      2 Uninit
+debugfs: ex /c31
+Level Entries       Logical      Physical Length Flags
+ 0/ 0   1/  4     0 -    12 10600 - 10612     13 Uninit
+ 0/ 0   2/  4    13 -    23 10613 - 10623     11 Uninit
+ 0/ 0   3/  4    24 -    28 10624 - 10628      5 Uninit
+ 0/ 0   4/  4    29 -    31 10629 - 10631      3 Uninit
+debugfs: ex /d
+Level Entries       Logical      Physical Length Flags
+ 0/ 0   1/  4     4 -    12 10644 - 10652      9 Uninit
+ 0/ 0   2/  4    13 -    23 10653 - 10663     11 Uninit
+ 0/ 0   3/  4    24 -    28 10664 - 10668      5 Uninit
+ 0/ 0   4/  4    29 -    35 10669 - 10675      7 Uninit
+debugfs: ex /e
+Level Entries       Logical      Physical Length Flags
+ 0/ 1   1/  1    10 -    39  1200             30
+ 1/ 1   1/  5    10 -    10 10690 - 10690      1 Uninit
+ 1/ 1   2/  5    13 -    13 10693 - 10693      1 Uninit
+ 1/ 1   3/  5    19 -    20 10699 - 10700      2 Uninit
+ 1/ 1   4/  5    26 -    26 10706 - 10706      1 Uninit
+ 1/ 1   5/  5    29 -    29 10709 - 10709      1 Uninit
+debugfs: ex /f
+Level Entries       Logical      Physical Length Flags
+ 0/ 1   1/  1     0 -     0  1208              1
+ 1/ 1   1/ 19     0 -     7  9000 -  9007      8 Uninit
+ 1/ 1   2/ 19     8 -  1007  9008 - 10007   1000 Uninit
+ 1/ 1   3/ 19  1008 -  1015 10040 - 10047      8 Uninit
+ 1/ 1   4/ 19  1016 -  1023 10080 - 10087      8 Uninit
+ 1/ 1   5/ 19  1024 -  1031 10120 - 10127      8 Uninit
+ 1/ 1   6/ 19  1032 -  1039 10160 - 10167      8 Uninit
+ 1/ 1   7/ 19  1040 -  1047 10200 - 10207      8 Uninit
+ 1/ 1   8/ 19  1048 -  1055 10240 - 10247      8 Uninit
+ 1/ 1   9/ 19  1056 -  1063 10280 - 10287      8 Uninit
+ 1/ 1  10/ 19  1064 -  1071 10352 - 10359      8 Uninit
+ 1/ 1  11/ 19  1072 -  1079 10392 - 10399      8 Uninit
+ 1/ 1  12/ 19  1080 -  1087 10432 - 10439      8 Uninit
+ 1/ 1  13/ 19  1088 -  1095 10472 - 10479      8 Uninit
+ 1/ 1  14/ 19  1096 -  1103 10512 - 10519      8 Uninit
+ 1/ 1  15/ 19  1104 -  1111 10552 - 10559      8 Uninit
+ 1/ 1  16/ 19  1112 -  1119 10592 - 10599      8 Uninit
+ 1/ 1  17/ 19  1120 -  1127 10632 - 10639      8 Uninit
+ 1/ 1  18/ 19  1128 -  1135 10680 - 10687      8 Uninit
+ 1/ 1  19/ 19  1136 -  8999 10712 - 18575   7864 Uninit
+debugfs: ex /g8
+Level Entries       Logical      Physical Length Flags
+ 0/ 1   1/  1     8 -    39  1216             32
+ 1/ 1   1/  6     8 -    12 20008 - 20012      5 
+ 1/ 1   2/  6    13 -    15 20013 - 20015      3 
+ 1/ 1   3/  6    16 -    23 20016 - 20023      8 Uninit
+ 1/ 1   4/  6    24 -    28 20024 - 20028      5 
+ 1/ 1   5/  6    29 -    31 20029 - 20031      3 
+ 1/ 1   6/  6    32 -    39 20032 - 20039      8 Uninit
+debugfs: ex /g9
+Level Entries       Logical      Physical Length Flags
+ 0/ 1   1/  1     9 -    39  1224             31
+ 1/ 1   1/  6     9 -    12 20049 - 20052      4 
+ 1/ 1   2/  6    13 -    15 20053 - 20055      3 
+ 1/ 1   3/  6    16 -    23 20056 - 20063      8 Uninit
+ 1/ 1   4/  6    24 -    28 20064 - 20068      5 
+ 1/ 1   5/  6    29 -    31 20069 - 20071      3 
+ 1/ 1   6/  6    32 -    39 20072 - 20079      8 Uninit
+debugfs: ex /g10
+Level Entries       Logical      Physical Length Flags
+ 0/ 1   1/  1    10 -    39  1232             30
+ 1/ 1   1/  6    10 -    12 20090 - 20092      3 
+ 1/ 1   2/  6    13 -    15 20093 - 20095      3 
+ 1/ 1   3/  6    16 -    23 20096 - 20103      8 Uninit
+ 1/ 1   4/  6    24 -    28 20104 - 20108      5 
+ 1/ 1   5/  6    29 -    31 20109 - 20111      3 
+ 1/ 1   6/  6    32 -    39 20112 - 20119      8 Uninit
+debugfs: ex /g11
+Level Entries       Logical      Physical Length Flags
+ 0/ 1   1/  1    10 -    39  1240             30
+ 1/ 1   1/  6    10 -    12 20130 - 20132      3 
+ 1/ 1   2/  6    13 -    15 20133 - 20135      3 
+ 1/ 1   3/  6    16 -    23 20136 - 20143      8 Uninit
+ 1/ 1   4/  6    24 -    28 20144 - 20148      5 
+ 1/ 1   5/  6    29 -    31 20149 - 20151      3 
+ 1/ 1   6/  6    32 -    39 20152 - 20159      8 Uninit
+debugfs: ex /g12
+Level Entries       Logical      Physical Length Flags
+ 0/ 1   1/  1    10 -    39  1248             30
+ 1/ 1   1/  6    10 -    10 20170 - 20170      1 
+ 1/ 1   2/  6    12 -    15 20172 - 20175      4 
+ 1/ 1   3/  6    16 -    23 20176 - 20183      8 Uninit
+ 1/ 1   4/  6    24 -    28 20184 - 20188      5 
+ 1/ 1   5/  6    29 -    31 20189 - 20191      3 
+ 1/ 1   6/  6    32 -    39 20192 - 20199      8 Uninit
+debugfs: ex /g13
+Level Entries       Logical      Physical Length Flags
+ 0/ 1   1/  1    10 -    39  1256             30
+ 1/ 1   1/  6    10 -    10 20210 - 20210      1 
+ 1/ 1   2/  6    13 -    15 20213 - 20215      3 
+ 1/ 1   3/  6    16 -    23 20216 - 20223      8 Uninit
+ 1/ 1   4/  6    24 -    28 20224 - 20228      5 
+ 1/ 1   5/  6    29 -    31 20229 - 20231      3 
+ 1/ 1   6/  6    32 -    39 20232 - 20239      8 Uninit
+debugfs: ex /g14
+Level Entries       Logical      Physical Length Flags
+ 0/ 1   1/  1    10 -    39  1264             30
+ 1/ 1   1/  6    10 -    10 20250 - 20250      1 
+ 1/ 1   2/  6    13 -    15 20253 - 20255      3 
+ 1/ 1   3/  6    16 -    23 20256 - 20263      8 Uninit
+ 1/ 1   4/  6    24 -    28 20264 - 20268      5 
+ 1/ 1   5/  6    29 -    31 20269 - 20271      3 
+ 1/ 1   6/  6    32 -    39 20272 - 20279      8 Uninit
+debugfs: ex /g15
+Level Entries       Logical      Physical Length Flags
+ 0/ 1   1/  1    10 -    39  1272             30
+ 1/ 1   1/  7    10 -    10 20290 - 20290      1 
+ 1/ 1   2/  7    13 -    13 20293 - 20293      1 
+ 1/ 1   3/  7    15 -    15 20295 - 20295      1 Uninit
+ 1/ 1   4/  7    16 -    23 20296 - 20303      8 Uninit
+ 1/ 1   5/  7    24 -    28 20304 - 20308      5 
+ 1/ 1   6/  7    29 -    31 20309 - 20311      3 
+ 1/ 1   7/  7    32 -    39 20312 - 20319      8 Uninit
+debugfs: ex /h24
+Level Entries       Logical      Physical Length Flags
+ 0/ 1   1/  1     0 -    39  1280             40
+ 1/ 1   1/  7     0 -     7 20320 - 20327      8 Uninit
+ 1/ 1   2/  7     8 -    12 20328 - 20332      5 
+ 1/ 1   3/  7    13 -    15 20333 - 20335      3 
+ 1/ 1   4/  7    16 -    23 20336 - 20343      8 Uninit
+ 1/ 1   5/  7    24 -    24 20344 - 20344      1 Uninit
+ 1/ 1   6/  7    26 -    26 20346 - 20346      1 
+ 1/ 1   7/  7    29 -    29 20349 - 20349      1 
+debugfs: ex /h25
+Level Entries       Logical      Physical Length Flags
+ 0/ 1   1/  1     0 -    39  1288             40
+ 1/ 1   1/  6     0 -     7 20360 - 20367      8 Uninit
+ 1/ 1   2/  6     8 -    12 20368 - 20372      5 
+ 1/ 1   3/  6    13 -    15 20373 - 20375      3 
+ 1/ 1   4/  6    16 -    23 20376 - 20383      8 Uninit
+ 1/ 1   5/  6    24 -    26 20384 - 20386      3 
+ 1/ 1   6/  6    29 -    29 20389 - 20389      1 
+debugfs: ex /h26
+Level Entries       Logical      Physical Length Flags
+ 0/ 1   1/  1     0 -    39  1296             40
+ 1/ 1   1/  6     0 -     7 20400 - 20407      8 Uninit
+ 1/ 1   2/  6     8 -    12 20408 - 20412      5 
+ 1/ 1   3/  6    13 -    15 20413 - 20415      3 
+ 1/ 1   4/  6    16 -    23 20416 - 20423      8 Uninit
+ 1/ 1   5/  6    24 -    26 20424 - 20426      3 
+ 1/ 1   6/  6    29 -    29 20429 - 20429      1 
+debugfs: ex /h27
+Level Entries       Logical      Physical Length Flags
+ 0/ 1   1/  1     0 -    39  1304             40
+ 1/ 1   1/  6     0 -     7 20440 - 20447      8 Uninit
+ 1/ 1   2/  6     8 -    12 20448 - 20452      5 
+ 1/ 1   3/  6    13 -    15 20453 - 20455      3 
+ 1/ 1   4/  6    16 -    23 20456 - 20463      8 Uninit
+ 1/ 1   5/  6    24 -    27 20464 - 20467      4 
+ 1/ 1   6/  6    29 -    29 20469 - 20469      1 
+debugfs: ex /h28
+Level Entries       Logical      Physical Length Flags
+ 0/ 1   1/  1     0 -    39  1312             40
+ 1/ 1   1/  6     0 -     7 20480 - 20487      8 Uninit
+ 1/ 1   2/  6     8 -    12 20488 - 20492      5 
+ 1/ 1   3/  6    13 -    15 20493 - 20495      3 
+ 1/ 1   4/  6    16 -    23 20496 - 20503      8 Uninit
+ 1/ 1   5/  6    24 -    28 20504 - 20508      5 
+ 1/ 1   6/  6    29 -    29 20509 - 20509      1 
+debugfs: ex /h29
+Level Entries       Logical      Physical Length Flags
+ 0/ 1   1/  1     0 -    39  1320             40
+ 1/ 1   1/  6     0 -     7 20520 - 20527      8 Uninit
+ 1/ 1   2/  6     8 -    12 20528 - 20532      5 
+ 1/ 1   3/  6    13 -    15 20533 - 20535      3 
+ 1/ 1   4/  6    16 -    23 20536 - 20543      8 Uninit
+ 1/ 1   5/  6    24 -    28 20544 - 20548      5 
+ 1/ 1   6/  6    29 -    29 20549 - 20549      1 
+debugfs: ex /h30
+Level Entries       Logical      Physical Length Flags
+ 0/ 1   1/  1     0 -    39  1328             40
+ 1/ 1   1/  6     0 -     7 20560 - 20567      8 Uninit
+ 1/ 1   2/  6     8 -    12 20568 - 20572      5 
+ 1/ 1   3/  6    13 -    15 20573 - 20575      3 
+ 1/ 1   4/  6    16 -    23 20576 - 20583      8 Uninit
+ 1/ 1   5/  6    24 -    28 20584 - 20588      5 
+ 1/ 1   6/  6    29 -    30 20589 - 20590      2 
+debugfs: ex /h31
+Level Entries       Logical      Physical Length Flags
+ 0/ 1   1/  1     0 -    39  1336             40
+ 1/ 1   1/  6     0 -     7 20600 - 20607      8 Uninit
+ 1/ 1   2/  6     8 -    12 20608 - 20612      5 
+ 1/ 1   3/  6    13 -    15 20613 - 20615      3 
+ 1/ 1   4/  6    16 -    23 20616 - 20623      8 Uninit
+ 1/ 1   5/  6    24 -    28 20624 - 20628      5 
+ 1/ 1   6/  6    29 -    31 20629 - 20631      3 
+debugfs: ex /i
+Level Entries       Logical      Physical Length Flags
+ 0/ 1   1/  1     4 -    39  1344             36
+ 1/ 1   1/  7     4 -     7 20644 - 20647      4 Uninit
+ 1/ 1   2/  7     8 -    12 20648 - 20652      5 
+ 1/ 1   3/  7    13 -    15 20653 - 20655      3 
+ 1/ 1   4/  7    16 -    23 20656 - 20663      8 Uninit
+ 1/ 1   5/  7    24 -    28 20664 - 20668      5 
+ 1/ 1   6/  7    29 -    31 20669 - 20671      3 
+ 1/ 1   7/  7    32 -    35 20672 - 20675      4 Uninit
+debugfs: ex /j
+Level Entries       Logical      Physical Length Flags
+ 0/ 1   1/  1    10 -    39  1352             30
+ 1/ 1   1/  5    10 -    10 20690 - 20690      1 
+ 1/ 1   2/  5    13 -    13 20693 - 20693      1 
+ 1/ 1   3/  5    19 -    20 20699 - 20700      2 Uninit
+ 1/ 1   4/  5    26 -    26 20706 - 20706      1 
+ 1/ 1   5/  5    29 -    29 20709 - 20709      1 
+debugfs: ex /k
+Level Entries       Logical      Physical Length Flags
+ 0/ 1   1/  1     0 -  8999  1360           9000
+ 1/ 1   1/ 19     0 -     7 19000 - 19007      8 
+ 1/ 1   2/ 19     8 -  1007 19008 - 20007   1000 Uninit
+ 1/ 1   3/ 19  1008 -  1015 20040 - 20047      8 Uninit
+ 1/ 1   4/ 19  1016 -  1023 20080 - 20087      8 Uninit
+ 1/ 1   5/ 19  1024 -  1031 20120 - 20127      8 Uninit
+ 1/ 1   6/ 19  1032 -  1039 20160 - 20167      8 Uninit
+ 1/ 1   7/ 19  1040 -  1047 20200 - 20207      8 Uninit
+ 1/ 1   8/ 19  1048 -  1055 20240 - 20247      8 Uninit
+ 1/ 1   9/ 19  1056 -  1063 20280 - 20287      8 Uninit
+ 1/ 1  10/ 19  1064 -  1071 20352 - 20359      8 Uninit
+ 1/ 1  11/ 19  1072 -  1079 20392 - 20399      8 Uninit
+ 1/ 1  12/ 19  1080 -  1087 20432 - 20439      8 Uninit
+ 1/ 1  13/ 19  1088 -  1095 20472 - 20479      8 Uninit
+ 1/ 1  14/ 19  1096 -  1103 20512 - 20519      8 Uninit
+ 1/ 1  15/ 19  1104 -  1111 20552 - 20559      8 Uninit
+ 1/ 1  16/ 19  1112 -  1119 20592 - 20599      8 Uninit
+ 1/ 1  17/ 19  1120 -  1127 20632 - 20639      8 Uninit
+ 1/ 1  18/ 19  1128 -  1135 20680 - 20687      8 Uninit
+ 1/ 1  19/ 19  1136 -  8999 20712 - 28575   7864 Uninit
+Pass 1: Checking inodes, blocks, and sizes
+Pass 2: Checking directory structure
+Pass 3: Checking directory connectivity
+Pass 4: Checking reference counts
+Pass 5: Checking group summary information
+Free blocks count wrong for group #0 (5701, counted=5625).
+Fix? yes
+
+Free blocks count wrong (45608, counted=45000).
+Fix? yes
+
+
+test_filesys: ***** FILE SYSTEM WAS MODIFIED *****
+test_filesys: 51/4096 files (43.1% non-contiguous), 20536/65536 blocks
+Exit status is 1
diff --git a/tests/f_fallocate_bigalloc/name b/tests/f_fallocate_bigalloc/name
new file mode 100644
index 0000000..915645c
--- /dev/null
+++ b/tests/f_fallocate_bigalloc/name
@@ -0,0 +1 @@
+fallocate sparse files and big files with bigalloc
diff --git a/tests/f_fallocate_bigalloc/script b/tests/f_fallocate_bigalloc/script
new file mode 100644
index 0000000..199ae43
--- /dev/null
+++ b/tests/f_fallocate_bigalloc/script
@@ -0,0 +1,176 @@
+if test -x $DEBUGFS_EXE; then
+
+FSCK_OPT=-fy
+OUT=$test_name.log
+if [ -f $test_dir/expect.gz ]; then
+	EXP=$test_name.tmp
+	gunzip < $test_dir/expect.gz > $EXP
+else
+	EXP=$test_dir/expect
+fi
+
+cp /dev/null $OUT
+
+cat > $TMPFILE.conf << ENDL
+[fs_types]
+ext4 = {
+	cluster_size = 8192
+        base_features = sparse_super,filetype,resize_inode,dir_index,ext_attr,^has_journal,extent,huge_file,flex_bg,uninit_bg,dir_nlink,extra_isize,64bit
+        blocksize = 1024
+        inode_size = 256
+        inode_ratio = 16384
+}
+ENDL
+MKE2FS_CONFIG=$TMPFILE.conf $MKE2FS -F -o Linux -b 1024 -O bigalloc -T ext4 $TMPFILE 65536 2>&1 | sed -f $cmd_dir/filter.sed >> $OUT 2>&1
+rm -rf $TMPFILE.conf
+
+$FSCK -fy -N test_filesys $TMPFILE > $OUT.new 2>&1
+status=$?
+echo Exit status is $status >> $OUT.new
+sed -f $cmd_dir/filter.sed -e "s;$TMPFILE;test.img;" $OUT.new >> $OUT
+rm -f $OUT.new
+
+echo "debugfs write files" >> $OUT
+make_file() {
+	name="$1"
+	start="$2"
+	flag="$3"
+
+	cat << ENDL
+write /dev/null $name
+sif /$name size 40960
+eo /$name
+set_bmap $flag 10 $((start + 10))
+set_bmap $flag 13 $((start + 13))
+set_bmap $flag 26 $((start + 26))
+set_bmap $flag 29 $((start + 29))
+ec
+sif /$name blocks 32
+setb $((start + 10))
+setb $((start + 13))
+setb $((start + 26))
+setb $((start + 29))
+ENDL
+}
+
+#Files we create:
+# a: fallocate a 40k file
+# b*: falloc sparse file starting at b*
+# c*: falloc sparse file ending at c*
+# d: midcluster to midcluster, surrounding sparse
+# e: partial middle cluster alloc
+# f: one big file
+# g*: falloc sparse init file starting at g*
+# h*: falloc sparse init file ending at h*
+# i: midcluster to midcluster, surrounding sparse init
+# j: partial middle cluster alloc
+# k: one big init file
+base=5000
+cat > $TMPFILE.cmd << ENDL
+write /dev/null a
+sif /a size 40960
+fallocate /a 0 39
+ENDL
+echo "ex /a" >> $TMPFILE.cmd2
+
+make_file sample $base --uninit >> $TMPFILE.cmd
+echo "ex /sample" >> $TMPFILE.cmd2
+base=10000
+
+for i in 8 9 10 11 12 13 14 15; do
+	make_file b$i $(($base + (40 * ($i - 8)))) --uninit >> $TMPFILE.cmd
+	echo "fallocate /b$i $i 39" >> $TMPFILE.cmd
+	echo "ex /b$i" >> $TMPFILE.cmd2
+done
+
+for i in 24 25 26 27 28 29 30 31; do
+	make_file c$i $(($base + 320 + (40 * ($i - 24)))) --uninit >> $TMPFILE.cmd
+	echo "fallocate /c$i 0 $i" >> $TMPFILE.cmd
+	echo "ex /c$i" >> $TMPFILE.cmd2
+done
+
+make_file d $(($base + 640)) --uninit >> $TMPFILE.cmd
+echo "fallocate /d 4 35" >> $TMPFILE.cmd
+echo "ex /d" >> $TMPFILE.cmd2
+
+make_file e $(($base + 680)) --uninit >> $TMPFILE.cmd
+echo "fallocate /e 19 20" >> $TMPFILE.cmd
+echo "ex /e" >> $TMPFILE.cmd2
+
+cat >> $TMPFILE.cmd << ENDL
+write /dev/null f
+sif /f size 1024
+eo /f
+set_bmap --uninit 0 9000
+ec
+sif /f blocks 16
+setb 9000
+fallocate /f 0 8999
+ENDL
+echo "ex /f" >> $TMPFILE.cmd2
+
+# Now do it again, but with initialized blocks
+base=20000
+for i in 8 9 10 11 12 13 14 15; do
+	make_file g$i $(($base + (40 * ($i - 8)))) >> $TMPFILE.cmd
+	echo "fallocate /g$i $i 39" >> $TMPFILE.cmd
+	echo "ex /g$i" >> $TMPFILE.cmd2
+done
+
+for i in 24 25 26 27 28 29 30 31; do
+	make_file h$i $(($base + 320 + (40 * ($i - 24)))) >> $TMPFILE.cmd
+	echo "fallocate /h$i 0 $i" >> $TMPFILE.cmd
+	echo "ex /h$i" >> $TMPFILE.cmd2
+done
+
+make_file i $(($base + 640)) >> $TMPFILE.cmd
+echo "fallocate /i 4 35" >> $TMPFILE.cmd
+echo "ex /i" >> $TMPFILE.cmd2
+
+make_file j $(($base + 680)) >> $TMPFILE.cmd
+echo "fallocate /j 19 20" >> $TMPFILE.cmd
+echo "ex /j" >> $TMPFILE.cmd2
+
+cat >> $TMPFILE.cmd << ENDL
+write /dev/null k
+sif /k size 1024
+eo /k
+set_bmap 0 19000
+ec
+sif /k blocks 16
+setb 19000
+fallocate /k 0 8999
+sif /k size 9216000
+ENDL
+echo "ex /k" >> $TMPFILE.cmd2
+
+$DEBUGFS_EXE -w $TMPFILE -f $TMPFILE.cmd > /dev/null 2>&1
+$DEBUGFS_EXE $TMPFILE -f $TMPFILE.cmd2 >> $OUT.new 2>&1
+sed -f $cmd_dir/filter.sed < $OUT.new >> $OUT
+rm -rf $OUT.new $TMPFILE.cmd $TMPFILE.cmd2
+
+$FSCK -fy -N test_filesys $TMPFILE > $OUT.new 2>&1
+status=$?
+echo Exit status is $status >> $OUT.new
+sed -f $cmd_dir/filter.sed -e "s;$TMPFILE;test.img;" $OUT.new >> $OUT
+rm -f $OUT.new
+
+rm -f $TMPFILE
+
+cmp -s $OUT $EXP
+status=$?
+
+if [ "$status" = 0 ] ; then
+	echo "$test_name: $test_description: ok"
+	touch $test_name.ok
+else
+	echo "$test_name: $test_description: failed"
+	diff $DIFF_OPTS $EXP $OUT > $test_name.failed
+	rm -f $test_name.tmp
+fi
+
+unset IMAGE FSCK_OPT OUT EXP
+
+else #if test -x $DEBUGFS_EXE; then
+	echo "$test_name: $test_description: skipped"
+fi
diff --git a/tests/f_fallocate_blkmap/expect b/tests/f_fallocate_blkmap/expect
new file mode 100644
index 0000000..f7ae606
--- /dev/null
+++ b/tests/f_fallocate_blkmap/expect
@@ -0,0 +1,58 @@
+Creating filesystem with 65536 1k blocks and 4096 inodes
+Superblock backups stored on blocks: 
+	8193, 24577, 40961, 57345
+
+Allocating group tables:    \b\b\bdone                            
+Writing inode tables:    \b\b\bdone                            
+Writing superblocks and filesystem accounting information:    \b\b\bdone
+
+Pass 1: Checking inodes, blocks, and sizes
+Pass 2: Checking directory structure
+Pass 3: Checking directory connectivity
+Pass 4: Checking reference counts
+Pass 5: Checking group summary information
+test_filesys: 11/4096 files (0.0% non-contiguous), 2340/65536 blocks
+Exit status is 0
+debugfs write files
+debugfs: stat /a
+Inode: 12   Type: regular    Mode:  0666   Flags: 0x0
+Generation: 0    Version: 0x00000000:00000000
+User:     0   Group:     0   Size: 40960
+File ACL: 0    Directory ACL: 0
+Links: 1   Blockcount: 82
+Fragment:  Address: 0    Number: 0    Size: 0
+Size of extra inode fields: 28
+BLOCKS:
+(0-1):1312-1313, (2-11):8000-8009, (IND):8010, (12-39):8011-8038
+TOTAL: 41
+
+debugfs: stat /b
+Inode: 13   Type: regular    Mode:  0666   Flags: 0x0
+Generation: 0    Version: 0x00000000:00000000
+User:     0   Group:     0   Size: 10240000
+File ACL: 0    Directory ACL: 0
+Links: 1   Blockcount: 20082
+Fragment:  Address: 0    Number: 0    Size: 0
+Size of extra inode fields: 28
+BLOCKS:
+(0-11):10000-10011, (IND):10012, (12-267):10013-10268, (DIND):10269, (IND):10270, (268-523):10271-10526, (IND):10527, (524-779):10528-10783, (IND):10784, (780-1035):10785-11040, (IND):11041, (1036-1291):11042-11297, (IND):11298, (1292-1547):11299-11554, (IND):11555, (1548-1803):11556-11811, (IND):11812, (1804-2059):11813-12068, (IND):12069, (2060-2315):12070-12325, (IND):12326, (2316-2571):12327-12582, (IND):12583, (2572-2827):12584-12839, (IND):12840, (2828-3083):12841-13096, (IND):13097, (3084-3339):13098-13353, (IND):13354, (3340-3595):13355-13610, (IND):13611, (3596-3851):13612-13867, (IND):13868, (3852-4107):13869-14124, (IND):14125, (4108-4363):14126-14381, (IND):14382, (4364-4619):14383-14638, (IND):14639, (4620-4875):14640-14895, (IND):14896, (4876-5131):14897-15152, (IND):15153, (5132-5387):15154-15409, (IND):15410, (5388-5643):15411-15666, (IND):15667, (5644-5899):15668-15923, (IND):15924, (5900-6155):15925-16180, (IND):16181, (6156-6411):16182-16437, (IND):16438, (6412-6667):16439-16694, (IND):16695, (6668-6923):16696-16951, (IND):16952, (6924-7179):16953-17208, (IND):17209, (7180-7435):17210-17465, (IND):17466, (7436-7691):17467-17722, (IND):17723, (7692-7947):17724-17979, (IND):17980, (7948-8203):17981-18236, (IND):18237, (8204-8459):18238-18493, (IND):18494, (8460-8715):18495-18750, (IND):18751, (8716-8971):18752-19007, (IND):19008, (8972-9227):19009-19264, (IND):19265, (9228-9483):19266-19521, (IND):19522, (9484-9739):19523-19778, (IND):19779, (9740-9995):19780-20035, (IND):20036, (9996-9999):20037-20040
+TOTAL: 10041
+
+Pass 1: Checking inodes, blocks, and sizes
+Pass 2: Checking directory structure
+Pass 3: Checking directory connectivity
+Pass 4: Checking reference counts
+Pass 5: Checking group summary information
+Free blocks count wrong for group #0 (6841, counted=6840).
+Fix? yes
+
+Free blocks count wrong for group #1 (1551, counted=1550).
+Fix? yes
+
+Free blocks count wrong (53116, counted=53114).
+Fix? yes
+
+
+test_filesys: ***** FILE SYSTEM WAS MODIFIED *****
+test_filesys: 13/4096 files (7.7% non-contiguous), 12422/65536 blocks
+Exit status is 1
diff --git a/tests/f_fallocate_blkmap/name b/tests/f_fallocate_blkmap/name
new file mode 100644
index 0000000..ba2b61d
--- /dev/null
+++ b/tests/f_fallocate_blkmap/name
@@ -0,0 +1 @@
+fallocate sparse files and big files on a blockmap fs
diff --git a/tests/f_fallocate_blkmap/script b/tests/f_fallocate_blkmap/script
new file mode 100644
index 0000000..da83cd1
--- /dev/null
+++ b/tests/f_fallocate_blkmap/script
@@ -0,0 +1,85 @@
+if test -x $DEBUGFS_EXE; then
+
+FSCK_OPT=-fy
+OUT=$test_name.log
+if [ -f $test_dir/expect.gz ]; then
+	EXP=$test_name.tmp
+	gunzip < $test_dir/expect.gz > $EXP
+else
+	EXP=$test_dir/expect
+fi
+
+cp /dev/null $OUT
+
+cat > $TMPFILE.conf << ENDL
+[fs_types]
+ext4 = {
+        base_features = sparse_super,filetype,resize_inode,dir_index,ext_attr,^has_journal,^extent,huge_file,flex_bg,uninit_bg,dir_nlink,extra_isize,^64bit
+        blocksize = 1024
+        inode_size = 256
+        inode_ratio = 16384
+}
+ENDL
+MKE2FS_CONFIG=$TMPFILE.conf $MKE2FS -F -o Linux -b 1024 -O ^bigalloc -T ext4 $TMPFILE 65536 2>&1 | sed -f $cmd_dir/filter.sed >> $OUT 2>&1
+rm -rf $TMPFILE.conf
+
+$FSCK -fy -N test_filesys $TMPFILE > $OUT.new 2>&1
+status=$?
+echo Exit status is $status >> $OUT.new
+sed -f $cmd_dir/filter.sed -e "s;$TMPFILE;test.img;" $OUT.new >> $OUT
+rm -f $OUT.new
+
+echo "debugfs write files" >> $OUT
+
+#Files we create:
+# a: fallocate a 40k file
+# b: one big file
+base=5000
+cat > $TMPFILE.cmd << ENDL
+write /dev/null a
+sif /a bmap[2] 8000
+sif /a size 40960
+sif /a i_blocks 2
+setb 8000
+fallocate /a 0 39
+
+write /dev/null b
+sif /b size 10240000
+sif /b bmap[0] 10000
+sif /b i_blocks 2
+setb 10000
+fallocate /b 0 9999
+ENDL
+echo "stat /a" >> $TMPFILE.cmd2
+echo "stat /b" >> $TMPFILE.cmd2
+
+$DEBUGFS_EXE -w $TMPFILE -f $TMPFILE.cmd > /dev/null 2>&1
+$DEBUGFS_EXE $TMPFILE -f $TMPFILE.cmd2 >> $OUT.new 2>&1
+sed -f $cmd_dir/filter.sed -e '/^.*time:.*$/d' < $OUT.new >> $OUT
+rm -rf $OUT.new $TMPFILE.cmd $TMPFILE.cmd2
+
+$FSCK -fy -N test_filesys $TMPFILE > $OUT.new 2>&1
+status=$?
+echo Exit status is $status >> $OUT.new
+sed -f $cmd_dir/filter.sed -e "s;$TMPFILE;test.img;" $OUT.new >> $OUT
+rm -f $OUT.new
+
+rm -f $TMPFILE
+
+cmp -s $OUT $EXP
+status=$?
+
+if [ "$status" = 0 ] ; then
+	echo "$test_name: $test_description: ok"
+	touch $test_name.ok
+else
+	echo "$test_name: $test_description: failed"
+	diff $DIFF_OPTS $EXP $OUT > $test_name.failed
+	rm -f $test_name.tmp
+fi
+
+unset IMAGE FSCK_OPT OUT EXP
+
+else #if test -x $DEBUGFS_EXE; then
+	echo "$test_name: $test_description: skipped"
+fi



* [PATCH 28/34] tests: test debugfs punch command
  2014-09-13 22:11 [PATCH 00/34] e2fsprogs Summer 2014 patchbomb, part 6 Darrick J. Wong
                   ` (26 preceding siblings ...)
  2014-09-13 22:14 ` [PATCH 27/34] debugfs: implement fallocate Darrick J. Wong
@ 2014-09-13 22:14 ` Darrick J. Wong
  2014-09-19 16:26   ` Theodore Ts'o
  2014-09-13 22:14 ` [PATCH 30/34] fuse2fs: translate ACL structures Darrick J. Wong
                   ` (5 subsequent siblings)
  33 siblings, 1 reply; 67+ messages in thread
From: Darrick J. Wong @ 2014-09-13 22:14 UTC (permalink / raw)
  To: tytso, darrick.wong; +Cc: linux-ext4

Test punching out various parts of sparse files.
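
As a rough illustration of what the expect files encode (a sketch, not part
of the patch): make_file in the scripts below preallocates logical blocks
0-39 and then punches the single blocks 10, 13, 26, and 29, so the surviving
logical ranges should be 0-9, 11-12, 14-25, 27-28, and 30-39, which is what
the "ex /sample" output lists. The helper name remaining_ranges is
hypothetical; it only does the range arithmetic and does not call debugfs.

```shell
#!/bin/sh
# remaining_ranges FIRST LAST HOLE...
# Print the logical block ranges left in FIRST..LAST after punching out
# each single-block HOLE. Hypothetical helper for illustration only;
# holes are assumed to lie inside FIRST..LAST and to be in ascending order.
remaining_ranges() {
	first="$1"
	last="$2"
	shift 2
	start="$first"
	for hole in "$@"; do
		# Emit the run of blocks that precedes this hole, if any.
		if [ "$hole" -gt "$start" ]; then
			echo "$start-$((hole - 1))"
		fi
		start=$((hole + 1))
	done
	# Emit whatever is left after the last hole.
	if [ "$start" -le "$last" ]; then
		echo "$start-$last"
	fi
}

# Same layout as make_file: blocks 0-39 with 10, 13, 26 and 29 punched.
remaining_ranges 0 39 10 13 26 29
```

The b* and c* files then punch a further range out of that layout (from
block i to 39, or from 0 to block i), so their extent lists are subsets of
these five ranges.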

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
 tests/f_punch/expect          |  143 +++++++++++++++++++++++++++++++++++++++++
 tests/f_punch/name            |    1 
 tests/f_punch/script          |  129 +++++++++++++++++++++++++++++++++++++
 tests/f_punch_bigalloc/expect |  142 +++++++++++++++++++++++++++++++++++++++++
 tests/f_punch_bigalloc/name   |    1 
 tests/f_punch_bigalloc/script |  130 +++++++++++++++++++++++++++++++++++++
 6 files changed, 546 insertions(+)
 create mode 100644 tests/f_punch/expect
 create mode 100644 tests/f_punch/name
 create mode 100644 tests/f_punch/script
 create mode 100644 tests/f_punch_bigalloc/expect
 create mode 100644 tests/f_punch_bigalloc/name
 create mode 100644 tests/f_punch_bigalloc/script


diff --git a/tests/f_punch/expect b/tests/f_punch/expect
new file mode 100644
index 0000000..a5b7fd8
--- /dev/null
+++ b/tests/f_punch/expect
@@ -0,0 +1,143 @@
+Creating filesystem with 65536 1k blocks and 4096 inodes
+Superblock backups stored on blocks: 
+	8193, 24577, 40961, 57345
+
+Allocating group tables:    \b\b\bdone                            
+Writing inode tables:    \b\b\bdone                            
+Writing superblocks and filesystem accounting information:    \b\b\bdone
+
+Pass 1: Checking inodes, blocks, and sizes
+Pass 2: Checking directory structure
+Pass 3: Checking directory connectivity
+Pass 4: Checking reference counts
+Pass 5: Checking group summary information
+test_filesys: 11/4096 files (0.0% non-contiguous), 2345/65536 blocks
+Exit status is 0
+debugfs write files
+debugfs: ex /a
+Level Entries       Logical      Physical Length Flags
+debugfs: ex /sample
+Level Entries       Logical      Physical Length Flags
+ 0/ 1   1/  1     0 - 4294967295  1323              0
+ 1/ 1   1/  5     0 -     9  1313 -  1322     10 Uninit
+ 1/ 1   2/  5    11 -    12  1324 -  1325      2 Uninit
+ 1/ 1   3/  5    14 -    25  1327 -  1338     12 Uninit
+ 1/ 1   4/  5    27 -    28  1340 -  1341      2 Uninit
+ 1/ 1   5/  5    30 -    39  1343 -  1352     10 Uninit
+debugfs: ex /b8
+Level Entries       Logical      Physical Length Flags
+ 0/ 1   1/  1     0 - 4294967295  1390              0
+ 1/ 1   1/  4     0 -     0  1326 -  1326      1 Uninit
+ 1/ 1   2/  4     1 -     1  1339 -  1339      1 Uninit
+ 1/ 1   3/  4     2 -     2  1342 -  1342      1 Uninit
+ 1/ 1   4/  4     3 -     7  1353 -  1357      5 Uninit
+debugfs: ex /b9
+Level Entries       Logical      Physical Length Flags
+ 0/ 1   1/  1     0 - 4294967295  1368              0
+ 1/ 1   1/  1     0 -     8  1358 -  1366      9 Uninit
+debugfs: ex /b10
+Level Entries       Logical      Physical Length Flags
+ 0/ 1   1/  1     0 - 4294967295  1378              0
+ 1/ 1   1/  2     0 -     0  1367 -  1367      1 Uninit
+ 1/ 1   2/  2     1 -     9  1369 -  1377      9 Uninit
+debugfs: ex /b11
+Level Entries       Logical      Physical Length Flags
+ 0/ 1   1/  1     0 - 4294967295  1389              0
+ 1/ 1   1/  1     0 -     9  1379 -  1388     10 Uninit
+debugfs: ex /b12
+Level Entries       Logical      Physical Length Flags
+ 0/ 1   1/  1     0 - 4294967295  1401              0
+ 1/ 1   1/  2     0 -     9  1391 -  1400     10 Uninit
+ 1/ 1   2/  2    11 -    11  1402 -  1402      1 Uninit
+debugfs: ex /b13
+Level Entries       Logical      Physical Length Flags
+ 0/ 1   1/  1     0 - 4294967295  1413              0
+ 1/ 1   1/  2     0 -     9  1403 -  1412     10 Uninit
+ 1/ 1   2/  2    11 -    12  1414 -  1415      2 Uninit
+debugfs: ex /b14
+Level Entries       Logical      Physical Length Flags
+ 0/ 1   1/  1     0 - 4294967295  1426              0
+ 1/ 1   1/  2     0 -     9  1416 -  1425     10 Uninit
+ 1/ 1   2/  2    11 -    12  1427 -  1428      2 Uninit
+debugfs: ex /b15
+Level Entries       Logical      Physical Length Flags
+ 0/ 1   1/  1     0 - 4294967295  1439              0
+ 1/ 1   1/  3     0 -     9  1429 -  1438     10 Uninit
+ 1/ 1   2/  3    11 -    12  1440 -  1441      2 Uninit
+ 1/ 1   3/  3    14 -    14  1443 -  1443      1 Uninit
+debugfs: ex /c24
+Level Entries       Logical      Physical Length Flags
+ 0/ 1   1/  1    25 - 4294967295  1453         4294967271
+ 1/ 1   1/  3    25 -    25  1468 -  1468      1 Uninit
+ 1/ 1   2/  3    27 -    28  1470 -  1471      2 Uninit
+ 1/ 1   3/  3    30 -    39  1473 -  1482     10 Uninit
+debugfs: ex /c25
+Level Entries       Logical      Physical Length Flags
+ 0/ 1   1/  1    27 - 4294967295  1483         4294967269
+ 1/ 1   1/  2    27 -    28  1485 -  1486      2 Uninit
+ 1/ 1   2/  2    30 -    39  1488 -  1497     10 Uninit
+debugfs: ex /c26
+Level Entries       Logical      Physical Length Flags
+ 0/ 1   1/  1    27 - 4294967295  1484         4294967269
+ 1/ 1   1/  2    27 -    28  1498 -  1499      2 Uninit
+ 1/ 1   2/  2    30 -    39  1501 -  1510     10 Uninit
+debugfs: ex /c27
+Level Entries       Logical      Physical Length Flags
+ 0/ 1   1/  1    28 - 4294967295  1487         4294967268
+ 1/ 1   1/  2    28 -    28  1512 -  1512      1 Uninit
+ 1/ 1   2/  2    30 -    39  1514 -  1523     10 Uninit
+debugfs: ex /c28
+Level Entries       Logical      Physical Length Flags
+ 0/ 1   1/  1    30 - 4294967295  1500         4294967266
+ 1/ 1   1/  1    30 -    39  1526 -  1535     10 Uninit
+debugfs: ex /c29
+Level Entries       Logical      Physical Length Flags
+ 0/ 1   1/  1    30 - 4294967295  1511         4294967266
+ 1/ 1   1/  1    30 -    39  1537 -  1546     10 Uninit
+debugfs: ex /c30
+Level Entries       Logical      Physical Length Flags
+ 0/ 1   1/  1    31 - 4294967295  1513         4294967265
+ 1/ 1   1/  1    31 -    39  1549 -  1557      9 Uninit
+debugfs: ex /c31
+Level Entries       Logical      Physical Length Flags
+ 0/ 1   1/  1    32 - 4294967295  1524         4294967264
+ 1/ 1   1/  1    32 -    39  1560 -  1567      8 Uninit
+debugfs: ex /d
+Level Entries       Logical      Physical Length Flags
+ 0/ 1   1/  1     0 - 4294967295  1525              0
+ 1/ 1   1/  3     0 -     0  1442 -  1442      1 Uninit
+ 1/ 1   2/  3     1 -     3  1444 -  1446      3 Uninit
+ 1/ 1   3/  3    36 -    39  1573 -  1576      4 Uninit
+debugfs: ex /e
+Level Entries       Logical      Physical Length Flags
+ 0/ 1   1/  1     0 - 4294967295  1547              0
+ 1/ 1   1/ 11     0 -     5  1447 -  1452      6 Uninit
+ 1/ 1   2/ 11     6 -     9  1454 -  1457      4 Uninit
+ 1/ 1   3/ 11    11 -    12  1459 -  1460      2 Uninit
+ 1/ 1   4/ 11    14 -    18  1462 -  1466      5 Uninit
+ 1/ 1   5/ 11    21 -    21  1472 -  1472      1 Uninit
+ 1/ 1   6/ 11    22 -    22  1536 -  1536      1 Uninit
+ 1/ 1   7/ 11    23 -    23  1548 -  1548      1 Uninit
+ 1/ 1   8/ 11    24 -    25  1558 -  1559      2 Uninit
+ 1/ 1   9/ 11    27 -    28  1569 -  1570      2 Uninit
+ 1/ 1  10/ 11    30 -    30  1572 -  1572      1 Uninit
+ 1/ 1  11/ 11    31 -    39  1577 -  1585      9 Uninit
+debugfs: ex /f
+Level Entries       Logical      Physical Length Flags
+ 0/ 0   1/  2     0 -     0  9000 -  9000      1 Uninit
+ 0/ 0   2/  2  8999 -  8999 17999 - 17999      1 Uninit
+Pass 1: Checking inodes, blocks, and sizes
+Pass 2: Checking directory structure
+Pass 3: Checking directory connectivity
+Pass 4: Checking reference counts
+Pass 5: Checking group summary information
+Free blocks count wrong for group #1 (7934, counted=7933).
+Fix? yes
+
+Free blocks count wrong (62923, counted=62922).
+Fix? yes
+
+
+test_filesys: ***** FILE SYSTEM WAS MODIFIED *****
+test_filesys: 32/4096 files (43.8% non-contiguous), 2614/65536 blocks
+Exit status is 1
diff --git a/tests/f_punch/name b/tests/f_punch/name
new file mode 100644
index 0000000..724639f
--- /dev/null
+++ b/tests/f_punch/name
@@ -0,0 +1 @@
+punch sparse files and big files
diff --git a/tests/f_punch/script b/tests/f_punch/script
new file mode 100644
index 0000000..db6c4dd
--- /dev/null
+++ b/tests/f_punch/script
@@ -0,0 +1,129 @@
+if test -x $DEBUGFS_EXE; then
+
+FSCK_OPT=-fy
+OUT=$test_name.log
+if [ -f $test_dir/expect.gz ]; then
+	EXP=$test_name.tmp
+	gunzip < $test_dir/expect.gz > $EXP
+else
+	EXP=$test_dir/expect
+fi
+
+cp /dev/null $OUT
+
+cat > $TMPFILE.conf << ENDL
+[fs_types]
+ext4 = {
+        base_features = sparse_super,filetype,resize_inode,dir_index,ext_attr,^has_journal,extent,huge_file,flex_bg,uninit_bg,dir_nlink,extra_isize,64bit
+        blocksize = 1024
+        inode_size = 256
+        inode_ratio = 16384
+}
+ENDL
+MKE2FS_CONFIG=$TMPFILE.conf $MKE2FS -F -o Linux -b 1024 -O ^bigalloc -T ext4 $TMPFILE 65536 2>&1 | sed -f $cmd_dir/filter.sed >> $OUT 2>&1
+rm -rf $TMPFILE.conf
+
+$FSCK -fy -N test_filesys $TMPFILE > $OUT.new 2>&1
+status=$?
+echo Exit status is $status >> $OUT.new
+sed -f $cmd_dir/filter.sed -e "s;$TMPFILE;test.img;" $OUT.new >> $OUT
+rm -f $OUT.new
+
+echo "debugfs write files" >> $OUT
+make_file() {
+	name="$1"
+	start="$2"
+	flag="$3"
+
+	cat << ENDL
+write /dev/null $name
+fallocate /$name 0 39
+punch /$name 10 10
+punch /$name 13 13
+punch /$name 26 26
+punch /$name 29 29
+ENDL
+}
+
+#Files we create:
+# a: punch a 40k file
+# b*: punch sparse file starting at b*
+# c*: punch sparse file ending at c*
+# d: midcluster to midcluster, surrounding sparse
+# e: partial middle cluster alloc
+# f: one big file
+base=5000
+cat > $TMPFILE.cmd << ENDL
+write /dev/null a
+fallocate /a 0 39
+punch /a 0 39
+ENDL
+echo "ex /a" >> $TMPFILE.cmd2
+
+make_file sample $base --uninit >> $TMPFILE.cmd
+echo "ex /sample" >> $TMPFILE.cmd2
+base=10000
+
+for i in 8 9 10 11 12 13 14 15; do
+	make_file b$i $(($base + (40 * ($i - 8)))) --uninit >> $TMPFILE.cmd
+	echo "punch /b$i $i 39" >> $TMPFILE.cmd
+	echo "ex /b$i" >> $TMPFILE.cmd2
+done
+
+for i in 24 25 26 27 28 29 30 31; do
+	make_file c$i $(($base + 320 + (40 * ($i - 24)))) --uninit >> $TMPFILE.cmd
+	echo "punch /c$i 0 $i" >> $TMPFILE.cmd
+	echo "ex /c$i" >> $TMPFILE.cmd2
+done
+
+make_file d $(($base + 640)) --uninit >> $TMPFILE.cmd
+echo "punch /d 4 35" >> $TMPFILE.cmd
+echo "ex /d" >> $TMPFILE.cmd2
+
+make_file e $(($base + 680)) --uninit >> $TMPFILE.cmd
+echo "punch /e 19 20" >> $TMPFILE.cmd
+echo "ex /e" >> $TMPFILE.cmd2
+
+cat >> $TMPFILE.cmd << ENDL
+write /dev/null f
+sif /f size 1024
+eo /f
+set_bmap --uninit 0 9000
+ec
+sif /f blocks 2
+setb 9000
+fallocate /f 0 8999
+punch /f 1 8998
+ENDL
+echo "ex /f" >> $TMPFILE.cmd2
+
+$DEBUGFS_EXE -w $TMPFILE -f $TMPFILE.cmd > /dev/null 2>&1
+$DEBUGFS_EXE $TMPFILE -f $TMPFILE.cmd2 >> $OUT.new 2>&1
+sed -f $cmd_dir/filter.sed < $OUT.new >> $OUT
+rm -rf $OUT.new $TMPFILE.cmd $TMPFILE.cmd2
+
+$FSCK -fy -N test_filesys $TMPFILE > $OUT.new 2>&1
+status=$?
+echo Exit status is $status >> $OUT.new
+sed -f $cmd_dir/filter.sed -e "s;$TMPFILE;test.img;" $OUT.new >> $OUT
+rm -f $OUT.new
+
+rm -f $TMPFILE
+
+cmp -s $OUT $EXP
+status=$?
+
+if [ "$status" = 0 ] ; then
+	echo "$test_name: $test_description: ok"
+	touch $test_name.ok
+else
+	echo "$test_name: $test_description: failed"
+	diff $DIFF_OPTS $EXP $OUT > $test_name.failed
+	rm -f $test_name.tmp
+fi
+
+unset IMAGE FSCK_OPT OUT EXP
+
+else #if test -x $DEBUGFS_EXE; then
+	echo "$test_name: $test_description: skipped"
+fi
diff --git a/tests/f_punch_bigalloc/expect b/tests/f_punch_bigalloc/expect
new file mode 100644
index 0000000..adbef89
--- /dev/null
+++ b/tests/f_punch_bigalloc/expect
@@ -0,0 +1,142 @@
+
+Warning: the bigalloc feature is still under development
+See https://ext4.wiki.kernel.org/index.php/Bigalloc for more information
+
+Creating filesystem with 65536 1k blocks and 4096 inodes
+
+Allocating group tables:    \b\b\bdone                            
+Writing inode tables:    \b\b\bdone                            
+Writing superblocks and filesystem accounting information:    \b\b\bdone
+
+Pass 1: Checking inodes, blocks, and sizes
+Pass 2: Checking directory structure
+Pass 3: Checking directory connectivity
+Pass 4: Checking reference counts
+Pass 5: Checking group summary information
+test_filesys: 11/4096 files (9.1% non-contiguous), 1144/65536 blocks
+Exit status is 0
+debugfs write files
+debugfs: ex /a
+Level Entries       Logical      Physical Length Flags
+debugfs: ex /sample
+Level Entries       Logical      Physical Length Flags
+ 0/ 1   1/  1     0 - 4294967295  1184              0
+ 1/ 1   1/  5     0 -     9  1144 -  1153     10 Uninit
+ 1/ 1   2/  5    11 -    12  1155 -  1156      2 Uninit
+ 1/ 1   3/  5    14 -    25  1158 -  1169     12 Uninit
+ 1/ 1   4/  5    27 -    28  1171 -  1172      2 Uninit
+ 1/ 1   5/  5    30 -    39  1174 -  1183     10 Uninit
+debugfs: ex /b8
+Level Entries       Logical      Physical Length Flags
+ 0/ 1   1/  1     0 - 4294967295  1232              0
+ 1/ 1   1/  1     0 -     7  1192 -  1199      8 Uninit
+debugfs: ex /b9
+Level Entries       Logical      Physical Length Flags
+ 0/ 1   1/  1     0 - 4294967295  1248              0
+ 1/ 1   1/  1     0 -     8  1200 -  1208      9 Uninit
+debugfs: ex /b10
+Level Entries       Logical      Physical Length Flags
+ 0/ 1   1/  1     0 - 4294967295  1272              0
+ 1/ 1   1/  1     0 -     9  1216 -  1225     10 Uninit
+debugfs: ex /b11
+Level Entries       Logical      Physical Length Flags
+ 0/ 1   1/  1     0 - 4294967295  1296              0
+ 1/ 1   1/  2     0 -     7  1240 -  1247      8 Uninit
+ 1/ 1   2/  2     8 -     9  1256 -  1257      2 Uninit
+debugfs: ex /b12
+Level Entries       Logical      Physical Length Flags
+ 0/ 1   1/  1     0 - 4294967295  1320              0
+ 1/ 1   1/  3     0 -     7  1264 -  1271      8 Uninit
+ 1/ 1   2/  3     8 -     9  1280 -  1281      2 Uninit
+ 1/ 1   3/  3    11 -    11  1283 -  1283      1 Uninit
+debugfs: ex /b13
+Level Entries       Logical      Physical Length Flags
+ 0/ 1   1/  1     0 - 4294967295  1344              0
+ 1/ 1   1/  3     0 -     7  1288 -  1295      8 Uninit
+ 1/ 1   2/  3     8 -     9  1304 -  1305      2 Uninit
+ 1/ 1   3/  3    11 -    12  1307 -  1308      2 Uninit
+debugfs: ex /b14
+Level Entries       Logical      Physical Length Flags
+ 0/ 1   1/  1     0 - 4294967295  1368              0
+ 1/ 1   1/  3     0 -     7  1312 -  1319      8 Uninit
+ 1/ 1   2/  3     8 -     9  1328 -  1329      2 Uninit
+ 1/ 1   3/  3    11 -    12  1331 -  1332      2 Uninit
+debugfs: ex /b15
+Level Entries       Logical      Physical Length Flags
+ 0/ 1   1/  1     0 - 4294967295  1392              0
+ 1/ 1   1/  4     0 -     7  1336 -  1343      8 Uninit
+ 1/ 1   2/  4     8 -     9  1352 -  1353      2 Uninit
+ 1/ 1   3/  4    11 -    12  1355 -  1356      2 Uninit
+ 1/ 1   4/  4    14 -    14  1358 -  1358      1 Uninit
+debugfs: ex /c24
+Level Entries       Logical      Physical Length Flags
+ 0/ 1   1/  1    25 - 4294967295  1416         4294967271
+ 1/ 1   1/  3    25 -    25  1401 -  1401      1 Uninit
+ 1/ 1   2/  3    27 -    28  1403 -  1404      2 Uninit
+ 1/ 1   3/  3    30 -    39  1406 -  1415     10 Uninit
+debugfs: ex /c25
+Level Entries       Logical      Physical Length Flags
+ 0/ 1   1/  1    27 - 4294967295  1440         4294967269
+ 1/ 1   1/  2    27 -    28  1427 -  1428      2 Uninit
+ 1/ 1   2/  2    30 -    39  1430 -  1439     10 Uninit
+debugfs: ex /c26
+Level Entries       Logical      Physical Length Flags
+ 0/ 1   1/  1    27 - 4294967295  1464         4294967269
+ 1/ 1   1/  2    27 -    28  1451 -  1452      2 Uninit
+ 1/ 1   2/  2    30 -    39  1454 -  1463     10 Uninit
+debugfs: ex /c27
+Level Entries       Logical      Physical Length Flags
+ 0/ 1   1/  1    28 - 4294967295  1488         4294967268
+ 1/ 1   1/  2    28 -    28  1476 -  1476      1 Uninit
+ 1/ 1   2/  2    30 -    39  1478 -  1487     10 Uninit
+debugfs: ex /c28
+Level Entries       Logical      Physical Length Flags
+ 0/ 1   1/  1    30 - 4294967295  1512         4294967266
+ 1/ 1   1/  1    30 -    39  1502 -  1511     10 Uninit
+debugfs: ex /c29
+Level Entries       Logical      Physical Length Flags
+ 0/ 1   1/  1    30 - 4294967295  1536         4294967266
+ 1/ 1   1/  1    30 -    39  1526 -  1535     10 Uninit
+debugfs: ex /c30
+Level Entries       Logical      Physical Length Flags
+ 0/ 1   1/  1    31 - 4294967295  1560         4294967265
+ 1/ 1   1/  1    31 -    39  1551 -  1559      9 Uninit
+debugfs: ex /c31
+Level Entries       Logical      Physical Length Flags
+ 0/ 1   1/  1    32 - 4294967295  1584         4294967264
+ 1/ 1   1/  1    32 -    39  1576 -  1583      8 Uninit
+debugfs: ex /d
+Level Entries       Logical      Physical Length Flags
+ 0/ 1   1/  1     0 - 4294967295  1600              0
+ 1/ 1   1/  2     0 -     3  1360 -  1363      4 Uninit
+ 1/ 1   2/  2    36 -    39  1596 -  1599      4 Uninit
+debugfs: ex /e
+Level Entries       Logical      Physical Length Flags
+ 0/ 1   1/  1     0 - 4294967295  1624              0
+ 1/ 1   1/  8     0 -     9  1376 -  1385     10 Uninit
+ 1/ 1   2/  8    11 -    12  1387 -  1388      2 Uninit
+ 1/ 1   3/  8    14 -    15  1390 -  1391      2 Uninit
+ 1/ 1   4/  8    16 -    18  1568 -  1570      3 Uninit
+ 1/ 1   5/  8    21 -    23  1573 -  1575      3 Uninit
+ 1/ 1   6/  8    24 -    25  1608 -  1609      2 Uninit
+ 1/ 1   7/  8    27 -    28  1611 -  1612      2 Uninit
+ 1/ 1   8/  8    30 -    39  1614 -  1623     10 Uninit
+debugfs: ex /f
+Level Entries       Logical      Physical Length Flags
+ 0/ 0   1/  2     0 -     0  9000 -  9000      1 Uninit
+ 0/ 0   2/  2  8999 -  8999 17999 - 17999      1 Uninit
+Pass 1: Checking inodes, blocks, and sizes
+Pass 2: Checking directory structure
+Pass 3: Checking directory connectivity
+Pass 4: Checking reference counts
+Pass 5: Checking group summary information
+Free blocks count wrong for group #0 (7987, counted=7986).
+Fix? yes
+
+Free blocks count wrong (63896, counted=63888).
+Fix? yes
+
+
+test_filesys: ***** FILE SYSTEM WAS MODIFIED *****
+test_filesys: 32/4096 files (43.8% non-contiguous), 1648/65536 blocks
+Exit status is 1
diff --git a/tests/f_punch_bigalloc/name b/tests/f_punch_bigalloc/name
new file mode 100644
index 0000000..6d61ebe
--- /dev/null
+++ b/tests/f_punch_bigalloc/name
@@ -0,0 +1 @@
+punch sparse files and big files with bigalloc
diff --git a/tests/f_punch_bigalloc/script b/tests/f_punch_bigalloc/script
new file mode 100644
index 0000000..5784154
--- /dev/null
+++ b/tests/f_punch_bigalloc/script
@@ -0,0 +1,130 @@
+if test -x $DEBUGFS_EXE; then
+
+FSCK_OPT=-fy
+OUT=$test_name.log
+if [ -f $test_dir/expect.gz ]; then
+	EXP=$test_name.tmp
+	gunzip < $test_dir/expect.gz > $EXP
+else
+	EXP=$test_dir/expect
+fi
+
+cp /dev/null $OUT
+
+cat > $TMPFILE.conf << ENDL
+[fs_types]
+ext4 = {
+	cluster_size = 8192
+        base_features = sparse_super,filetype,resize_inode,dir_index,ext_attr,^has_journal,extent,huge_file,flex_bg,uninit_bg,dir_nlink,extra_isize,64bit
+        blocksize = 1024
+        inode_size = 256
+        inode_ratio = 16384
+}
+ENDL
+MKE2FS_CONFIG=$TMPFILE.conf $MKE2FS -F -o Linux -b 1024 -O bigalloc -T ext4 $TMPFILE 65536 2>&1 | sed -f $cmd_dir/filter.sed >> $OUT 2>&1
+rm -rf $TMPFILE.conf
+
+$FSCK -fy -N test_filesys $TMPFILE > $OUT.new 2>&1
+status=$?
+echo Exit status is $status >> $OUT.new
+sed -f $cmd_dir/filter.sed -e "s;$TMPFILE;test.img;" $OUT.new >> $OUT
+rm -f $OUT.new
+
+echo "debugfs write files" >> $OUT
+make_file() {
+	name="$1"
+	start="$2"
+	flag="$3"
+
+	cat << ENDL
+write /dev/null $name
+fallocate /$name 0 39
+punch /$name 10 10
+punch /$name 13 13
+punch /$name 26 26
+punch /$name 29 29
+ENDL
+}
+
+#Files we create:
+# a: punch a 40k file
+# b*: punch sparse file starting at b*
+# c*: punch sparse file ending at c*
+# d: midcluster to midcluster, surrounding sparse
+# e: partial middle cluster alloc
+# f: one big file
+base=5000
+cat > $TMPFILE.cmd << ENDL
+write /dev/null a
+fallocate /a 0 39
+punch /a 0 39
+ENDL
+echo "ex /a" >> $TMPFILE.cmd2
+
+make_file sample $base --uninit >> $TMPFILE.cmd
+echo "ex /sample" >> $TMPFILE.cmd2
+base=10000
+
+for i in 8 9 10 11 12 13 14 15; do
+	make_file b$i $(($base + (40 * ($i - 8)))) --uninit >> $TMPFILE.cmd
+	echo "punch /b$i $i 39" >> $TMPFILE.cmd
+	echo "ex /b$i" >> $TMPFILE.cmd2
+done
+
+for i in 24 25 26 27 28 29 30 31; do
+	make_file c$i $(($base + 320 + (40 * ($i - 24)))) --uninit >> $TMPFILE.cmd
+	echo "punch /c$i 0 $i" >> $TMPFILE.cmd
+	echo "ex /c$i" >> $TMPFILE.cmd2
+done
+
+make_file d $(($base + 640)) --uninit >> $TMPFILE.cmd
+echo "punch /d 4 35" >> $TMPFILE.cmd
+echo "ex /d" >> $TMPFILE.cmd2
+
+make_file e $(($base + 680)) --uninit >> $TMPFILE.cmd
+echo "punch /e 19 20" >> $TMPFILE.cmd
+echo "ex /e" >> $TMPFILE.cmd2
+
+cat >> $TMPFILE.cmd << ENDL
+write /dev/null f
+sif /f size 1024
+eo /f
+set_bmap --uninit 0 9000
+ec
+sif /f blocks 16
+setb 9000
+fallocate /f 0 8999
+punch /f 1 8998
+ENDL
+echo "ex /f" >> $TMPFILE.cmd2
+
+$DEBUGFS_EXE -w $TMPFILE -f $TMPFILE.cmd > /dev/null 2>&1
+$DEBUGFS_EXE $TMPFILE -f $TMPFILE.cmd2 >> $OUT.new 2>&1
+sed -f $cmd_dir/filter.sed < $OUT.new >> $OUT
+rm -rf $OUT.new $TMPFILE.cmd $TMPFILE.cmd2
+
+$FSCK -fy -N test_filesys $TMPFILE > $OUT.new 2>&1
+status=$?
+echo Exit status is $status >> $OUT.new
+sed -f $cmd_dir/filter.sed -e "s;$TMPFILE;test.img;" $OUT.new >> $OUT
+rm -f $OUT.new
+
+rm -f $TMPFILE
+
+cmp -s $OUT $EXP
+status=$?
+
+if [ "$status" = 0 ] ; then
+	echo "$test_name: $test_description: ok"
+	touch $test_name.ok
+else
+	echo "$test_name: $test_description: failed"
+	diff $DIFF_OPTS $EXP $OUT > $test_name.failed
+	rm -f $test_name.tmp
+fi
+
+unset IMAGE FSCK_OPT OUT EXP
+
+else #if test -x $DEBUGFS_EXE; then
+	echo "$test_name: $test_description: skipped"
+fi



* [PATCH 30/34] fuse2fs: translate ACL structures
  2014-09-13 22:11 [PATCH 00/34] e2fsprogs Summer 2014 patchbomb, part 6 Darrick J. Wong
                   ` (27 preceding siblings ...)
  2014-09-13 22:14 ` [PATCH 28/34] tests: test debugfs punch command Darrick J. Wong
@ 2014-09-13 22:14 ` Darrick J. Wong
  2014-09-13 22:14 ` [PATCH 31/34] fuse2fs: handle 64-bit dates correctly Darrick J. Wong
                   ` (4 subsequent siblings)
  33 siblings, 0 replies; 67+ messages in thread
From: Darrick J. Wong @ 2014-09-13 22:14 UTC (permalink / raw)
  To: tytso, darrick.wong; +Cc: linux-ext4

Translate "native" ACL structures into ext4 ACL structures when
reading or writing the ACL EAs.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
 configure       |    5 +
 configure.in    |    8 +-
 lib/config.h.in |    3 +
 misc/fuse2fs.c  |  263 ++++++++++++++++++++++++++++++++++++++++++++++++++++++-
 4 files changed, 271 insertions(+), 8 deletions(-)
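
Illustration only, not part of the patch: a minimal sketch of the size
arithmetic behind the translation, mirroring what acl_ea_size() and
ext4_acl_size() in the diff below compute.  The struct and function names
here (fuse_acl_entry, ext4_acl_short, ext4_acl_full, fuse_acl_bytes,
ext4_acl_bytes) are hypothetical stand-ins, and the layout assumes the
usual ABI where a {u16, u16, u32} struct occupies 8 bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the entry layouts used in the patch. */
struct fuse_acl_entry { uint16_t tag; uint16_t perm; uint32_t id; }; /* 8 bytes */
struct ext4_acl_short { uint16_t tag; uint16_t perm; };              /* 4 bytes */
struct ext4_acl_full  { uint16_t tag; uint16_t perm; uint32_t id; }; /* 8 bytes */

/* Both formats begin with a 32-bit version word. */
#define ACL_HEADER_SIZE sizeof(uint32_t)

/* Bytes in the xattr value exchanged with FUSE for `count` entries. */
static size_t fuse_acl_bytes(int count)
{
	assert(count >= 0);
	return ACL_HEADER_SIZE + (size_t)count * sizeof(struct fuse_acl_entry);
}

/*
 * Bytes in the on-disk ext4 ACL for `count` entries.  As in the patch's
 * ext4_acl_size(), the first four entries are counted as short (no id
 * field) and every further entry as full.
 */
static size_t ext4_acl_bytes(int count)
{
	assert(count >= 0);
	if (count <= 4)
		return ACL_HEADER_SIZE +
		       (size_t)count * sizeof(struct ext4_acl_short);
	return ACL_HEADER_SIZE + 4 * sizeof(struct ext4_acl_short) +
	       (size_t)(count - 4) * sizeof(struct ext4_acl_full);
}
```

With these definitions a three-entry ACL takes 28 bytes in the FUSE format
and 16 bytes on disk, which is why the two representations cannot simply be
copied back and forth.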


diff --git a/configure b/configure
index 6a9f82c..896602e 100755
--- a/configure
+++ b/configure
@@ -12409,7 +12409,7 @@ fi
 done
 
 fi
-for ac_header in  	dirent.h 	errno.h 	execinfo.h 	getopt.h 	malloc.h 	mntent.h 	paths.h 	semaphore.h 	setjmp.h 	signal.h 	stdarg.h 	stdint.h 	stdlib.h 	termios.h 	termio.h 	unistd.h 	utime.h 	attr/xattr.h 	linux/falloc.h 	linux/fd.h 	linux/major.h 	linux/loop.h 	net/if_dl.h 	netinet/in.h 	sys/disklabel.h 	sys/disk.h 	sys/file.h 	sys/ioctl.h 	sys/mkdev.h 	sys/mman.h 	sys/mount.h 	sys/prctl.h 	sys/resource.h 	sys/select.h 	sys/socket.h 	sys/sockio.h 	sys/stat.h 	sys/syscall.h 	sys/sysctl.h 	sys/sysmacros.h 	sys/time.h 	sys/types.h 	sys/un.h 	sys/wait.h
+for ac_header in  	dirent.h 	errno.h 	execinfo.h 	getopt.h 	malloc.h 	mntent.h 	paths.h 	semaphore.h 	setjmp.h 	signal.h 	stdarg.h 	stdint.h 	stdlib.h 	termios.h 	termio.h 	unistd.h 	utime.h 	attr/xattr.h 	linux/falloc.h 	linux/fd.h 	linux/major.h 	linux/loop.h 	net/if_dl.h 	netinet/in.h 	sys/acl.h 	sys/disklabel.h 	sys/disk.h 	sys/file.h 	sys/ioctl.h 	sys/mkdev.h 	sys/mman.h 	sys/mount.h 	sys/prctl.h 	sys/resource.h 	sys/select.h 	sys/socket.h 	sys/sockio.h 	sys/stat.h 	sys/syscall.h 	sys/sysctl.h 	sys/sysmacros.h 	sys/time.h 	sys/types.h 	sys/un.h 	sys/wait.h
 do :
   as_ac_Header=`$as_echo "ac_cv_header_$ac_header" | $as_tr_sh`
 ac_fn_c_check_header_mongrel "$LINENO" "$ac_header" "$as_ac_Header" "$ac_includes_default"
@@ -13197,6 +13197,7 @@ else
 do :
   as_ac_Header=`$as_echo "ac_cv_header_$ac_header" | $as_tr_sh`
 ac_fn_c_check_header_compile "$LINENO" "$ac_header" "$as_ac_Header" "#define _FILE_OFFSET_BITS	64
+#define FUSE_USE_VERSION 29
 "
 if eval test \"x\$"$as_ac_Header"\" = x"yes"; then :
   cat >>confdefs.h <<_ACEOF
@@ -13215,6 +13216,7 @@ done
 
 	cat confdefs.h - <<_ACEOF >conftest.$ac_ext
 /* end confdefs.h.  */
+#define FUSE_USE_VERSION 29
 #ifdef __linux__
 #include <linux/fs.h>
 #include <linux/falloc.h>
@@ -13334,6 +13336,7 @@ else
 do :
   as_ac_Header=`$as_echo "ac_cv_header_$ac_header" | $as_tr_sh`
 ac_fn_c_check_header_compile "$LINENO" "$ac_header" "$as_ac_Header" "#define _FILE_OFFSET_BITS	64
+#define FUSE_USE_VERSION 29
 #ifdef __linux__
 # include <linux/fs.h>
 # include <linux/falloc.h>
diff --git a/configure.in b/configure.in
index 747922b..f6329ee 100644
--- a/configure.in
+++ b/configure.in
@@ -927,6 +927,7 @@ AC_CHECK_HEADERS(m4_flatten([
 	linux/loop.h
 	net/if_dl.h
 	netinet/in.h
+	sys/acl.h
 	sys/disklabel.h
 	sys/disk.h
 	sys/file.h
@@ -1166,10 +1167,12 @@ then
 else
 	AC_CHECK_HEADERS([pthread.h fuse.h], [],
 [AC_MSG_FAILURE([Cannot find fuse2fs headers.])],
-[#define _FILE_OFFSET_BITS	64])
+[#define _FILE_OFFSET_BITS	64
+#define FUSE_USE_VERSION 29])
 
 	AC_PREPROC_IFELSE(
-[AC_LANG_PROGRAM([[#ifdef __linux__
+[AC_LANG_PROGRAM([[#define FUSE_USE_VERSION 29
+#ifdef __linux__
 #include <linux/fs.h>
 #include <linux/falloc.h>
 #include <linux/xattr.h>
@@ -1184,6 +1187,7 @@ fi
 ,
 AC_CHECK_HEADERS([pthread.h fuse.h], [], [FUSE_CMT="#"],
 [#define _FILE_OFFSET_BITS	64
+#define FUSE_USE_VERSION 29
 #ifdef __linux__
 # include <linux/fs.h>
 # include <linux/falloc.h>
diff --git a/lib/config.h.in b/lib/config.h.in
index b10e91d..7005940 100644
--- a/lib/config.h.in
+++ b/lib/config.h.in
@@ -467,6 +467,9 @@
 /* Define to 1 if you have the `sysconf' function. */
 #undef HAVE_SYSCONF
 
+/* Define to 1 if you have the <sys/acl.h> header file. */
+#undef HAVE_SYS_ACL_H
+
 /* Define to 1 if you have the <sys/disklabel.h> header file. */
 #undef HAVE_SYS_DISKLABEL_H
 
diff --git a/misc/fuse2fs.c b/misc/fuse2fs.c
index 87ec55c..7c443b3 100644
--- a/misc/fuse2fs.c
+++ b/misc/fuse2fs.c
@@ -18,9 +18,15 @@
 # include <linux/falloc.h>
 # include <linux/xattr.h>
 # define FUSE_PLATFORM_OPTS	",nonempty,big_writes"
+# ifdef HAVE_SYS_ACL_H
+#  define TRANSLATE_LINUX_ACLS
+# endif
 #else
 # define FUSE_PLATFORM_OPTS	""
 #endif
+#ifdef TRANSLATE_LINUX_ACLS
+# include <sys/acl.h>
+#endif
 #include <sys/ioctl.h>
 #include <unistd.h>
 #include <fuse.h>
@@ -85,6 +91,200 @@ static ext2_filsys global_fs; /* Try not to use this directly */
 
 errcode_t ext2fs_run_ext3_journal(ext2_filsys *fs);
 
+/* ACL translation stuff */
+#ifdef TRANSLATE_LINUX_ACLS
+/*
+ * Copied from acl_ea.h in libacl source; ACLs have to be sent to and from fuse
+ * in this format... at least on Linux.
+ */
+#define ACL_EA_ACCESS		"system.posix_acl_access"
+#define ACL_EA_DEFAULT		"system.posix_acl_default"
+
+#define ACL_EA_VERSION		0x0002
+
+typedef struct {
+	u_int16_t	e_tag;
+	u_int16_t	e_perm;
+	u_int32_t	e_id;
+} acl_ea_entry;
+
+typedef struct {
+	u_int32_t	a_version;
+	acl_ea_entry	a_entries[0];
+} acl_ea_header;
+
+static inline size_t acl_ea_size(int count)
+{
+	return sizeof(acl_ea_header) + count * sizeof(acl_ea_entry);
+}
+
+static inline int acl_ea_count(size_t size)
+{
+	if (size < sizeof(acl_ea_header))
+		return -1;
+	size -= sizeof(acl_ea_header);
+	if (size % sizeof(acl_ea_entry))
+		return -1;
+	return size / sizeof(acl_ea_entry);
+}
+
+/*
+ * ext4 ACL structures, copied from fs/ext4/acl.h.
+ */
+#define EXT4_ACL_VERSION	0x0001
+
+typedef struct {
+	__u16		e_tag;
+	__u16		e_perm;
+	__u32		e_id;
+} ext4_acl_entry;
+
+typedef struct {
+	__u16		e_tag;
+	__u16		e_perm;
+} ext4_acl_entry_short;
+
+typedef struct {
+	__u32		a_version;
+} ext4_acl_header;
+
+static inline size_t ext4_acl_size(int count)
+{
+	if (count <= 4) {
+		return sizeof(ext4_acl_header) +
+		       count * sizeof(ext4_acl_entry_short);
+	} else {
+		return sizeof(ext4_acl_header) +
+		       4 * sizeof(ext4_acl_entry_short) +
+		       (count - 4) * sizeof(ext4_acl_entry);
+	}
+}
+
+static inline int ext4_acl_count(size_t size)
+{
+	ssize_t s;
+
+	size -= sizeof(ext4_acl_header);
+	s = size - 4 * sizeof(ext4_acl_entry_short);
+	if (s < 0) {
+		if (size % sizeof(ext4_acl_entry_short))
+			return -1;
+		return size / sizeof(ext4_acl_entry_short);
+	} else {
+		if (s % sizeof(ext4_acl_entry))
+			return -1;
+		return s / sizeof(ext4_acl_entry) + 4;
+	}
+}
+
+static errcode_t fuse_to_ext4_acl(acl_ea_header *facl, size_t facl_sz,
+				  ext4_acl_header **eacl, size_t *eacl_sz)
+{
+	int i, facl_count;
+	ext4_acl_header *h;
+	size_t h_sz;
+	ext4_acl_entry *e;
+	acl_ea_entry *a;
+	void *hptr;
+	errcode_t err;
+
+	facl_count = acl_ea_count(facl_sz);
+	h_sz = ext4_acl_size(facl_count);
+	if (facl_count < 0 || facl->a_version != ACL_EA_VERSION)
+		return EXT2_ET_INVALID_ARGUMENT;
+
+	err = ext2fs_get_mem(h_sz, &h);
+	if (err)
+		return err;
+
+	h->a_version = ext2fs_cpu_to_le32(EXT4_ACL_VERSION);
+	hptr = h + 1;
+	for (i = 0, a = facl->a_entries; i < facl_count; i++, a++) {
+		e = hptr;
+		e->e_tag = ext2fs_cpu_to_le16(a->e_tag);
+		e->e_perm = ext2fs_cpu_to_le16(a->e_perm);
+
+		switch (a->e_tag) {
+		case ACL_USER:
+		case ACL_GROUP:
+			e->e_id = ext2fs_cpu_to_le32(a->e_id);
+			hptr += sizeof(ext4_acl_entry);
+			break;
+		case ACL_USER_OBJ:
+		case ACL_GROUP_OBJ:
+		case ACL_MASK:
+		case ACL_OTHER:
+			hptr += sizeof(ext4_acl_entry_short);
+			break;
+		default:
+			err = EXT2_ET_INVALID_ARGUMENT;
+			goto out;
+		}
+	}
+
+	*eacl = h;
+	*eacl_sz = h_sz;
+	return err;
+out:
+	ext2fs_free_mem(&h);
+	return err;
+}
+
+static errcode_t ext4_to_fuse_acl(acl_ea_header **facl, size_t *facl_sz,
+				  ext4_acl_header *eacl, size_t eacl_sz)
+{
+	int i, eacl_count;
+	acl_ea_header *f;
+	ext4_acl_entry *e;
+	acl_ea_entry *a;
+	size_t f_sz;
+	void *hptr;
+	errcode_t err;
+
+	eacl_count = ext4_acl_count(eacl_sz);
+	f_sz = acl_ea_size(eacl_count);
+	if (eacl_count < 0 ||
+	    eacl->a_version != ext2fs_cpu_to_le32(EXT4_ACL_VERSION))
+		return EXT2_ET_INVALID_ARGUMENT;
+
+	err = ext2fs_get_mem(f_sz, &f);
+	if (err)
+		return err;
+
+	f->a_version = ACL_EA_VERSION;
+	hptr = eacl + 1;
+	for (i = 0, a = f->a_entries; i < eacl_count; i++, a++) {
+		e = hptr;
+		a->e_tag = ext2fs_le16_to_cpu(e->e_tag);
+		a->e_perm = ext2fs_le16_to_cpu(e->e_perm);
+
+		switch (a->e_tag) {
+		case ACL_USER:
+		case ACL_GROUP:
+			a->e_id = ext2fs_le32_to_cpu(e->e_id);
+			hptr += sizeof(ext4_acl_entry);
+			break;
+		case ACL_USER_OBJ:
+		case ACL_GROUP_OBJ:
+		case ACL_MASK:
+		case ACL_OTHER:
+			hptr += sizeof(ext4_acl_entry_short);
+			break;
+		default:
+			err = EXT2_ET_INVALID_ARGUMENT;
+			goto out;
+		}
+	}
+
+	*facl = f;
+	*facl_sz = f_sz;
+	return err;
+out:
+	ext2fs_free_mem(&f);
+	return err;
+}
+#endif /* TRANSLATE_LINUX_ACLS */
+
 /*
  * ext2_file_t contains a struct inode, so we can't leave files open.
  * Use this as a proxy instead.
@@ -2143,6 +2343,30 @@ static int op_statfs(const char *path, struct statvfs *buf)
 	return 0;
 }
 
+typedef errcode_t (*xattr_xlate_get)(void **cooked_buf, size_t *cooked_sz,
+				     const void *raw_buf, size_t raw_sz);
+typedef errcode_t (*xattr_xlate_set)(const void *cooked_buf, size_t cooked_sz,
+				     void **raw_buf, size_t *raw_sz);
+struct xattr_translate {
+	const char *prefix;
+	xattr_xlate_get get;
+	xattr_xlate_set set;
+};
+
+#define XATTR_TRANSLATOR(p, g, s) \
+	{.prefix = (p), \
+	 .get = (xattr_xlate_get)(g), \
+	 .set = (xattr_xlate_set)(s)}
+
+static struct xattr_translate xattr_translators[] = {
+#ifdef TRANSLATE_LINUX_ACLS
+	XATTR_TRANSLATOR(ACL_EA_ACCESS, ext4_to_fuse_acl, fuse_to_ext4_acl),
+	XATTR_TRANSLATOR(ACL_EA_DEFAULT, ext4_to_fuse_acl, fuse_to_ext4_acl),
+#endif
+	XATTR_TRANSLATOR(NULL, NULL, NULL),
+};
+#undef XATTR_TRANSLATOR
+
 static int op_getxattr(const char *path, const char *key, char *value,
 		       size_t len)
 {
@@ -2150,8 +2374,9 @@ static int op_getxattr(const char *path, const char *key, char *value,
 	struct fuse2fs *ff = (struct fuse2fs *)ctxt->private_data;
 	ext2_filsys fs;
 	struct ext2_xattr_handle *h;
-	void *ptr;
-	size_t plen;
+	struct xattr_translate *xt;
+	void *ptr, *cptr;
+	size_t plen, clen;
 	ext2_ino_t ino;
 	errcode_t err;
 	int ret = 0;
@@ -2194,6 +2419,17 @@ static int op_getxattr(const char *path, const char *key, char *value,
 		goto out2;
 	}
 
+	for (xt = xattr_translators; xt->prefix != NULL; xt++) {
+		if (strncmp(key, xt->prefix, strlen(xt->prefix)) == 0) {
+			err = xt->get(&cptr, &clen, ptr, plen);
+			if (err)
+				goto out3;
+			ext2fs_free_mem(&ptr);
+			ptr = cptr;
+			plen = clen;
+		}
+	}
+
 	if (!len) {
 		ret = plen;
 	} else if (len < plen) {
@@ -2203,6 +2439,7 @@ static int op_getxattr(const char *path, const char *key, char *value,
 		ret = plen;
 	}
 
+out3:
 	ext2fs_free_mem(&ptr);
 out2:
 	err = ext2fs_xattrs_close(&h);
@@ -2317,6 +2554,9 @@ static int op_setxattr(const char *path, const char *key, const char *value,
 	struct fuse2fs *ff = (struct fuse2fs *)ctxt->private_data;
 	ext2_filsys fs;
 	struct ext2_xattr_handle *h;
+	struct xattr_translate *xt;
+	void *cvalue;
+	size_t clen;
 	ext2_ino_t ino;
 	errcode_t err;
 	int ret = 0;
@@ -2356,19 +2596,32 @@ static int op_setxattr(const char *path, const char *key, const char *value,
 		goto out2;
 	}
 
-	err = ext2fs_xattr_set(h, key, value, len);
+	cvalue = (void *)value;
+	clen = len;
+	for (xt = xattr_translators; xt->prefix != NULL; xt++) {
+		if (strncmp(key, xt->prefix, strlen(xt->prefix)) == 0) {
+			err = xt->set(value, len, &cvalue, &clen);
+			if (err)
+				goto out3;
+		}
+	}
+
+	err = ext2fs_xattr_set(h, key, cvalue, clen);
 	if (err) {
 		ret = translate_error(fs, ino, err);
-		goto out2;
+		goto out3;
 	}
 
 	err = ext2fs_xattrs_write(h);
 	if (err) {
 		ret = translate_error(fs, ino, err);
-		goto out2;
+		goto out3;
 	}
 
 	ret = update_ctime(fs, ino, NULL);
+out3:
+	if (cvalue != value)
+		ext2fs_free_mem(&cvalue);
 out2:
 	err = ext2fs_xattrs_close(&h);
 	if (!ret && err)



* [PATCH 31/34] fuse2fs: handle 64-bit dates correctly
  2014-09-13 22:11 [PATCH 00/34] e2fsprogs Summer 2014 patchbomb, part 6 Darrick J. Wong
                   ` (28 preceding siblings ...)
  2014-09-13 22:14 ` [PATCH 30/34] fuse2fs: translate ACL structures Darrick J. Wong
@ 2014-09-13 22:14 ` Darrick J. Wong
  2014-09-13 22:14 ` [PATCH 32/34] fuse2fs: implement fallocate Darrick J. Wong
                   ` (3 subsequent siblings)
  33 siblings, 0 replies; 67+ messages in thread
From: Darrick J. Wong @ 2014-09-13 22:14 UTC (permalink / raw)
  To: tytso, darrick.wong; +Cc: linux-ext4

Fix fuse2fs' interpretation of 64-bit date quantities to match the
kernel.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
 misc/fuse2fs.c |   31 ++++++++++++++++++++++---------
 1 file changed, 22 insertions(+), 9 deletions(-)
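
Illustration only, not part of the patch: a rough sketch of the encoding
that the new helpers below implement, written against plain stdint types.
The names encode_extra(), decode_seconds() and decode_nsec() are
hypothetical, and EPOCH_BITS = 2 is an assumption taken from the kernel's
ext4 headers rather than from anything shown in this diff.  The casts rely
on the usual gcc/clang behaviour of wrapping when narrowing to int32_t.

```c
#include <assert.h>
#include <stdint.h>

#define EPOCH_BITS 2				/* assumed, see lead-in */
#define EPOCH_MASK ((1u << EPOCH_BITS) - 1)

/*
 * Build the "extra" word: the low EPOCH_BITS bits count how many times
 * 2^32 must be added to the sign-extended 32-bit seconds field, and the
 * remaining 30 bits hold the nanoseconds.
 */
static uint32_t encode_extra(int64_t sec, uint32_t nsec)
{
	int64_t low = (int32_t)(uint32_t)sec;	/* sign-extended low 32 bits */
	uint32_t epoch = (uint32_t)((sec - low) >> 32) & EPOCH_MASK;

	assert(nsec < 1000000000u);
	return epoch | (nsec << EPOCH_BITS);
}

/* Recover the full seconds value by adding, not OR-ing, the epoch bits. */
static int64_t decode_seconds(uint32_t raw_sec, uint32_t extra)
{
	int64_t sec = (int32_t)raw_sec;

	sec += (int64_t)(extra & EPOCH_MASK) << 32;
	return sec;
}

static uint32_t decode_nsec(uint32_t extra)
{
	return extra >> EPOCH_BITS;
}
```

Under these assumptions a timestamp of -1 (one second before 1970) encodes
with epoch bits 0, whereas taking (tv_sec >> 32) & mask directly, as the
removed lines did, would store all-ones epoch bits for the same value.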


diff --git a/misc/fuse2fs.c b/misc/fuse2fs.c
index 7c443b3..cadf93d 100644
--- a/misc/fuse2fs.c
+++ b/misc/fuse2fs.c
@@ -353,15 +353,24 @@ static int __translate_error(ext2_filsys fs, errcode_t err, ext2_ino_t ino,
 
 static inline __u32 ext4_encode_extra_time(const struct timespec *time)
 {
-	return (sizeof(time->tv_sec) > 4 ?
-		(time->tv_sec >> 32) & EXT4_EPOCH_MASK : 0) |
-	       ((time->tv_nsec << EXT4_EPOCH_BITS) & EXT4_NSEC_MASK);
+	__u32 extra = sizeof(time->tv_sec) > 4 ?
+			((time->tv_sec - (__s32)time->tv_sec) >> 32) &
+			EXT4_EPOCH_MASK : 0;
+	return extra | (time->tv_nsec << EXT4_EPOCH_BITS);
 }
 
 static inline void ext4_decode_extra_time(struct timespec *time, __u32 extra)
 {
-	if (sizeof(time->tv_sec) > 4)
-		time->tv_sec |= (__u64)((extra) & EXT4_EPOCH_MASK) << 32;
+	if (sizeof(time->tv_sec) > 4 && (extra & EXT4_EPOCH_MASK)) {
+		__u64 extra_bits = extra & EXT4_EPOCH_MASK;
+		/*
+		 * Prior to kernel 3.14?, we had a broken decode function,
+		 * wherein we effectively did this:
+		 * if (extra_bits == 3)
+		 *     extra_bits = 0;
+		 */
+		time->tv_sec += extra_bits << 32;
+	}
 	time->tv_nsec = ((extra) & EXT4_NSEC_MASK) >> EXT4_EPOCH_BITS;
 }
 
@@ -387,7 +396,7 @@ do {									       \
 	(timespec)->tv_sec = (signed)((raw_inode)->xtime);		       \
 	if (EXT4_FITS_IN_INODE(raw_inode, xtime ## _extra))		       \
 		ext4_decode_extra_time((timespec),			       \
-				       raw_inode->xtime ## _extra);	       \
+				       (raw_inode)->xtime ## _extra);	       \
 	else								       \
 		(timespec)->tv_nsec = 0;				       \
 } while (0)
@@ -749,6 +758,7 @@ static int stat_inode(ext2_filsys fs, ext2_ino_t ino, struct stat *statbuf)
 	dev_t fakedev = 0;
 	errcode_t err;
 	int ret = 0;
+	struct timespec tv;
 
 	memset(&inode, 0, sizeof(inode));
 	err = ext2fs_read_inode_full(fs, ino, (struct ext2_inode *)&inode,
@@ -766,9 +776,12 @@ static int stat_inode(ext2_filsys fs, ext2_ino_t ino, struct stat *statbuf)
 	statbuf->st_size = EXT2_I_SIZE(&inode);
 	statbuf->st_blksize = fs->blocksize;
 	statbuf->st_blocks = blocks_from_inode(fs, &inode);
-	statbuf->st_atime = inode.i_atime;
-	statbuf->st_mtime = inode.i_mtime;
-	statbuf->st_ctime = inode.i_ctime;
+	EXT4_INODE_GET_XTIME(i_atime, &tv, &inode);
+	statbuf->st_atime = tv.tv_sec;
+	EXT4_INODE_GET_XTIME(i_mtime, &tv, &inode);
+	statbuf->st_mtime = tv.tv_sec;
+	EXT4_INODE_GET_XTIME(i_ctime, &tv, &inode);
+	statbuf->st_ctime = tv.tv_sec;
 	if (LINUX_S_ISCHR(inode.i_mode) ||
 	    LINUX_S_ISBLK(inode.i_mode)) {
 		if (inode.i_block[0])



* [PATCH 32/34] fuse2fs: implement fallocate
  2014-09-13 22:11 [PATCH 00/34] e2fsprogs Summer 2014 patchbomb, part 6 Darrick J. Wong
                   ` (29 preceding siblings ...)
  2014-09-13 22:14 ` [PATCH 31/34] fuse2fs: handle 64-bit dates correctly Darrick J. Wong
@ 2014-09-13 22:14 ` Darrick J. Wong
  2014-09-13 22:15 ` [PATCH 34/34] tests: enable using fuse2fs with metadata checksum test Darrick J. Wong
                   ` (2 subsequent siblings)
  33 siblings, 0 replies; 67+ messages in thread
From: Darrick J. Wong @ 2014-09-13 22:14 UTC (permalink / raw)
  To: tytso, darrick.wong; +Cc: linux-ext4

Use the (new) ext2fs_fallocate() to fallocate file space.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
 misc/fuse2fs.c |   58 +++++++++++++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 57 insertions(+), 1 deletion(-)
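
Illustration only, not part of the patch: a small sketch of how the byte
range passed to fallocate maps onto the block range handed to
ext2fs_fallocate() in the diff below (start, end - start + 1).  The
blk_range struct and bytes_to_blocks() are hypothetical names used only
for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper type: first block touched and number of blocks. */
struct blk_range {
	uint64_t start;
	uint64_t count;
};

/*
 * Convert a byte range [offset, offset + len) into the inclusive range of
 * filesystem blocks it touches, using the same arithmetic as the patch:
 * start = offset / blocksize, end = (offset + len - 1) / blocksize.
 */
static struct blk_range bytes_to_blocks(uint64_t offset, uint64_t len,
					uint32_t blocksize)
{
	struct blk_range r;
	uint64_t end;

	assert(len > 0 && blocksize > 0);
	r.start = offset / blocksize;
	end = (offset + len - 1) / blocksize;
	r.count = end - r.start + 1;
	return r;
}
```

For example, with 1024-byte blocks a 100-byte request at offset 1000
touches two blocks (0 and 1), even though len / blocksize, the quantity
the helper passes to fs_can_allocate(), rounds down to zero.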


diff --git a/misc/fuse2fs.c b/misc/fuse2fs.c
index cadf93d..9fba411 100644
--- a/misc/fuse2fs.c
+++ b/misc/fuse2fs.c
@@ -3300,7 +3300,63 @@ out:
 static int fallocate_helper(struct fuse_file_info *fp, int mode, off_t offset,
 			    off_t len)
 {
-	return -EOPNOTSUPP;
+	struct fuse_context *ctxt = fuse_get_context();
+	struct fuse2fs *ff = (struct fuse2fs *)ctxt->private_data;
+	struct fuse2fs_file_handle *fh = (struct fuse2fs_file_handle *)fp->fh;
+	ext2_filsys fs;
+	struct ext2_inode_large inode;
+	blk64_t start, end;
+	__u64 fsize;
+	errcode_t err;
+	int flags;
+
+	FUSE2FS_CHECK_CONTEXT(ff);
+	fs = ff->fs;
+	FUSE2FS_CHECK_MAGIC(fs, fh, FUSE2FS_FILE_MAGIC);
+	start = offset / fs->blocksize;
+	end = (offset + len - 1) / fs->blocksize;
+	dbg_printf("%s: ino=%d mode=0x%x start=%jd end=%llu\n", __func__,
+		   fh->ino, mode, offset / fs->blocksize, end);
+	if (!fs_can_allocate(ff, len / fs->blocksize))
+		return -ENOSPC;
+
+	memset(&inode, 0, sizeof(inode));
+	err = ext2fs_read_inode_full(fs, fh->ino, (struct ext2_inode *)&inode,
+				     sizeof(inode));
+	if (err)
+		return translate_error(fs, fh->ino, err);
+	fsize = EXT2_I_SIZE(&inode);
+
+	/* Allocate a bunch of blocks */
+	flags = (mode & FL_KEEP_SIZE_FLAG ? 0 :
+			EXT2_FALLOCATE_INIT_BEYOND_EOF);
+	err = ext2fs_fallocate(fs, flags, fh->ino,
+			       (struct ext2_inode *)&inode,
+			       ~0ULL, start, end - start + 1);
+	if (err && err != EXT2_ET_BLOCK_ALLOC_FAIL)
+		return translate_error(fs, fh->ino, err);
+
+	/* Update i_size */
+	if (!(mode & FL_KEEP_SIZE_FLAG)) {
+		if (offset + len > fsize) {
+			err = ext2fs_inode_size_set(fs,
+						(struct ext2_inode *)&inode,
+						offset + len);
+			if (err)
+				return translate_error(fs, fh->ino, err);
+		}
+	}
+
+	err = update_mtime(fs, fh->ino, &inode);
+	if (err)
+		return err;
+
+	err = ext2fs_write_inode_full(fs, fh->ino, (struct ext2_inode *)&inode,
+				      sizeof(inode));
+	if (err)
+		return translate_error(fs, fh->ino, err);
+
+	return err;
 }
 
 static errcode_t clean_block_middle(ext2_filsys fs, ext2_ino_t ino,



* [PATCH 34/34] tests: enable using fuse2fs with metadata checksum test
  2014-09-13 22:11 [PATCH 00/34] e2fsprogs Summer 2014 patchbomb, part 6 Darrick J. Wong
                   ` (30 preceding siblings ...)
  2014-09-13 22:14 ` [PATCH 32/34] fuse2fs: implement fallocate Darrick J. Wong
@ 2014-09-13 22:15 ` Darrick J. Wong
  2014-09-14 17:19 ` [PATCH 35/34] e2fsck: free bh when descriptor block checksum fails Darrick J. Wong
  2014-09-18 19:09 ` [PATCH 36/34] misc: fix Coverity complaints Darrick J. Wong
  33 siblings, 0 replies; 67+ messages in thread
From: Darrick J. Wong @ 2014-09-13 22:15 UTC (permalink / raw)
  To: tytso, darrick.wong; +Cc: linux-ext4

Create custom mount/umount commands so that we can run the metadata
checksumming tests against fuse2fs.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
 tests/fuse2fs/mount  |   28 ++++++++++++++++++++++++++++
 tests/fuse2fs/umount |   21 +++++++++++++++++++++
 2 files changed, 49 insertions(+)
 create mode 100755 tests/fuse2fs/mount
 create mode 100755 tests/fuse2fs/umount


diff --git a/tests/fuse2fs/mount b/tests/fuse2fs/mount
new file mode 100755
index 0000000..321b1f5
--- /dev/null
+++ b/tests/fuse2fs/mount
@@ -0,0 +1,28 @@
+#!/bin/bash
+
+# Mount ext4 via fuse.  Put tests/fuse2fs/ at the start of PATH if you want
+# to run the metadata checksumming tests with fuse2fs.
+
+for arg in "$@"; do
+	if [ -b "${arg}" ]; then
+		DEV="${arg}"
+	elif [ -d "${arg}" ]; then
+		MNT="${arg}"
+	fi
+done
+
+if [ -z "${DEV}" -o -z "${MNT}" ]; then
+	echo "Please specify a device and a mountpoint."; exit 1
+fi
+
+DIR="$(readlink -f "$(dirname "$0")")"
+if [ -n "${FUSE2FS_DEBUG}" ]; then
+	"${DIR}/../../misc/fuse2fs" "${DEV}" "${MNT}" -d >> "${FUSE2FS_DEBUG}" 2>&1 &
+	sleep 1
+	exit 0
+else
+	"${DIR}/../../misc/fuse2fs" "${DEV}" "${MNT}"
+	ERR=$?
+	sleep 1
+	exit "${ERR}"
+fi
diff --git a/tests/fuse2fs/umount b/tests/fuse2fs/umount
new file mode 100755
index 0000000..b21ee5a
--- /dev/null
+++ b/tests/fuse2fs/umount
@@ -0,0 +1,21 @@
+#!/bin/bash
+
+# unmount a filesystem
+sync
+sync
+sync
+
+sleep 2
+if [ -x /bin/umount ]; then
+	/bin/umount "$@"
+	ERR=$?
+elif [ -x /sbin/umount ]; then
+	/sbin/umount "$@"
+	ERR=$?
+else
+	echo "Where is umount?"
+	exit 5
+fi
+sleep 1
+
+exit "${ERR}"



* [PATCH 35/34] e2fsck: free bh when descriptor block checksum fails
  2014-09-13 22:11 [PATCH 00/34] e2fsprogs Summer 2014 patchbomb, part 6 Darrick J. Wong
                   ` (31 preceding siblings ...)
  2014-09-13 22:15 ` [PATCH 34/34] tests: enable using fuse2fs with metadata checksum test Darrick J. Wong
@ 2014-09-14 17:19 ` Darrick J. Wong
  2014-09-14 19:11   ` Eric Sandeen
  2014-09-18 19:09 ` [PATCH 36/34] misc: fix Coverity complaints Darrick J. Wong
  33 siblings, 1 reply; 67+ messages in thread
From: Darrick J. Wong @ 2014-09-14 17:19 UTC (permalink / raw)
  To: tytso; +Cc: linux-ext4, Eric Sandeen

Free the buffer head if the journal descriptor block fails checksum
verification.  This has been patched before (see "e2fsck: free bh on
csum verify error in do_one_pass") but apparently the patch was never
committed to jbd2 in the kernel, so when we resync'd the recovery code
with 3.16, the bug came back.  Sigh.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Cc: Eric Sandeen <sandeen@redhat.com>
---
 e2fsck/recovery.c |    1 +
 1 file changed, 1 insertion(+)
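
Illustration only, not part of the patch: a toy model of the leak being
fixed, assuming a reference-counted buffer.  Nothing here is the jbd2 or
e2fsck API; struct buf, buf_get(), buf_release(), process_block() and the
live_buffers counter are all hypothetical and exist only to show why the
release has to happen before the jump to the failure label.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical reference-counted buffer. */
struct buf {
	int refcount;
	int checksum_ok;
};

static int live_buffers;	/* number of buffers not yet freed */

static struct buf *buf_get(int checksum_ok)
{
	struct buf *b = malloc(sizeof(*b));

	if (!b)
		return NULL;
	b->refcount = 1;
	b->checksum_ok = checksum_ok;
	live_buffers++;
	return b;
}

static void buf_release(struct buf *b)
{
	assert(b->refcount > 0);
	if (--b->refcount == 0) {
		free(b);
		live_buffers--;
	}
}

/* Same shape as the patched code: drop the reference, then bail out. */
static int process_block(int checksum_ok)
{
	struct buf *b = buf_get(checksum_ok);
	int err = 0;

	if (!b)
		return -ENOMEM;
	if (!b->checksum_ok) {
		err = -EIO;
		buf_release(b);	/* omitted, the buffer leaks on this path */
		goto failed;
	}
	/* ... normal processing would go here ... */
	buf_release(b);
failed:
	return err;
}
```

In this model, calling process_block(0) returns -EIO and leaves
live_buffers at zero; dropping the buf_release() call on the error path
would leave one buffer allocated for every bad descriptor block.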

diff --git a/e2fsck/recovery.c b/e2fsck/recovery.c
index 3dc7c06..b5ce3b3 100644
--- a/e2fsck/recovery.c
+++ b/e2fsck/recovery.c
@@ -525,6 +525,7 @@ static int do_one_pass(journal_t *journal,
 			    !jbd2_descr_block_csum_verify(journal,
 							  bh->b_data)) {
 				err = -EIO;
+				brelse(bh);
 				goto failed;
 			}
 


* RE: [PATCH 19/34] resize2fs: convert fs to and from 64bit mode
  2014-09-13 22:13 ` [PATCH 19/34] resize2fs: convert fs to and from 64bit mode Darrick J. Wong
@ 2014-09-14 17:34   ` TR Reardon
  2014-09-14 17:50     ` Darrick J. Wong
  0 siblings, 1 reply; 67+ messages in thread
From: TR Reardon @ 2014-09-14 17:34 UTC (permalink / raw)
  To: Darrick J. Wong, tytso; +Cc: linux-ext4

> Subject: [PATCH 19/34] resize2fs: convert fs to and from 64bit mode
> From: darrick.wong@oracle.com
> To: tytso@mit.edu; darrick.wong@oracle.com
> CC: linux-ext4@vger.kernel.org
> Date: Sat, 13 Sep 2014 15:13:18 -0700
>
> resize2fs does its magic by loading a filesystem, duplicating the
> in-memory image of that fs, moving relevant blocks out of the way of
> whatever new metadata get created, and finally writing everything back
> out to disk. Enabling 64bit mode enlarges the group descriptors,
> which makes resize2fs a reasonable vehicle for taking care of the rest
> of the bookkeeping requirements, so add to resize2fs the ability to
> convert a filesystem to 64bit mode and back.
>
> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> ---
> resize/main.c | 40 ++++++
> resize/resize2fs.8.in | 18 +++
> resize/resize2fs.c | 326 ++++++++++++++++++++++++++++++++++++++++++++++++-
> resize/resize2fs.h | 3
> 4 files changed, 379 insertions(+), 8 deletions(-)
>
>
> diff --git a/resize/main.c b/resize/main.c
> index c107028..9fea3d8 100644
> --- a/resize/main.c
> +++ b/resize/main.c
> @@ -42,7 +42,7 @@ static char *device_name, *io_options;
> static void usage (char *prog)
> {
> fprintf (stderr, _("Usage: %s [-d debug_flags] [-f] [-F] [-M] [-P] "
> - "[-p] device [new_size]\n\n"), prog);
> + "[-p] device [-b|-s|new_size]\n\n"), prog);
>
> exit (1);
> }
> @@ -200,7 +200,7 @@ int main (int argc, char ** argv)
> if (argc && *argv)
> program_name = *argv;
>
> - while ((c = getopt (argc, argv, "d:fFhMPpS:")) != EOF) {
> + while ((c = getopt(argc, argv, "d:fFhMPpS:bs")) != EOF) {
> switch (c) {
> case 'h':
> usage(program_name);
> @@ -226,6 +226,12 @@ int main (int argc, char ** argv)
> case 'S':
> use_stride = atoi(optarg);
> break;
> + case 'b':
> + flags |= RESIZE_ENABLE_64BIT;
> + break;
> + case 's':
> + flags |= RESIZE_DISABLE_64BIT;
> + break;
> default:
> usage(program_name);
> }
> @@ -389,6 +395,10 @@ int main (int argc, char ** argv)
> if (sys_page_size> fs->blocksize)
> new_size &= ~((sys_page_size / fs->blocksize)-1);
> }
> + /* If changing 64bit, don't change the filesystem size. */
> + if (flags & (RESIZE_DISABLE_64BIT | RESIZE_ENABLE_64BIT)) {
> + new_size = ext2fs_blocks_count(fs->super);
> + }
> if (!EXT2_HAS_INCOMPAT_FEATURE(fs->super,
> EXT4_FEATURE_INCOMPAT_64BIT)) {
> /* Take 16T down to 2^32-1 blocks */
> @@ -440,7 +450,31 @@ int main (int argc, char ** argv)
> fs->blocksize / 1024, new_size);
> exit(1);
> }
> - if (new_size == ext2fs_blocks_count(fs->super)) {
> + if ((flags & RESIZE_DISABLE_64BIT) && (flags & RESIZE_ENABLE_64BIT)) {
> + fprintf(stderr, _("Cannot set and unset 64bit feature.\n"));
> + exit(1);
> + } else if (flags & (RESIZE_DISABLE_64BIT | RESIZE_ENABLE_64BIT)) {
> + new_size = ext2fs_blocks_count(fs->super);


Redundant to assignment just above?


* Re: [PATCH 19/34] resize2fs: convert fs to and from 64bit mode
  2014-09-14 17:34   ` TR Reardon
@ 2014-09-14 17:50     ` Darrick J. Wong
  0 siblings, 0 replies; 67+ messages in thread
From: Darrick J. Wong @ 2014-09-14 17:50 UTC (permalink / raw)
  To: TR Reardon; +Cc: tytso, linux-ext4

On Sun, Sep 14, 2014 at 01:34:14PM -0400, TR Reardon wrote:
> > Subject: [PATCH 19/34] resize2fs: convert fs to and from 64bit mode
> > From: darrick.wong@oracle.com
> > To: tytso@mit.edu; darrick.wong@oracle.com
> > CC: linux-ext4@vger.kernel.org
> > Date: Sat, 13 Sep 2014 15:13:18 -0700
> >
> > resize2fs does its magic by loading a filesystem, duplicating the
> > in-memory image of that fs, moving relevant blocks out of the way of
> > whatever new metadata get created, and finally writing everything back
> > out to disk. Enabling 64bit mode enlarges the group descriptors,
> > which makes resize2fs a reasonable vehicle for taking care of the rest
> > of the bookkeeping requirements, so add to resize2fs the ability to
> > convert a filesystem to 64bit mode and back.
> >
> > Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> > ---
> > resize/main.c | 40 ++++++
> > resize/resize2fs.8.in | 18 +++
> > resize/resize2fs.c | 326 ++++++++++++++++++++++++++++++++++++++++++++++++-
> > resize/resize2fs.h | 3
> > 4 files changed, 379 insertions(+), 8 deletions(-)
> >
> >
> > diff --git a/resize/main.c b/resize/main.c
> > index c107028..9fea3d8 100644
> > --- a/resize/main.c
> > +++ b/resize/main.c
> > @@ -42,7 +42,7 @@ static char *device_name, *io_options;
> > static void usage (char *prog)
> > {
> > fprintf (stderr, _("Usage: %s [-d debug_flags] [-f] [-F] [-M] [-P] "
> > - "[-p] device [new_size]\n\n"), prog);
> > + "[-p] device [-b|-s|new_size]\n\n"), prog);
> >
> > exit (1);
> > }
> > @@ -200,7 +200,7 @@ int main (int argc, char ** argv)
> > if (argc && *argv)
> > program_name = *argv;
> >
> > - while ((c = getopt (argc, argv, "d:fFhMPpS:")) != EOF) {
> > + while ((c = getopt(argc, argv, "d:fFhMPpS:bs")) != EOF) {
> > switch (c) {
> > case 'h':
> > usage(program_name);
> > @@ -226,6 +226,12 @@ int main (int argc, char ** argv)
> > case 'S':
> > use_stride = atoi(optarg);
> > break;
> > + case 'b':
> > + flags |= RESIZE_ENABLE_64BIT;
> > + break;
> > + case 's':
> > + flags |= RESIZE_DISABLE_64BIT;
> > + break;
> > default:
> > usage(program_name);
> > }
> > @@ -389,6 +395,10 @@ int main (int argc, char ** argv)
> > if (sys_page_size> fs->blocksize)
> > new_size &= ~((sys_page_size / fs->blocksize)-1);
> > }
> > + /* If changing 64bit, don't change the filesystem size. */
> > + if (flags & (RESIZE_DISABLE_64BIT | RESIZE_ENABLE_64BIT)) {
> > + new_size = ext2fs_blocks_count(fs->super);
> > + }
> > if (!EXT2_HAS_INCOMPAT_FEATURE(fs->super,
> > EXT4_FEATURE_INCOMPAT_64BIT)) {
> > /* Take 16T down to 2^32-1 blocks */
> > @@ -440,7 +450,31 @@ int main (int argc, char ** argv)
> > fs->blocksize / 1024, new_size);
> > exit(1);
> > }
> > - if (new_size == ext2fs_blocks_count(fs->super)) {
> > + if ((flags & RESIZE_DISABLE_64BIT) && (flags & RESIZE_ENABLE_64BIT)) {
> > + fprintf(stderr, _("Cannot set and unset 64bit feature.\n"));
> > + exit(1);
> > + } else if (flags & (RESIZE_DISABLE_64BIT | RESIZE_ENABLE_64BIT)) {
> > + new_size = ext2fs_blocks_count(fs->super);
> 
> 
> Redundant to assignment just above?

Yeah, I think so.  I think the original purpose was to reset new_size after the
size checks, but at the moment the only way that happens is if we've a 32bit FS
with exactly 2^32 blocks, which shouldn't be possible.

--D

> To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [PATCH 35/34] e2fsck: free bh when descriptor block checksum fails
  2014-09-14 17:19 ` [PATCH 35/34] e2fsck: free bh when descriptor block checksum fails Darrick J. Wong
@ 2014-09-14 19:11   ` Eric Sandeen
  2014-09-19  1:46     ` Theodore Ts'o
  0 siblings, 1 reply; 67+ messages in thread
From: Eric Sandeen @ 2014-09-14 19:11 UTC (permalink / raw)
  To: Darrick J. Wong, tytso; +Cc: linux-ext4

On 9/14/14 12:19 PM, Darrick J. Wong wrote:
> Free the buffer head if the journal descriptor block fails checksum
> verification.  This has been patched before (see "e2fsck: free bh on
> csum verify error in do_one_pass") but apparently the patch was never
> committed to jbd2 in the kernel, so when we resync'd the recovery code
> with 3.16, the bug came back.  Sigh.

Cool. 

Reviewed-by: Eric Sandeen <sandeen@redhat.com>

thanks for fixing up the kernel side... I hadn't thought about that,
obviously.  :(

-Eric

> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> Cc: Eric Sandeen <sandeen@redhat.com>
> ---
>  e2fsck/recovery.c |    1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/e2fsck/recovery.c b/e2fsck/recovery.c
> index 3dc7c06..b5ce3b3 100644
> --- a/e2fsck/recovery.c
> +++ b/e2fsck/recovery.c
> @@ -525,6 +525,7 @@ static int do_one_pass(journal_t *journal,
>  			    !jbd2_descr_block_csum_verify(journal,
>  							  bh->b_data)) {
>  				err = -EIO;
> +				brelse(bh);
>  				goto failed;
>  			}
>  
> 


^ permalink raw reply	[flat|nested] 67+ messages in thread

* [PATCH 36/34] misc: fix Coverity complaints
  2014-09-13 22:11 [PATCH 00/34] e2fsprogs Summer 2014 patchbomb, part 6 Darrick J. Wong
                   ` (32 preceding siblings ...)
  2014-09-14 17:19 ` [PATCH 35/34] e2fsck: free bh when descriptor block checksum fails Darrick J. Wong
@ 2014-09-18 19:09 ` Darrick J. Wong
  2014-09-19  1:47   ` Theodore Ts'o
  33 siblings, 1 reply; 67+ messages in thread
From: Darrick J. Wong @ 2014-09-18 19:09 UTC (permalink / raw)
  To: tytso; +Cc: linux-ext4

Fix a few problems that Coverity picked up with error handling.

Fixes-Coverity-Bug: 1239278
Fixes-Coverity-Bug: 1239279
Fixes-Coverity-Bug: 1239280
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
 debugfs/journal.c |   28 +++++++++++++++-------------
 debugfs/util.c    |   22 ++++++++++++++++------
 2 files changed, 31 insertions(+), 19 deletions(-)

diff --git a/debugfs/journal.c b/debugfs/journal.c
index 99547c9..82d5bca 100644
--- a/debugfs/journal.c
+++ b/debugfs/journal.c
@@ -657,10 +657,10 @@ void ext2fs_journal_release(ext2_filsys fs, journal_t *journal,
 	}
 	brelse(journal->j_sb_buffer);
 
-	if (fs->journal_io) {
-		if (fs && fs->io != fs->journal_io)
+	if (fs && fs->journal_io) {
+		if (fs->io != fs->journal_io)
 			io_channel_close(fs->journal_io);
-		fs->journal_io = 0;
+		fs->journal_io = NULL;
 	}
 
 #ifndef USE_INODE_IO
@@ -682,7 +682,6 @@ errcode_t ext2fs_check_ext3_journal(ext2_filsys fs)
 	journal_t *journal;
 	int recover = fs->super->s_feature_incompat &
 		EXT3_FEATURE_INCOMPAT_RECOVER;
-	int reset = 0;
 	errcode_t retval;
 
 	/* If we don't have any journal features, don't do anything more */
@@ -696,23 +695,25 @@ errcode_t ext2fs_check_ext3_journal(ext2_filsys fs)
 		return retval;
 
 	retval = ext2fs_journal_load(journal);
-	if (retval) {
-		ext2fs_journal_release(fs, journal, 0, 1);
-		return retval;
-	}
+	if (retval)
+		goto err;
 
 	/*
 	 * We want to make the flags consistent here.  We will not leave with
 	 * needs_recovery set but has_journal clear.  We can't get in a loop
 	 * with -y, -n, or -p, only if a user isn't making up their mind.
 	 */
-	if (!(sb->s_feature_compat & EXT3_FEATURE_COMPAT_HAS_JOURNAL))
-		return EXT2_ET_JOURNAL_FLAGS_WRONG;
+	if (!(sb->s_feature_compat & EXT3_FEATURE_COMPAT_HAS_JOURNAL)) {
+		retval = EXT2_ET_JOURNAL_FLAGS_WRONG;
+		goto err;
+	}
 
 	if (sb->s_feature_compat & EXT3_FEATURE_COMPAT_HAS_JOURNAL &&
 	    !(sb->s_feature_incompat & EXT3_FEATURE_INCOMPAT_RECOVER) &&
-	    journal->j_superblock->s_start != 0)
-		return EXT2_ET_JOURNAL_FLAGS_WRONG;
+	    journal->j_superblock->s_start != 0) {
+		retval = EXT2_ET_JOURNAL_FLAGS_WRONG;
+		goto err;
+	}
 
 	/*
 	 * If we don't need to do replay the journal, check to see if
@@ -728,7 +729,8 @@ errcode_t ext2fs_check_ext3_journal(ext2_filsys fs)
 		mark_buffer_dirty(journal->j_sb_buffer);
 	}
 
-	ext2fs_journal_release(fs, journal, reset, 0);
+err:
+	ext2fs_journal_release(fs, journal, 0, retval ? 1 : 0);
 	return retval;
 }
 
diff --git a/debugfs/util.c b/debugfs/util.c
index 54fcdc4..af14539 100644
--- a/debugfs/util.c
+++ b/debugfs/util.c
@@ -503,6 +503,7 @@ errcode_t read_list(const char *str, blk64_t **list, size_t *len)
 	blk64_t *lst = *list;
 	size_t ln = *len;
 	char *tok, *p = optarg;
+	errcode_t retval;
 
 	while ((tok = strtok(p, ","))) {
 		blk64_t *l;
@@ -517,13 +518,19 @@ errcode_t read_list(const char *str, blk64_t **list, size_t *len)
 			y = strtoull(e + 1, NULL, 0);
 			if (errno)
 				return errno;
-		} else if (*e != 0)
-			return EINVAL;
-		if (y < x)
-			return EINVAL;
+		} else if (*e != 0) {
+			retval = EINVAL;
+			goto err;
+		}
+		if (y < x) {
+			retval = EINVAL;
+			goto err;
+		}
 		l = realloc(lst, sizeof(blk64_t) * (ln + y - x + 1));
-		if (l == NULL)
-			return ENOMEM;
+		if (l == NULL) {
+			retval = ENOMEM;
+			goto err;
+		}
 		lst = l;
 		for (; x <= y; x++)
 			lst[ln++] = x;
@@ -533,4 +540,7 @@ errcode_t read_list(const char *str, blk64_t **list, size_t *len)
 	*list = lst;
 	*len = ln;
 	return 0;
+err:
+	free(lst);
+	return retval;
 }
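
For readers skimming the thread: the read_list() hunk above moves every
failure path to a single exit label so the partially built list is freed
exactly once.  The same shape can be sketched in a standalone form.  This
is only an illustration, not the debugfs code; parse_list() and its
behaviour are hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper (not from e2fsprogs): parse "1,2,0x10" into a newly
 * allocated array.  Every failure jumps to one label that frees whatever
 * was allocated so far; *out and *len are written only on success.
 */
static int parse_list(const char *str, unsigned long long **out, size_t *len)
{
	unsigned long long *lst = NULL, *l, x;
	size_t ln = 0, n = strlen(str) + 1;
	char *copy, *tok, *p, *end;
	int retval;

	assert(out != NULL && len != NULL);

	/* strtok() modifies its argument, so work on a private copy. */
	copy = malloc(n);
	if (copy == NULL)
		return ENOMEM;
	memcpy(copy, str, n);

	for (p = copy; (tok = strtok(p, ",")) != NULL; p = NULL) {
		errno = 0;
		x = strtoull(tok, &end, 0);
		if (errno) {
			retval = errno;
			goto err;
		}
		if (end == tok || *end != '\0') {
			retval = EINVAL;
			goto err;
		}
		l = realloc(lst, sizeof(*lst) * (ln + 1));
		if (l == NULL) {
			retval = ENOMEM;
			goto err;
		}
		lst = l;
		lst[ln++] = x;
	}

	free(copy);
	*out = lst;
	*len = ln;
	return 0;
err:
	free(lst);
	free(copy);
	return retval;
}
```

Because the outputs are untouched on failure, a caller can test the return
value and never has to guess whether it owns a half-filled array.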

^ permalink raw reply related	[flat|nested] 67+ messages in thread

* Re: [PATCH 01/34] e2fsck: offer to clear overlapping extents
  2014-09-13 22:11 ` [PATCH 01/34] e2fsck: offer to clear overlapping extents Darrick J. Wong
@ 2014-09-19  1:45   ` Theodore Ts'o
  0 siblings, 0 replies; 67+ messages in thread
From: Theodore Ts'o @ 2014-09-19  1:45 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: linux-ext4

On Sat, Sep 13, 2014 at 03:11:19PM -0700, Darrick J. Wong wrote:
> If in the course of iterating extents we find that an otherwise
> valid-seeming second extent maps the same logical blocks as a
> previously examined first extent, offer to clear the duplicate
> mapping.
> 
> The test for this is already in f_extents.
> 
> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>

Applied, thanks.

					- Ted

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [PATCH 02/34] e2fsck: fix sliding the directory block down on bigalloc
  2014-09-13 22:11 ` [PATCH 02/34] e2fsck: fix sliding the directory block down on bigalloc Darrick J. Wong
@ 2014-09-19  1:45   ` Theodore Ts'o
  0 siblings, 0 replies; 67+ messages in thread
From: Theodore Ts'o @ 2014-09-19  1:45 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: linux-ext4

On Sat, Sep 13, 2014 at 03:11:25PM -0700, Darrick J. Wong wrote:
> If we find a hole in a directory on a bigalloc filesystem, we need to
> obey the cluster alignment rules when collapsing the gap to avoid
> later complaints.
> 
> Specifically, the calculation of the new logical cluster number was
> incorrect, and we need to ensure that the logical cluster alignment
> respects the physical cluster alignment, since we've concluded that
> the extent's logical block number is wrong.
> 
> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>

Applied, thanks.

					- Ted
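
The alignment rule in the commit message can be illustrated with a little
arithmetic.  This is a sketch of the invariant only (a block's offset
within its logical cluster should match its offset within its physical
cluster), not the code that was applied; align_lblk_to_pblk() and its
round-up policy are assumptions made for the example.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: return the smallest logical block >= want_lblk
 * whose offset inside its cluster equals pblk's offset inside its
 * cluster.  cluster_ratio is blocks per cluster and must be a power of 2.
 */
static uint64_t align_lblk_to_pblk(uint64_t want_lblk, uint64_t pblk,
				   uint32_t cluster_ratio)
{
	uint64_t mask = (uint64_t)cluster_ratio - 1;
	uint64_t lblk;

	assert(cluster_ratio != 0 && (cluster_ratio & mask) == 0);

	/* Keep want_lblk's cluster, borrow pblk's in-cluster offset. */
	lblk = (want_lblk & ~mask) | (pblk & mask);
	if (lblk < want_lblk)
		lblk += cluster_ratio;	/* move up to the next cluster */
	return lblk;
}
```

With 16 blocks per cluster, a physical block at offset 3 within its
cluster can only be mapped at logical blocks 3, 19, 35, and so on.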

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [PATCH 03/34] misc: zero s_jnl_blocks when adding journal online or removing external journal
  2014-09-13 22:11 ` [PATCH 03/34] misc: zero s_jnl_blocks when adding journal online or removing external journal Darrick J. Wong
@ 2014-09-19  1:45   ` Theodore Ts'o
  0 siblings, 0 replies; 67+ messages in thread
From: Theodore Ts'o @ 2014-09-19  1:45 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: linux-ext4, thomas_reardon

On Sat, Sep 13, 2014 at 03:11:32PM -0700, Darrick J. Wong wrote:
> Erase s_jnl_blocks when removing an external journal, or adding an
> internal journal online.  We can't add the backup for the internal
> journal because we have no good way to get the indirect block or ETB
> addresses, so the best we can do is hope that the user runs e2fsck,
> which will correct that.  We are motivated to erase during external
> journal removal to state emphatically that there's no journal.
> 
> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> Reported-by: thomas_reardon@hotmail.com

Applied, thanks.

					- Ted

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [PATCH 35/34] e2fsck: free bh when descriptor block checksum fails
  2014-09-14 19:11   ` Eric Sandeen
@ 2014-09-19  1:46     ` Theodore Ts'o
  0 siblings, 0 replies; 67+ messages in thread
From: Theodore Ts'o @ 2014-09-19  1:46 UTC (permalink / raw)
  To: Eric Sandeen; +Cc: Darrick J. Wong, linux-ext4

On Sun, Sep 14, 2014 at 02:11:25PM -0500, Eric Sandeen wrote:
> On 9/14/14 12:19 PM, Darrick J. Wong wrote:
> > Free the buffer head if the journal descriptor block fails checksum
> > verification.  This has been patched before (see "e2fsck: free bh on
> > csum verify error in do_one_pass") but apparently the patch was never
> > committed to jbd2 in the kernel, so when we resync'd the recovery code
> > with 3.16, the bug came back.  Sigh.
> 
> Cool. 
> 
> Reviewed-by: Eric Sandeen <sandeen@redhat.com>
> 
> thanks for fixing up the kernel side... I hadn't thought about that,
> obviously.  :(

Applied, thanks.

					- Ted

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [PATCH 36/34] misc: fix Coverity complaints
  2014-09-18 19:09 ` [PATCH 36/34] misc: fix Coverity complaints Darrick J. Wong
@ 2014-09-19  1:47   ` Theodore Ts'o
  0 siblings, 0 replies; 67+ messages in thread
From: Theodore Ts'o @ 2014-09-19  1:47 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: linux-ext4

On Thu, Sep 18, 2014 at 12:09:02PM -0700, Darrick J. Wong wrote:
> Fix a few problems that Coverity picked up with error handling.
> 
> Fixes-Coverity-Bug: 1239278
> Fixes-Coverity-Bug: 1239279
> Fixes-Coverity-Bug: 1239280
> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>

Applied, thanks.

				- Ted

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [PATCH 06/34] debugfs: add LIBINTL to debugfs link command
  2014-09-13 22:11 ` [PATCH 06/34] debugfs: add LIBINTL to debugfs link command Darrick J. Wong
@ 2014-09-19  4:46   ` Theodore Ts'o
  2014-10-17 21:07     ` Darrick J. Wong
  0 siblings, 1 reply; 67+ messages in thread
From: Theodore Ts'o @ 2014-09-19  4:46 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: linux-ext4

On Sat, Sep 13, 2014 at 03:11:52PM -0700, Darrick J. Wong wrote:
> Since debugfs now links in the journal code (which in turn depends on
> internationalization libraries) we must add a linker option to pull
> that in on Mac OSX.
> 
> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>

The dependency on the internationalization libraries wasn't caused by
the journal code.  It was caused by create_inode.h pulling in
nls-enable.h indiscriminately.  The following should fix things (I'll
check this in alongside some other patches needed to allow e2fsprogs
to build under dietlibc):

diff --git a/misc/create_inode.c b/misc/create_inode.c
index 7f57979..cf7c097 100644
--- a/misc/create_inode.c
+++ b/misc/create_inode.c
@@ -17,6 +17,7 @@
 #endif
 
 #include "create_inode.h"
+#include "nls-enable.h"
 
 #if __STDC_VERSION__ < 199901L
 # if __GNUC__ >= 2
diff --git a/misc/create_inode.h b/misc/create_inode.h
index 067bf96..145fd57 100644
--- a/misc/create_inode.h
+++ b/misc/create_inode.h
@@ -7,7 +7,6 @@
 #include "et/com_err.h"
 #include "e2p/e2p.h"
 #include "ext2fs/ext2fs.h"
-#include "nls-enable.h"
 
 struct hdlink_s
 {
diff --git a/misc/mke2fs.c b/misc/mke2fs.c
index 69045b2..f09351d 100644
--- a/misc/mke2fs.c
+++ b/misc/mke2fs.c
@@ -58,6 +58,7 @@ extern int optind;
 #include "quota/quotaio.h"
 #include "mke2fs.h"
 #include "create_inode.h"
+#include "nls-enable.h"
 
 #define STRIDE_LENGTH 8
 

^ permalink raw reply related	[flat|nested] 67+ messages in thread

* Re: [PATCH 05/34] debugfs: manage needs_recover feature when messing with the journal
  2014-09-13 22:11 ` [PATCH 05/34] debugfs: manage needs_recover feature when messing with the journal Darrick J. Wong
@ 2014-09-19  6:01   ` Theodore Ts'o
  0 siblings, 0 replies; 67+ messages in thread
From: Theodore Ts'o @ 2014-09-19  6:01 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: linux-ext4

On Sat, Sep 13, 2014 at 03:11:45PM -0700, Darrick J. Wong wrote:
> Set the needs_recover incompat feature when debugfs writes journal
> transactions so that we actually replay the journal contents at the
> next mount.
> 
> Likewise, clear it if we successfully recover the journal.
> 
> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>

Applied, thanks.

					- Ted

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [PATCH 07/34] ext2fs: add readahead method to improve scanning
  2014-09-13 22:11 ` [PATCH 07/34] ext2fs: add readahead method to improve scanning Darrick J. Wong
@ 2014-09-19 16:15   ` Theodore Ts'o
  0 siblings, 0 replies; 67+ messages in thread
From: Theodore Ts'o @ 2014-09-19 16:15 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: linux-ext4, Andreas Dilger

On Sat, Sep 13, 2014 at 03:11:58PM -0700, Darrick J. Wong wrote:
> From: Andreas Dilger <adilger@whamcloud.com>
> 
> Add a readahead method for prefetching ranges of disk blocks.  This is
> useful for inode table scanning, and other large contiguous ranges of
> blocks, and may also prove useful for random block prefetch, since it
> will allow reordering of the IO without waiting synchronously for the
> reads to complete.
> 
> It is currently using the posix_fadvise(POSIX_FADV_WILLNEED)
> interface, as this proved most efficient during our testing.
> 
> [darrick.wong@oracle.com]
> Make the arguments to the readahead function take the same ULL values
> as the other IO functions, and return an appropriate error code when
> fadvise isn't available.
> 
> v2: Plumb in test_io.c for cache readahead.
> 
> Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>

Thanks, applied.

					- Ted

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [PATCH 10/34] dumpe2fs: provide a machine-readable group-only mode
  2014-09-13 22:12 ` [PATCH 10/34] dumpe2fs: provide a machine-readable group-only mode Darrick J. Wong
@ 2014-09-19 16:17   ` Theodore Ts'o
  0 siblings, 0 replies; 67+ messages in thread
From: Theodore Ts'o @ 2014-09-19 16:17 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: linux-ext4

On Sat, Sep 13, 2014 at 03:12:18PM -0700, Darrick J. Wong wrote:
> Spit out just the group descriptor data in a machine readable format.
> This is most useful for testing and scripting purposes.
> 
> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>

Applied, thanks.

					- Ted

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [PATCH 11/34] dumpe2fs: output cleanup
  2014-09-13 22:12 ` [PATCH 11/34] dumpe2fs: output cleanup Darrick J. Wong
@ 2014-09-19 16:22   ` Theodore Ts'o
  2014-09-19 20:00     ` Darrick J. Wong
  0 siblings, 1 reply; 67+ messages in thread
From: Theodore Ts'o @ 2014-09-19 16:22 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: linux-ext4, TR Reardon

On Sat, Sep 13, 2014 at 03:12:26PM -0700, Darrick J. Wong wrote:
> Don't display unused inodes twice, and make it clear that we're
> printing a descriptor checksum.
> 
> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> Cc: TR Reardon <thomas_reardon@hotmail.com>

One problem with the current output format is that it exceeds the 80
character line limit pretty blatantly:

Group 3: (Blocks 24577-32768) [INODE_UNINIT, BLOCK_UNINIT, ITABLE_ZEROED]
  Checksum 0x5bd9, unused inodes 2048
  Backup superblock at 24577, Group descriptors at 24578-24578
  Reserved GDT blocks at 24579-24833
  Block bitmap at 261 (bg #0 + 260), csum 0x00000000, Inode bitmap at 269 (bg #0 + 268), csum 0x00000000
  Inode table at 1042-1297 (bg #0 + 1041)
  7935 free blocks, 2048 free inodes, 0 directories, 2048 unused inodes
  Free blocks: 24834-32768
  Free inodes: 6145-8192

If we are printing the checksum, we probably need to insert a line
break and indent before printing the Inode bitmap.  Does that seem
reasonable?

						- Ted
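
One way the suggested layout could look, sketched outside dumpe2fs: each
bitmap gets its own indented line so that neither exceeds 80 columns.
format_bitmap_lines() and longest_line() are hypothetical helpers written
for this illustration; the real dumpe2fs printing code is structured
differently.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical: print the two bitmap locations on separate lines. */
static int format_bitmap_lines(char *buf, size_t len,
			       unsigned long long block_bitmap,
			       unsigned int block_csum,
			       unsigned long long inode_bitmap,
			       unsigned int inode_csum,
			       unsigned int bg,
			       unsigned long long bg_first_block)
{
	assert(buf != NULL);
	return snprintf(buf, len,
			"  Block bitmap at %llu (bg #%u + %llu), csum 0x%08x\n"
			"  Inode bitmap at %llu (bg #%u + %llu), csum 0x%08x\n",
			block_bitmap, bg, block_bitmap - bg_first_block,
			block_csum,
			inode_bitmap, bg, inode_bitmap - bg_first_block,
			inode_csum);
}

/* Length of the longest line in s, not counting the newline. */
static size_t longest_line(const char *s)
{
	size_t best = 0;

	while (*s) {
		size_t n = strcspn(s, "\n");

		if (n > best)
			best = n;
		s += n;
		if (*s == '\n')
			s++;
	}
	return best;
}
```

A test script could run longest_line() over the whole group dump to catch
any future field that pushes a line past the limit.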

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [PATCH 28/34] tests: test debugfs punch command
  2014-09-13 22:14 ` [PATCH 28/34] tests: test debugfs punch command Darrick J. Wong
@ 2014-09-19 16:26   ` Theodore Ts'o
  2014-09-19 20:01     ` Darrick J. Wong
  0 siblings, 1 reply; 67+ messages in thread
From: Theodore Ts'o @ 2014-09-19 16:26 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: linux-ext4

On Sat, Sep 13, 2014 at 03:14:23PM -0700, Darrick J. Wong wrote:
> Test punching out various parts of sparse files.
> 
> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> ---
>  tests/f_punch/expect          |  143 +++++++++++++++++++++++++++++++++++++++++
>  tests/f_punch/name            |    1 
>  tests/f_punch/script          |  129 +++++++++++++++++++++++++++++++++++++
>  tests/f_punch_bigalloc/expect |  142 +++++++++++++++++++++++++++++++++++++++++
>  tests/f_punch_bigalloc/name   |    1 
>  tests/f_punch_bigalloc/script |  130 +++++++++++++++++++++++++++++++++++++

Even though these tests are using e2fsck, if the primary goal is to
test debugfs, these should probably be renamed d_punch and
d_punch_bigalloc.

Cheers,

					- Ted

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [PATCH 11/34] dumpe2fs: output cleanup
  2014-09-19 16:22   ` Theodore Ts'o
@ 2014-09-19 20:00     ` Darrick J. Wong
  2014-10-13 18:04       ` Darrick J. Wong
  0 siblings, 1 reply; 67+ messages in thread
From: Darrick J. Wong @ 2014-09-19 20:00 UTC (permalink / raw)
  To: Theodore Ts'o; +Cc: linux-ext4, TR Reardon

On Fri, Sep 19, 2014 at 12:22:00PM -0400, Theodore Ts'o wrote:
> On Sat, Sep 13, 2014 at 03:12:26PM -0700, Darrick J. Wong wrote:
> > Don't display unused inodes twice, and make it clear that we're
> > printing a descriptor checksum.
> > 
> > Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> > Cc: TR Reardon <thomas_reardon@hotmail.com>
> 
> One problem with the current output format is that it exceeds the 80
> character line limit pretty blatantly:
> 
> Group 3: (Blocks 24577-32768) [INODE_UNINIT, BLOCK_UNINIT, ITABLE_ZEROED]
>   Checksum 0x5bd9, unused inodes 2048
>   Backup superblock at 24577, Group descriptors at 24578-24578
>   Reserved GDT blocks at 24579-24833
>   Block bitmap at 261 (bg #0 + 260), csum 0x00000000, Inode bitmap at 269 (bg #0 + 268), csum 0x00000000
>   Inode table at 1042-1297 (bg #0 + 1041)
>   7935 free blocks, 2048 free inodes, 0 directories, 2048 unused inodes
>   Free blocks: 24834-32768
>   Free inodes: 6145-8192
> 
> If we are printing the checksum, we probably need to insert a line
> break and indent before printing the Inode bitmap.  Does that seem
> reasonable?

Seems fine to me, since other BG fields get their own line anyway.  Do you want
me to make a(nother) patch, or have you already fixed this up in git?

--D
> 
> 						- Ted

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [PATCH 28/34] tests: test debugfs punch command
  2014-09-19 16:26   ` Theodore Ts'o
@ 2014-09-19 20:01     ` Darrick J. Wong
  0 siblings, 0 replies; 67+ messages in thread
From: Darrick J. Wong @ 2014-09-19 20:01 UTC (permalink / raw)
  To: Theodore Ts'o; +Cc: linux-ext4

On Fri, Sep 19, 2014 at 12:26:32PM -0400, Theodore Ts'o wrote:
> On Sat, Sep 13, 2014 at 03:14:23PM -0700, Darrick J. Wong wrote:
> > Test punching out various parts of sparse files.
> > 
> > Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> > ---
> >  tests/f_punch/expect          |  143 +++++++++++++++++++++++++++++++++++++++++
> >  tests/f_punch/name            |    1 
> >  tests/f_punch/script          |  129 +++++++++++++++++++++++++++++++++++++
> >  tests/f_punch_bigalloc/expect |  142 +++++++++++++++++++++++++++++++++++++++++
> >  tests/f_punch_bigalloc/name   |    1 
> >  tests/f_punch_bigalloc/script |  130 +++++++++++++++++++++++++++++++++++++
> 
> Even though these tests are using e2fsck, if the primary goal is to
> test debugfs, these should probably be renamed d_punch and
> d_punch_bigalloc.

Oh.  d==debugfs and f==fsck.  For some reason I had thought 'f' implied 'file'
tests.  Want me to resend with that changed?

--D

> 
> Cheers,
> 
> 					- Ted

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [PATCH 12/34] misc: move check_plausibility into a separate file
  2014-09-13 22:12 ` [PATCH 12/34] misc: move check_plausibility into a separate file Darrick J. Wong
@ 2014-09-19 22:16   ` Theodore Ts'o
  0 siblings, 0 replies; 67+ messages in thread
From: Theodore Ts'o @ 2014-09-19 22:16 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: linux-ext4

On Sat, Sep 13, 2014 at 03:12:33PM -0700, Darrick J. Wong wrote:
> Move check_plausibility() into a separate file so that various
> programs can use it without having to declare useless global variables
> that the util.c functions seem to require.
> 
> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>

Thanks, applied.

					- Ted

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [PATCH 13/34] misc: add plausibility checks to debugfs/tune2fs/dumpe2fs/e2fsck
  2014-09-13 22:12 ` [PATCH 13/34] misc: add plausibility checks to debugfs/tune2fs/dumpe2fs/e2fsck Darrick J. Wong
@ 2014-09-19 23:00   ` Theodore Ts'o
  0 siblings, 0 replies; 67+ messages in thread
From: Theodore Ts'o @ 2014-09-19 23:00 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: linux-ext4

On Sat, Sep 13, 2014 at 03:12:39PM -0700, Darrick J. Wong wrote:
> If any of these utilities detect a bad superblock magic, call
> check_plausibility to see if blkid can identify the passed-in argument
> as something else (xfs, partition, etc.) in the hopes of catching a
> user error.
> 
> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>

Applied, thanks.

						- Ted

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [PATCH 14/34] misc: use libmagic when libblkid can't identify something
  2014-09-13 22:12 ` [PATCH 14/34] misc: use libmagic when libblkid can't identify something Darrick J. Wong
@ 2014-09-21  5:29   ` Theodore Ts'o
  0 siblings, 0 replies; 67+ messages in thread
From: Theodore Ts'o @ 2014-09-21  5:29 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: linux-ext4

On Sat, Sep 13, 2014 at 03:12:46PM -0700, Darrick J. Wong wrote:
> If we're using check_plausibility() to try to identify something that
> obviously isn't an ext* filesystem and libblkid doesn't know what it
> is, try libmagic instead.
> 
> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>

Thanks, applied.

					- Ted

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [PATCH 15/34] libext2fs: support BLKZEROOUT/FALLOC_FL_ZERO_RANGE in ext2fs_zero_blocks
  2014-09-13 22:12 ` [PATCH 15/34] libext2fs: support BLKZEROOUT/FALLOC_FL_ZERO_RANGE in ext2fs_zero_blocks Darrick J. Wong
@ 2014-09-22  2:51   ` Theodore Ts'o
  2014-09-29 18:58     ` Darrick J. Wong
  2014-10-14  2:58   ` Darrick J. Wong
  2014-10-18 16:32   ` Theodore Ts'o
  2 siblings, 1 reply; 67+ messages in thread
From: Theodore Ts'o @ 2014-09-22  2:51 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: linux-ext4

On Sat, Sep 13, 2014 at 03:12:53PM -0700, Darrick J. Wong wrote:
> +#if defined(HAVE_FALLOCATE) && defined(FALLOC_FL_ZERO_RANGE)
> +		int flag = FALLOC_FL_ZERO_RANGE;
> +		struct stat statbuf;
> +
> +		/*
> +		 * If we're trying to zero a range past the end of the file,
> +		 * just use regular fallocate to get there, because zeroing
> +		 * a range past EOF does not extend the file.
> +		 */

If we are operating on a regular file (for example, "mkfs.ext4
/tmp/foo.img 64M") we want to keep the file a sparse one; so if we are
trying to zero a range past the end of the file, it should be
sufficient to simply use truncate to set i_size.  In fact, if we can use
FALLOC_FL_PUNCH_HOLE on the regular file, we should try to use that
instead, I would think.

					- Ted
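
The truncate idea can be sketched for an image file as follows.  This is
an illustration under assumptions, not libext2fs code:
extend_sparse_image() is a hypothetical helper, and it only covers the
"range lies past EOF" case, where growing the file with ftruncate() makes
the new range read back as zeroes without allocating storage on
filesystems that support sparse files.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: make sure the image at `path` is at least
 * new_size bytes long, growing it as a hole rather than writing zeroes.
 * Returns 0 on success, -1 on failure (errno set by the failing call).
 */
static int extend_sparse_image(const char *path, off_t new_size)
{
	struct stat st;
	int fd, ret = -1;

	assert(path != NULL);

	fd = open(path, O_RDWR | O_CREAT, 0600);
	if (fd < 0)
		return -1;
	if (fstat(fd, &st) == 0) {
		if (st.st_size >= new_size)
			ret = 0;	/* already big enough; touch nothing */
		else
			ret = ftruncate(fd, new_size);
	}
	close(fd);
	return ret;
}
```

Zeroing a range that lies inside the existing file is a different
problem; that is where hole punching or a zero-range call would come in.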

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [PATCH 15/34] libext2fs: support BLKZEROOUT/FALLOC_FL_ZERO_RANGE in ext2fs_zero_blocks
  2014-09-22  2:51   ` Theodore Ts'o
@ 2014-09-29 18:58     ` Darrick J. Wong
  0 siblings, 0 replies; 67+ messages in thread
From: Darrick J. Wong @ 2014-09-29 18:58 UTC (permalink / raw)
  To: Theodore Ts'o; +Cc: linux-ext4

On Sun, Sep 21, 2014 at 10:51:09PM -0400, Theodore Ts'o wrote:
> On Sat, Sep 13, 2014 at 03:12:53PM -0700, Darrick J. Wong wrote:
> > +#if defined(HAVE_FALLOCATE) && defined(FALLOC_FL_ZERO_RANGE)
> > +		int flag = FALLOC_FL_ZERO_RANGE;
> > +		struct stat statbuf;
> > +
> > +		/*
> > +		 * If we're trying to zero a range past the end of the file,
> > +		 * just use regular fallocate to get there, because zeroing
> > +		 * a range past EOF does not extend the file.
> > +		 */
> 
> If we are operating on a regular file (for example, "mkfs.ext4
> /tmp/foo.img 64M") we want to keep the file a sparse one; so if we are
> trying to zero a range past the end of the file, it should be
> sufficient to simply use truncate to set i_size.  In fact, if we can use
> FALLOC_FL_PUNCH_HOLE on the regular file, we should try to use that
> instead, I would think.

I thought about making file-backed zero-out a simple truncate/punch operation,
since it would get us the results we want.  However, I had a look at what the
kernel's discard and zeroout implementations do for block devices, and came up
with:

discard: unprovision, may or may not return zeroes
zeroout: provision, return zeroes

(mkp is thinking about a zeroout that guarantees the zeroes but unprovisions if
possible a la FS hole punching, but we're not there yet.)

The users of the zero_blocks call (which uses this zeroout primitive) are
generally looking to clean off blocks in anticipation of them being written in
the near future so (to me) it makes more sense that after the call completes,
the block range has storage allocated to it.

Therefore, I took this approach to anticipate the needs of the callers and to
ensure that the side effects on the storage would be consistent between block
devices and file images.

(Of course, the user-visible effect is the same between the two approaches so I
don't really have a problem changing it.)

--D

> 
> 					- Ted

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [PATCH 16/34] libext2fs/e2fsck: refactor everyone who writes zero blocks to disk
  2014-09-13 22:12 ` [PATCH 16/34] libext2fs/e2fsck: refactor everyone who writes zero blocks to disk Darrick J. Wong
@ 2014-10-13 10:09   ` Theodore Ts'o
  2014-10-13 17:09     ` Darrick J. Wong
  0 siblings, 1 reply; 67+ messages in thread
From: Theodore Ts'o @ 2014-10-13 10:09 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: linux-ext4

On Sat, Sep 13, 2014 at 03:12:59PM -0700, Darrick J. Wong wrote:
> Convert all call sites that write zero blocks to disk to use
> ext2fs_zero_blocks2() since it can use Linux's zero out feature to do
> the writes more quickly.  Reclaim the zero buffer at freefs time and
> make the write-zeroes fallback use a larger buffer.

This patch doesn't actually convert Linux to use the zero out feature,
and I'm not entirely sure how much benefit this is going to actually
give us, since most of the places which you are converting to use
ext2fs_zero_blocks2() are only zeroing a block or two.

On the cost side of the equation, the first time we try to zero out a
single 4k block, this patch causes us to ignore the block allocated
and passed into ext2fs_alloc_block2(), and instead allocate a 4
megabyte buffer which is used only for ext2fs_zero_blocks2, which is
not released until e2fsck/mke2fs/resize2fs exits.

If we had reliable trim/discard that was guaranteed to zero a block
and would never be dropped by the storage device, then maybe it would
be worth it, but as it is, the only real benefit I see from this patch
is the fact that the patch results in the deletion of 84 lines of code.

Maybe it would be worth it to add a ext2fs_zero_blocks3() which takes
an optional temporary buffer, much like the other if we really want to
do the code refactor?

					- Ted

P.S.  Did you really see a speedup in using a 4MB zero block buffer,
instead of a 32k block buffer?  The reason why I had chosen a stride
length of 8 was that some ten years ago, using hardware I had at my
disposal, using a larger zero buffer really didn't improve performance
any.  I'm sure that things have changed since then, but on what
systems were you testing that motivated going to a 4 meg buffer?
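
For reference, a write-zeroes fallback that accepts an optional
caller-supplied buffer could be sketched as below.  This is not the
proposed ext2fs_zero_blocks3(); zero_range() is a hypothetical fd-based
stand-in, and the 32 KiB default simply mirrors the stride of 8 blocks at
4 KiB each mentioned above.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>	/* open() flags for callers */
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper: write `len` zero bytes at `offset`.  If tmpbuf is
 * NULL a 32 KiB scratch buffer is allocated for this call only and freed
 * before returning; otherwise the caller's buffer is zeroed and reused
 * for every chunk.  Returns 0 or an errno value.
 */
static int zero_range(int fd, off_t offset, size_t len,
		      void *tmpbuf, size_t tmplen)
{
	char *buf = tmpbuf;
	int ret = 0;

	assert(tmpbuf == NULL || tmplen > 0);

	if (buf == NULL) {
		tmplen = 8 * 4096;
		buf = malloc(tmplen);
		if (buf == NULL)
			return ENOMEM;
	}
	memset(buf, 0, tmplen);

	while (len > 0) {
		size_t chunk = len < tmplen ? len : tmplen;
		ssize_t n = pwrite(fd, buf, chunk, offset);

		if (n < 0) {
			if (errno == EINTR)
				continue;
			ret = errno;
			break;
		}
		if (n == 0) {
			ret = EIO;	/* no progress; avoid spinning */
			break;
		}
		offset += n;
		len -= (size_t)n;
	}

	if (buf != tmpbuf)
		free(buf);
	return ret;
}
```

A caller zeroing a single block can pass the block buffer it already
holds, so no long-lived multi-megabyte allocation is needed for the
common case.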

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [PATCH 17/34] libext2fs: support allocating uninit blocks in bmap2()
  2014-09-13 22:13 ` [PATCH 17/34] libext2fs: support allocating uninit blocks in bmap2() Darrick J. Wong
@ 2014-10-13 14:35   ` Theodore Ts'o
  2014-10-13 16:56     ` Darrick J. Wong
  0 siblings, 1 reply; 67+ messages in thread
From: Theodore Ts'o @ 2014-10-13 14:35 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: linux-ext4

On Sat, Sep 13, 2014 at 03:13:06PM -0700, Darrick J. Wong wrote:
> In order to support fallocate, we need to be able to have
> ext2fs_bmap2() allocate blocks and put them into uninitialized
> extents.  There's a flag to do this in the extent code, but it's not
> exposed to the bmap2 interface, so plumb that in.  Eventually
> fallocate or fuse2fs or somebody will use it.
> 
> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> ---
>  lib/ext2fs/bmap.c   |   24 ++++++++++++++++++++++--
>  lib/ext2fs/ext2fs.h |    1 +
>  2 files changed, 23 insertions(+), 2 deletions(-)
> 
> 
> diff --git a/lib/ext2fs/bmap.c b/lib/ext2fs/bmap.c
> index c1d0e6f..a4dc8ef 100644
> --- a/lib/ext2fs/bmap.c
> +++ b/lib/ext2fs/bmap.c
> @@ -72,6 +72,11 @@ static _BMAP_INLINE_ errcode_t block_ind_bmap(ext2_filsys fs, int flags,
>  					    block_buf + fs->blocksize, &b);
>  		if (retval)
>  			return retval;
> +		if (flags & BMAP_UNINIT) {
> +			retval = ext2fs_zero_blocks2(fs, b, 1, NULL, NULL);
> +			if (retval)
> +				return retval;
> +		}

What I think we should do is to have two separate new BMAP_ flags;
BMAP_UNINIT, which sets the uninit bit, and BMAP_ZERO, which requests
that the block be zeroed.  I don't think it should follow that when you
set the uninit bit via the libext2fs, the block will automatically be
zeroed.  After all, userspace can't assume that if the uninit bit is
set, that the block will be pre-zeroed, since files fallocated by the
kernel won't meet that guarantee.

					- Ted
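
For illustration only, a minimal sketch of the split proposed above, where
setting the uninit bit and zeroing the block are requested independently.
The flag values, the struct, and the helper name are all hypothetical
stand-ins invented for this sketch; none of them are taken from ext2fs.h
or bmap.c.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical flag values for illustration; not taken from ext2fs.h. */
#define SKETCH_BMAP_UNINIT	0x1	/* mark the new mapping uninitialized */
#define SKETCH_BMAP_ZERO	0x2	/* zero the newly allocated block */

#define SKETCH_BLOCKSIZE	1024

struct sketch_mapping {
	int		uninit;			/* stand-in for the extent uninit bit */
	unsigned char	data[SKETCH_BLOCKSIZE];	/* stand-in for the on-disk block */
};

/* Apply the two flags independently to a freshly allocated block. */
static void sketch_apply_bmap_flags(struct sketch_mapping *m, int flags)
{
	assert(m != NULL);

	/* Setting the uninit bit does not touch the block contents. */
	if (flags & SKETCH_BMAP_UNINIT)
		m->uninit = 1;

	/* Zeroing happens only when the caller asks for it explicitly. */
	if (flags & SKETCH_BMAP_ZERO)
		memset(m->data, 0, sizeof(m->data));
}
```

With this shape, a caller that wants both behaviours passes
SKETCH_BMAP_UNINIT | SKETCH_BMAP_ZERO, and a caller that passes only
SKETCH_BMAP_UNINIT gets a block whose old contents are left alone.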

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [PATCH 17/34] libext2fs: support allocating uninit blocks in bmap2()
  2014-10-13 14:35   ` Theodore Ts'o
@ 2014-10-13 16:56     ` Darrick J. Wong
  2014-10-13 18:34       ` Darrick J. Wong
  0 siblings, 1 reply; 67+ messages in thread
From: Darrick J. Wong @ 2014-10-13 16:56 UTC (permalink / raw)
  To: Theodore Ts'o; +Cc: linux-ext4

On Mon, Oct 13, 2014 at 10:35:26AM -0400, Theodore Ts'o wrote:
> On Sat, Sep 13, 2014 at 03:13:06PM -0700, Darrick J. Wong wrote:
> > In order to support fallocate, we need to be able to have
> > ext2fs_bmap2() allocate blocks and put them into uninitialized
> > extents.  There's a flag to do this in the extent code, but it's not
> > exposed to the bmap2 interface, so plumb that in.  Eventually
> > fallocate or fuse2fs or somebody will use it.
> > 
> > Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> > ---
> >  lib/ext2fs/bmap.c   |   24 ++++++++++++++++++++++--
> >  lib/ext2fs/ext2fs.h |    1 +
> >  2 files changed, 23 insertions(+), 2 deletions(-)
> > 
> > 
> > diff --git a/lib/ext2fs/bmap.c b/lib/ext2fs/bmap.c
> > index c1d0e6f..a4dc8ef 100644
> > --- a/lib/ext2fs/bmap.c
> > +++ b/lib/ext2fs/bmap.c
> > @@ -72,6 +72,11 @@ static _BMAP_INLINE_ errcode_t block_ind_bmap(ext2_filsys fs, int flags,
> >  					    block_buf + fs->blocksize, &b);
> >  		if (retval)
> >  			return retval;
> > +		if (flags & BMAP_UNINIT) {
> > +			retval = ext2fs_zero_blocks2(fs, b, 1, NULL, NULL);
> > +			if (retval)
> > +				return retval;
> > +		}
> 
> What I think we should do is to have two separate new BMAP_ flags;
> BMAP_UNINIT, which sets the uninit bit, and BMAP_ZERO, which requests
> that the block be zeroed.  I don't think it should follow that when you
> set the uninit bit via the libext2fs, the block will automatically be
> zeroed.  After all, userspace can't assume that if the uninit bit is
> set, that the block will be pre-zeroed, since files fallocated by the
> kernel won't meet that guarantee.

On an extent based file, we can record the BLOCK_UNINIT status in the extent
flags so that subsequent reads return zeroes.  On a block mapped file it's not
possible to record the uninitialized status (short of unmapping the block), so
here I was trying to emulate the read behavior you'd get with an extent file.

Kernel fallocate() refuses to service non-extent files, so there's not much
precedent there unless you want to block BMAP_UNINIT on such files.

So... I agree that BMAP_ZERO would be a useful feature anyway.  My question is,
if we pass in a non-extent file with BMAP_UNINIT but not BMAP_ZERO, should we
simply return -EINVAL?

--D

> 
> 					- Ted
> --
> To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [PATCH 16/34] libext2fs/e2fsck: refactor everyone who writes zero blocks to disk
  2014-10-13 10:09   ` Theodore Ts'o
@ 2014-10-13 17:09     ` Darrick J. Wong
  0 siblings, 0 replies; 67+ messages in thread
From: Darrick J. Wong @ 2014-10-13 17:09 UTC (permalink / raw)
  To: Theodore Ts'o; +Cc: linux-ext4

On Mon, Oct 13, 2014 at 06:09:03AM -0400, Theodore Ts'o wrote:
> On Sat, Sep 13, 2014 at 03:12:59PM -0700, Darrick J. Wong wrote:
> > Convert all call sites that write zero blocks to disk to use
> > ext2fs_zero_blocks2() since it can use Linux's zero out feature to do
> > the writes more quickly.  Reclaim the zero buffer at freefs time and
> > make the write-zeroes fallback use a larger buffer.
> 
> This patch doesn't actually convert Linux to use the zero out feature,

Assuming you meant '...convert e2fsprogs to use...'?

(You're right, it converts e2fsprogs to use ext2fs_zero_blocks(), which at this
point in the patch series might call BLKZEROOUT.)

> and I'm not entirely sure how much benefit this is going to actually
> give us, since in most of the places which you are converting to use
> ext2fs_zero_blocks2() is only zero'ing a block or two.
> 
> On the cost side of the equation, the first time we try to zero out a
> single 4k block, this patch causes us to ignore the block allocated
> and passed into ext2fs_alloc_block2(), and instead allocate a 4
> megabyte buffer which is used only for ext2fs_zero_blocks2, which is
> not released until e2fsck/mke2fs/resize2fs exits.
> 
> If we had reliable trim/discard that was guaranteed to zero a block
> and would never be dropped by the storage device, then maybe it would
> be worth it, but as it is, the only real benefit I see from this patch
> is the fact that patch results in the deletion of 84 lines of code.

I'm not calling discard/trim, I'm calling blkzeroout or (in the next patchbomb
rev) file punch.  Zero-out is supposed to be mandatory, unlike its flakey
cousin discard.  Right?

> Maybe it would be worth it to add an ext2fs_zero_blocks3() which takes
> an optional temporary buffer, much like the other if we really want to
> do the code refactor?

Yes, I think that would be a good idea -- only allocate the static buffer if
the caller declines to provide one.
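
As a rough sketch of that idea (not the eventual libext2fs code), the
following zeroes a range of an in-memory "device" and only falls back to a
lazily allocated static buffer when the caller passes no temporary buffer.
The function name, the in-memory device, the block size, and the stride of
8 blocks are assumptions made up for the example.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define SKETCH_BLOCKSIZE	1024
#define SKETCH_STRIDE		8	/* blocks in the fallback buffer */

static unsigned char *sketch_static_buf;	/* allocated on first need */
static int sketch_static_allocs;		/* counts fallback allocations */

/*
 * Zero num blocks starting at blk on an in-memory device of dev_blocks
 * blocks.  If tmp_buf is non-NULL it is used (and cleared) as the source
 * of zeroes; otherwise a static buffer is allocated once and reused.
 * Returns 0 on success, -1 on a bad range or allocation failure.
 */
static int sketch_zero_blocks3(unsigned char *dev, size_t dev_blocks,
			       size_t blk, size_t num,
			       unsigned char *tmp_buf, size_t tmp_blocks)
{
	unsigned char *buf = tmp_buf;
	size_t buf_blocks = tmp_blocks;
	size_t done, chunk;

	assert(dev != NULL);
	if (blk > dev_blocks || num > dev_blocks - blk)
		return -1;

	if (buf != NULL && buf_blocks > 0) {
		/* Caller-provided scratch space: no allocation needed. */
		memset(buf, 0, buf_blocks * SKETCH_BLOCKSIZE);
	} else {
		/* Caller declined to provide a buffer: use the static one. */
		if (sketch_static_buf == NULL) {
			sketch_static_buf = calloc(SKETCH_STRIDE,
						   SKETCH_BLOCKSIZE);
			if (sketch_static_buf == NULL)
				return -1;
			sketch_static_allocs++;
		}
		buf = sketch_static_buf;
		buf_blocks = SKETCH_STRIDE;
	}

	/* Copy zeroes out in chunks no larger than the buffer. */
	for (done = 0; done < num; done += chunk) {
		chunk = num - done;
		if (chunk > buf_blocks)
			chunk = buf_blocks;
		memcpy(dev + (blk + done) * SKETCH_BLOCKSIZE, buf,
		       chunk * SKETCH_BLOCKSIZE);
	}
	return 0;
}
```

A caller zeroing a single block can hand in the one-block buffer it already
holds, so the larger static buffer is never allocated on that path.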

> 					- Ted
> 
> P.S.  Did you really see a speedup in using a 4MB zero block buffer,
> instead of a 32k block buffer?  The reason why I had chosen a stride
> length of 8 was that some ten years ago, using hardware I had at my
> disposal, using a larger zero buffer really didn't improve performance
> any.  I'm sure that things have changed since then, but on what
> systems were you testing that motivated going to a 4 meg buffer?

A bunch of (probably crummy) consumer grade SSDs.  I suspect the erase size is
4MB, or at least a few megabytes. :)

I also noticed that my throwaway RAID0 (stripe size 512K) got faster at mkfs
time since it could issue IO to multiple disks simultaneously.

--D


^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [PATCH 11/34] dumpe2fs: output cleanup
  2014-09-19 20:00     ` Darrick J. Wong
@ 2014-10-13 18:04       ` Darrick J. Wong
  0 siblings, 0 replies; 67+ messages in thread
From: Darrick J. Wong @ 2014-10-13 18:04 UTC (permalink / raw)
  To: Theodore Ts'o; +Cc: linux-ext4, TR Reardon

On Fri, Sep 19, 2014 at 01:00:05PM -0700, Darrick J. Wong wrote:
> On Fri, Sep 19, 2014 at 12:22:00PM -0400, Theodore Ts'o wrote:
> > On Sat, Sep 13, 2014 at 03:12:26PM -0700, Darrick J. Wong wrote:
> > > Don't display unused inodes twice, and make it clear that we're
> > > printing a descriptor checksum.
> > > 
> > > Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> > > Cc: TR Reardon <thomas_reardon@hotmail.com>
> > 
> > One problem with the current output format is that exceeds the 80
> > character line limit pretty blatantly:
> > 
> > Group 3: (Blocks 24577-32768) [INODE_UNINIT, BLOCK_UNINIT, ITABLE_ZEROED]
> >   Checksum 0x5bd9, unused inodes 2048
> >   Backup superblock at 24577, Group descriptors at 24578-24578
> >   Reserved GDT blocks at 24579-24833
> >   Block bitmap at 261 (bg #0 + 260), csum 0x00000000, Inode bitmap at 269 (bg #0 + 268), csum 0x00000000
> >   Inode table at 1042-1297 (bg #0 + 1041)
> >   7935 free blocks, 2048 free inodes, 0 directories, 2048 unused inodes
> >   Free blocks: 24834-32768
> >   Free inodes: 6145-8192
> > 
> > If we are printing the checksum, we probably need to insert a line
> > break and indent before printing the Inode bitmap.  Does that seem
> > reasonable?
> 
> Seems fine to me, since other BG fields get their own line anyway.  Do you want
> me to make a(nother) patch, or have you already fixed this up in git?

Done, but it's a huge patch (~4700 lines) on account of having to change a lot
of testcases' expect files.

I could change the patch to emit the newline before the inode bitmap line only
if metadata_csum is present.  It would make the output less consistent, though.
What do you think, Ted?

--D

> 
> --D
> > 
> > 						- Ted

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [PATCH 17/34] libext2fs: support allocating uninit blocks in bmap2()
  2014-10-13 16:56     ` Darrick J. Wong
@ 2014-10-13 18:34       ` Darrick J. Wong
  0 siblings, 0 replies; 67+ messages in thread
From: Darrick J. Wong @ 2014-10-13 18:34 UTC (permalink / raw)
  To: Theodore Ts'o; +Cc: linux-ext4

On Mon, Oct 13, 2014 at 09:56:53AM -0700, Darrick J. Wong wrote:
> On Mon, Oct 13, 2014 at 10:35:26AM -0400, Theodore Ts'o wrote:
> > On Sat, Sep 13, 2014 at 03:13:06PM -0700, Darrick J. Wong wrote:
> > > In order to support fallocate, we need to be able to have
> > > ext2fs_bmap2() allocate blocks and put them into uninitialized
> > > extents.  There's a flag to do this in the extent code, but it's not
> > > exposed to the bmap2 interface, so plumb that in.  Eventually
> > > fallocate or fuse2fs or somebody will use it.
> > > 
> > > Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> > > ---
> > >  lib/ext2fs/bmap.c   |   24 ++++++++++++++++++++++--
> > >  lib/ext2fs/ext2fs.h |    1 +
> > >  2 files changed, 23 insertions(+), 2 deletions(-)
> > > 
> > > 
> > > diff --git a/lib/ext2fs/bmap.c b/lib/ext2fs/bmap.c
> > > index c1d0e6f..a4dc8ef 100644
> > > --- a/lib/ext2fs/bmap.c
> > > +++ b/lib/ext2fs/bmap.c
> > > @@ -72,6 +72,11 @@ static _BMAP_INLINE_ errcode_t block_ind_bmap(ext2_filsys fs, int flags,
> > >  					    block_buf + fs->blocksize, &b);
> > >  		if (retval)
> > >  			return retval;
> > > +		if (flags & BMAP_UNINIT) {
> > > +			retval = ext2fs_zero_blocks2(fs, b, 1, NULL, NULL);
> > > +			if (retval)
> > > +				return retval;
> > > +		}
> > 
> > What I think we should do is to have two separate new BMAP_ flags;
> > BMAP_UNINIT, which sets the uninit bit, and BMAP_ZERO, which requests
> > that the block be zeroed.  I don't think it should follow that when you
> > set the uninit bit via the libext2fs, the block will automatically be
> > zeroed.  After all, userspace can't assume that if the uninit bit is
> > set, that the block will be pre-zeroed, since files fallocated by the
> > kernel won't meet that guarantee.
> 
> On an extent based file, we can record the BLOCK_UNINIT status in the extent
> flags so that subsequent reads return zeroes.  On a block mapped file it's not
> possible to record the uninitialized status (short of unmapping the block), so
> here I was trying to emulate the read behavior you'd get with an extent file.
> 
> Kernel fallocate() refuses to service non-extent files, so there's not much
> precedent there unless you want to block BMAP_UNINIT on such files.
> 
> So... I agree that BMAP_ZERO would be a useful feature anyway.  My question is,
> if we pass in a non-extent file with BMAP_UNINIT but not BMAP_ZERO, should we
> simply return -EINVAL?

Never mind, now that I thought more about the implementation, I think we can
trust callers to DTRT.

Alternately, this enables that fallocate-but-dont-ever-zero use case which
keeps popping up, so it would seem to have /some/ value to some people.

--D

> --D
> 
> > 
> > 					- Ted

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [PATCH 15/34] libext2fs: support BLKZEROOUT/FALLOC_FL_ZERO_RANGE in ext2fs_zero_blocks
  2014-09-13 22:12 ` [PATCH 15/34] libext2fs: support BLKZEROOUT/FALLOC_FL_ZERO_RANGE in ext2fs_zero_blocks Darrick J. Wong
  2014-09-22  2:51   ` Theodore Ts'o
@ 2014-10-14  2:58   ` Darrick J. Wong
  2014-10-18 16:32   ` Theodore Ts'o
  2 siblings, 0 replies; 67+ messages in thread
From: Darrick J. Wong @ 2014-10-14  2:58 UTC (permalink / raw)
  To: tytso; +Cc: linux-ext4

On Sat, Sep 13, 2014 at 03:12:53PM -0700, Darrick J. Wong wrote:
> Plumb a new call into the IO manager to support translating
> ext2fs_zero_blocks calls into the equivalent kernel-level BLKZEROOUT
> ioctl or FALLOC_FL_ZERO_RANGE fallocate flag primitives when possible.
> 
> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> ---
>  contrib/fallocate.c     |   14 +++++++++
>  lib/ext2fs/ext2_io.h    |    7 ++++-
>  lib/ext2fs/io_manager.c |   11 +++++++
>  lib/ext2fs/mkjournal.c  |    6 ++++
>  lib/ext2fs/test_io.c    |   21 ++++++++++++++
>  lib/ext2fs/unix_io.c    |   71 +++++++++++++++++++++++++++++++++++++++++++++++
>  6 files changed, 128 insertions(+), 2 deletions(-)
> 
> 
> diff --git a/contrib/fallocate.c b/contrib/fallocate.c
> index 1f9b59a..01d4af7 100644
> --- a/contrib/fallocate.c
> +++ b/contrib/fallocate.c
> @@ -36,6 +36,8 @@
>  // #include <linux/falloc.h>
>  #define FALLOC_FL_KEEP_SIZE	0x01
>  #define FALLOC_FL_PUNCH_HOLE	0x02 /* de-allocates range */
> +#define FALLOC_FL_COLLAPSE_RANGE	0x08
> +#define FALLOC_FL_ZERO_RANGE		0x10
>  
>  void usage(void)
>  {
> @@ -95,7 +97,7 @@ int main(int argc, char **argv)
>  	int	error;
>  	int	tflag = 0;
>  
> -	while ((opt = getopt(argc, argv, "npl:o:t")) != -1) {
> +	while ((opt = getopt(argc, argv, "npl:o:tzc")) != -1) {
>  		switch(opt) {
>  		case 'n':
>  			/* do not change filesize */
> @@ -106,6 +108,16 @@ int main(int argc, char **argv)
>  			falloc_mode = (FALLOC_FL_PUNCH_HOLE |
>  				       FALLOC_FL_KEEP_SIZE);
>  			break;
> +		case 'c':
> +			/* collapse range mode */
> +			falloc_mode = (FALLOC_FL_COLLAPSE_RANGE |
> +				       FALLOC_FL_KEEP_SIZE);
> +			break;
> +		case 'z':
> +			/* zero range mode */
> +			falloc_mode = (FALLOC_FL_ZERO_RANGE |
> +				       FALLOC_FL_KEEP_SIZE);
> +			break;
>  		case 'l':
>  			length = cvtnum(optarg);
>  			break;
> diff --git a/lib/ext2fs/ext2_io.h b/lib/ext2fs/ext2_io.h
> index 4c5a5c5..1faa720 100644
> --- a/lib/ext2fs/ext2_io.h
> +++ b/lib/ext2fs/ext2_io.h
> @@ -93,7 +93,9 @@ struct struct_io_manager {
>  	errcode_t (*cache_readahead)(io_channel channel,
>  				     unsigned long long block,
>  				     unsigned long long count);
> -	long	reserved[15];
> +	errcode_t (*zeroout)(io_channel channel, unsigned long long block,
> +			     unsigned long long count);
> +	long	reserved[14];
>  };
>  
>  #define IO_FLAG_RW		0x0001
> @@ -125,6 +127,9 @@ extern errcode_t io_channel_write_blk64(io_channel channel,
>  extern errcode_t io_channel_discard(io_channel channel,
>  				    unsigned long long block,
>  				    unsigned long long count);
> +extern errcode_t io_channel_zeroout(io_channel channel,
> +				    unsigned long long block,
> +				    unsigned long long count);
>  extern errcode_t io_channel_alloc_buf(io_channel channel,
>  				      int count, void *ptr);
>  extern errcode_t io_channel_cache_readahead(io_channel io,
> diff --git a/lib/ext2fs/io_manager.c b/lib/ext2fs/io_manager.c
> index dc5888d..c395d61 100644
> --- a/lib/ext2fs/io_manager.c
> +++ b/lib/ext2fs/io_manager.c
> @@ -112,6 +112,17 @@ errcode_t io_channel_discard(io_channel channel, unsigned long long block,
>  	return EXT2_ET_UNIMPLEMENTED;
>  }
>  
> +errcode_t io_channel_zeroout(io_channel channel, unsigned long long block,
> +			     unsigned long long count)
> +{
> +	EXT2_CHECK_MAGIC(channel, EXT2_ET_MAGIC_IO_CHANNEL);
> +
> +	if (channel->manager->zeroout)
> +		return (channel->manager->zeroout)(channel, block, count);
> +
> +	return EXT2_ET_UNIMPLEMENTED;
> +}
> +
>  errcode_t io_channel_alloc_buf(io_channel io, int count, void *ptr)
>  {
>  	size_t	size;
> diff --git a/lib/ext2fs/mkjournal.c b/lib/ext2fs/mkjournal.c
> index 6f3a862..5be425c 100644
> --- a/lib/ext2fs/mkjournal.c
> +++ b/lib/ext2fs/mkjournal.c
> @@ -164,6 +164,12 @@ errcode_t ext2fs_zero_blocks2(ext2_filsys fs, blk64_t blk, int num,
>  		}
>  		return 0;
>  	}
> +
> +	/* Try a zero out command, if supported */
> +	retval = io_channel_zeroout(fs->io, blk, num);
> +	if (retval == 0)
> +		return 0;
> +
>  	/* Allocate the zeroizing buffer if necessary */
>  	if (!buf) {
>  		buf = malloc(fs->blocksize * STRIDE_LENGTH);
> diff --git a/lib/ext2fs/test_io.c b/lib/ext2fs/test_io.c
> index b03a939..f7c50d1 100644
> --- a/lib/ext2fs/test_io.c
> +++ b/lib/ext2fs/test_io.c
> @@ -86,6 +86,7 @@ void (*test_io_cb_write_byte)
>  #define TEST_FLAG_SET_OPTION		0x20
>  #define TEST_FLAG_DISCARD		0x40
>  #define TEST_FLAG_READAHEAD		0x80
> +#define TEST_FLAG_ZEROOUT		0x100
>  
>  static void test_dump_block(io_channel channel,
>  			    struct test_private_data *data,
> @@ -507,6 +508,25 @@ static errcode_t test_cache_readahead(io_channel channel,
>  	return retval;
>  }
>  
> +static errcode_t test_zeroout(io_channel channel, unsigned long long block,
> +			      unsigned long long count)
> +{
> +	struct test_private_data *data;
> +	errcode_t	retval = 0;
> +
> +	EXT2_CHECK_MAGIC(channel, EXT2_ET_MAGIC_IO_CHANNEL);
> +	data = (struct test_private_data *) channel->private_data;
> +	EXT2_CHECK_MAGIC(data, EXT2_ET_MAGIC_TEST_IO_CHANNEL);
> +
> +	if (data->real)
> +		retval = io_channel_zeroout(data->real, block, count);
> +	if (data->flags & TEST_FLAG_ZEROOUT)
> +		fprintf(data->outfile,
> +			"Test_io: zeroout(%llu, %llu) returned %s\n",
> +			block, count, retval ? error_message(retval) : "OK");
> +	return retval;
> +}
> +
>  static struct struct_io_manager struct_test_manager = {
>  	.magic		= EXT2_ET_MAGIC_IO_MANAGER,
>  	.name		= "Test I/O Manager",
> @@ -523,6 +543,7 @@ static struct struct_io_manager struct_test_manager = {
>  	.write_blk64	= test_write_blk64,
>  	.discard	= test_discard,
>  	.cache_readahead	= test_cache_readahead,
> +	.zeroout	= test_zeroout,
>  };
>  
>  io_manager test_io_manager = &struct_test_manager;
> diff --git a/lib/ext2fs/unix_io.c b/lib/ext2fs/unix_io.c
> index 189adce..20e5b64 100644
> --- a/lib/ext2fs/unix_io.c
> +++ b/lib/ext2fs/unix_io.c
> @@ -986,6 +986,76 @@ unimplemented:
>  	return EXT2_ET_UNIMPLEMENTED;
>  }
>  
> +#if defined(__linux__) && !defined(BLKZEROOUT)
> +#define BLKZEROOUT		_IO(0x12, 127)
> +#endif
> +
> +#if defined(__linux__) && !defined(FALLOC_FL_ZERO_RANGE)
> +#define FALLOC_FL_ZERO_RANGE    0x10
> +#endif
> +
> +static errcode_t unix_zeroout(io_channel channel, unsigned long long block,
> +			      unsigned long long count)
> +{
> +	struct unix_private_data *data;
> +	int		ret;
> +
> +	EXT2_CHECK_MAGIC(channel, EXT2_ET_MAGIC_IO_CHANNEL);
> +	data = (struct unix_private_data *) channel->private_data;
> +	EXT2_CHECK_MAGIC(data, EXT2_ET_MAGIC_UNIX_IO_CHANNEL);
> +
> +	if (getenv("UNIX_IO_NOZEROOUT"))
> +		goto unimplemented;
> +
> +	if (channel->flags & CHANNEL_FLAGS_BLOCK_DEVICE) {
> +#ifdef BLKZEROOUT
> +		__u64 range[2];
> +
> +		range[0] = (__u64)(block) * channel->block_size;
> +		range[1] = (__u64)(count) * channel->block_size;
> +
> +		ret = ioctl(data->dev, BLKZEROOUT, &range);

NAK this line for now.  I think I've uncovered a race condition where
pwrite+blkzeroout+pread returns the pwritten contents instead of zeroes as I
was expecting.

--D

> +#else
> +		goto unimplemented;
> +#endif
> +	} else {
> +#if defined(HAVE_FALLOCATE) && defined(FALLOC_FL_ZERO_RANGE)
> +		int flag = FALLOC_FL_ZERO_RANGE;
> +		struct stat statbuf;
> +
> +		/*
> +		 * If we're trying to zero a range past the end of the file,
> +		 * just use regular fallocate to get there, because zeroing
> +		 * a range past EOF does not extend the file.
> +		 */
> +		ret = fstat(data->dev, &statbuf);
> +		if (ret)
> +			goto err;
> +		if (statbuf.st_size < (block + count) * channel->block_size)
> +			flag = 0;
> +		/*
> +		 * If we are not on block device, try to use the zero out
> +		 * primitive.
> +		 */
> +		ret = fallocate(data->dev,
> +				flag,
> +				(off_t)(block) * channel->block_size,
> +				(off_t)(count) * channel->block_size);
> +#else
> +		goto unimplemented;
> +#endif
> +	}
> +err:
> +	if (ret < 0) {
> +		if (errno == EOPNOTSUPP)
> +			goto unimplemented;
> +		return errno;
> +	}
> +	return 0;
> +unimplemented:
> +	return EXT2_ET_UNIMPLEMENTED;
> +}
> +
>  static struct struct_io_manager struct_unix_manager = {
>  	.magic		= EXT2_ET_MAGIC_IO_MANAGER,
>  	.name		= "Unix I/O Manager",
> @@ -1002,6 +1072,7 @@ static struct struct_io_manager struct_unix_manager = {
>  	.write_blk64	= unix_write_blk64,
>  	.discard	= unix_discard,
>  	.cache_readahead	= unix_cache_readahead,
> +	.zeroout	= unix_zeroout,
>  };
>  
>  io_manager unix_io_manager = &struct_unix_manager;
> 

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [PATCH 06/34] debugfs: add LIBINTL to debugfs link command
  2014-09-19  4:46   ` Theodore Ts'o
@ 2014-10-17 21:07     ` Darrick J. Wong
  2014-10-18 16:10       ` Theodore Ts'o
  0 siblings, 1 reply; 67+ messages in thread
From: Darrick J. Wong @ 2014-10-17 21:07 UTC (permalink / raw)
  To: Theodore Ts'o; +Cc: linux-ext4

On Fri, Sep 19, 2014 at 12:46:01AM -0400, Theodore Ts'o wrote:
> On Sat, Sep 13, 2014 at 03:11:52PM -0700, Darrick J. Wong wrote:
> > Since debugfs now links in the journal code (which in turn depends on
> > internationalization libraries) we must add a linker option to pull
> > that in on Mac OSX.
> > 
> > Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> 
> The dependency on the internationalization libraries wasn't caused by
> the journal code.  It was caused by create_inode.h pulling in
> nls-enable.h indiscriminately.  The following should fix things (I'll
> check this in alongside some other patches needed to allow e2fsprogs
> to build under dietlibc):

True, but since debugfs pulls in plausible.o which uses _(), it still needs
this patch to build on OSX 10.9.5.

--D

> 
> diff --git a/misc/create_inode.c b/misc/create_inode.c
> index 7f57979..cf7c097 100644
> --- a/misc/create_inode.c
> +++ b/misc/create_inode.c
> @@ -17,6 +17,7 @@
>  #endif
>  
>  #include "create_inode.h"
> +#include "nls-enable.h"
>  
>  #if __STDC_VERSION__ < 199901L
>  # if __GNUC__ >= 2
> diff --git a/misc/create_inode.h b/misc/create_inode.h
> index 067bf96..145fd57 100644
> --- a/misc/create_inode.h
> +++ b/misc/create_inode.h
> @@ -7,7 +7,6 @@
>  #include "et/com_err.h"
>  #include "e2p/e2p.h"
>  #include "ext2fs/ext2fs.h"
> -#include "nls-enable.h"
>  
>  struct hdlink_s
>  {
> diff --git a/misc/mke2fs.c b/misc/mke2fs.c
> index 69045b2..f09351d 100644
> --- a/misc/mke2fs.c
> +++ b/misc/mke2fs.c
> @@ -58,6 +58,7 @@ extern int optind;
>  #include "quota/quotaio.h"
>  #include "mke2fs.h"
>  #include "create_inode.h"
> +#include "nls-enable.h"
>  
>  #define STRIDE_LENGTH 8
>  

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [PATCH 06/34] debugfs: add LIBINTL to debugfs link command
  2014-10-17 21:07     ` Darrick J. Wong
@ 2014-10-18 16:10       ` Theodore Ts'o
  0 siblings, 0 replies; 67+ messages in thread
From: Theodore Ts'o @ 2014-10-18 16:10 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: linux-ext4

On Fri, Oct 17, 2014 at 02:07:37PM -0700, Darrick J. Wong wrote:
> 
> True, but since debugfs pulls in plausible.o which uses _(), it still needs
> this patch to build on OSX 10.9.5.

Good point.  I'll fix it this way.  (Basically, it's really pointless
to link debugfs with LIBINTL since debugfs doesn't have any I18N
support, and I'd prefer to limit unnecessary bloat.)

					- Ted

commit 831aa869e8b1b287ca921e7ae181a4cdca839099
Author: Theodore Ts'o <tytso@mit.edu>
Date:   Sat Oct 18 09:13:09 2014 -0400

    debugfs: fix build on systems that don't have gettext built-in
    
    Debugfs (unlike all of the other programs in e2fsprogs) is not set up
    to use translated strings.  So when building misc/plausible.c for
    debugfs, we need to disable NLS.
    
    Signed-off-by: Theodore Ts'o <tytso@mit.edu>

diff --git a/misc/nls-enable.h b/misc/nls-enable.h
index a91dcc1..2f62c01 100644
--- a/misc/nls-enable.h
+++ b/misc/nls-enable.h
@@ -1,4 +1,4 @@
-#ifdef ENABLE_NLS
+#if defined(ENABLE_NLS) && !defined(DEBUGFS)
 #include <libintl.h>
 #include <locale.h>
 #define _(a) (gettext (a))
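
To make the effect of the guard concrete, here is a small standalone sketch
of how the _() macro behaves when NLS is compiled out.  The #else branch
(mapping _() to its argument) is an assumption about the rest of the
header, which is not shown in the hunk above, and the helper function and
its message string are hypothetical.

```c
#include <assert.h>
#include <string.h>

/* Same guard as in the hunk above; the #else branch is an assumption. */
#if defined(ENABLE_NLS) && !defined(DEBUGFS)
#include <libintl.h>
#define _(a) (gettext (a))
#else
#define _(a) (a)
#endif

/* Hypothetical caller standing in for shared code such as plausible.c. */
static const char *sketch_plausibility_message(void)
{
	const char *msg = _("Bad magic number in super-block");

	/* With NLS disabled, _() is the identity and msg is the literal. */
	assert(msg != NULL);
	return msg;
}
```

Built with -DDEBUGFS (or without ENABLE_NLS), the translation unit no
longer references gettext() at all, so nothing from libintl has to be
linked.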

^ permalink raw reply related	[flat|nested] 67+ messages in thread

* Re: [PATCH 15/34] libext2fs: support BLKZEROOUT/FALLOC_FL_ZERO_RANGE in ext2fs_zero_blocks
  2014-09-13 22:12 ` [PATCH 15/34] libext2fs: support BLKZEROOUT/FALLOC_FL_ZERO_RANGE in ext2fs_zero_blocks Darrick J. Wong
  2014-09-22  2:51   ` Theodore Ts'o
  2014-10-14  2:58   ` Darrick J. Wong
@ 2014-10-18 16:32   ` Theodore Ts'o
  2014-10-20 23:37     ` Darrick J. Wong
  2 siblings, 1 reply; 67+ messages in thread
From: Theodore Ts'o @ 2014-10-18 16:32 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: linux-ext4

On Sat, Sep 13, 2014 at 03:12:53PM -0700, Darrick J. Wong wrote:
> Plumb a new call into the IO manager to support translating
> ext2fs_zero_blocks calls into the equivalent kernel-level BLKZEROOUT
> ioctl or FALLOC_FL_ZERO_RANGE fallocate flag primitives when possible.
> 
> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> ---
>  contrib/fallocate.c     |   14 +++++++++

I've separated out the contrib/fallocate change and created a separate
commit for it, since it really is a separate change.

What I'd like to see for the zero_blocks change io_manager is:

(a) if we try to zero a range past the end of the file, we should just
truncate the file to set i_size.  Similarly, if this is a regular
file, we should try to use PUNCH_HOLE.  We already try to keep a raw
file system image file to be sparse, so I don't see any real problems
with this.

(b) for a block device, if IO_FLAG_DIRECT_IO is set, it should be safe
to try to use BLKZEROOUT.  If not, we can use
posix_fadvise(POSIX_FADV_DONTNEED) and verify that this correctly zaps
the relevant parts of the buffer cache.  If it doesn't do the right
thing, we can use BLKFLSBUF, which will zap the entire buffer cache
for the device.  Which is pretty heavy weight, but I really think it
only makes sense to use zeroout for zeroing the inode table and the
journal file.

Even if we patch the kernel to make BLKZEROOUT to automatically do
this, we can't count on it, and in particular if it turns out we have
to use BLKFLSBUF, we're not going to want to use this for zero'ing a
single 4k block.  It doesn't happen that often, and I don't think
there will be much if any measurable difference in performance if we
use WRITE SAME vs. WRITE for a small region.
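
The policy in (a) and (b) can be summarized as a small decision function.
This is only a sketch of the reasoning: the enum, the function name, and
in particular the 16-block threshold below which a plain write is used are
assumptions for illustration, not values from the patch series.

```c
#include <assert.h>

enum sketch_zero_method {
	SKETCH_ZERO_WRITE,		/* plain write of a zero buffer */
	SKETCH_ZERO_PUNCH_HOLE,		/* regular file: truncate, then punch */
	SKETCH_ZERO_BLKZEROOUT,		/* block device opened with direct I/O */
	SKETCH_ZERO_BLKZEROOUT_INVALIDATE /* buffered bdev: zero out, then
					     drop the cached pages */
};

/* Hypothetical cutoff: below this, zero-out is not worth the overhead. */
#define SKETCH_MIN_ZEROOUT_BLOCKS	16

static enum sketch_zero_method
sketch_pick_zero_method(int is_block_device, int direct_io,
			unsigned long long count)
{
	assert(count > 0);

	/* (a) image files stay sparse: extend if needed, then punch. */
	if (!is_block_device)
		return SKETCH_ZERO_PUNCH_HOLE;

	/* Small ranges (e.g. a single 4k block) just get written. */
	if (count < SKETCH_MIN_ZEROOUT_BLOCKS)
		return SKETCH_ZERO_WRITE;

	/* (b) direct I/O bypasses the buffer cache, so zero-out is safe. */
	if (direct_io)
		return SKETCH_ZERO_BLKZEROOUT;

	/* Otherwise the stale cached copies must be invalidated too. */
	return SKETCH_ZERO_BLKZEROOUT_INVALIDATE;
}
```

Large ranges such as the inode table or the journal would take one of the
zero-out paths, while single-block callers stay on the plain write path.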

Does this make sense?

					- Ted

P.S.  Once we do this, when using mke2fs on a file, we should really
use punch_hole and disable lazy_itable_init, to save I/O bandwidth on
VM's running on cloud systems.

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [PATCH 15/34] libext2fs: support BLKZEROOUT/FALLOC_FL_ZERO_RANGE in ext2fs_zero_blocks
  2014-10-18 16:32   ` Theodore Ts'o
@ 2014-10-20 23:37     ` Darrick J. Wong
  0 siblings, 0 replies; 67+ messages in thread
From: Darrick J. Wong @ 2014-10-20 23:37 UTC (permalink / raw)
  To: Theodore Ts'o; +Cc: linux-ext4

On Sat, Oct 18, 2014 at 12:32:55PM -0400, Theodore Ts'o wrote:
> On Sat, Sep 13, 2014 at 03:12:53PM -0700, Darrick J. Wong wrote:
> > Plumb a new call into the IO manager to support translating
> > ext2fs_zero_blocks calls into the equivalent kernel-level BLKZEROOUT
> > ioctl or FALLOC_FL_ZERO_RANGE fallocate flag primitives when possible.
> > 
> > Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> > ---
> >  contrib/fallocate.c     |   14 +++++++++
> 
> I've separated out the contrib/fallocate change and created a separate
> commit for it, since it really is a separate change.
>
> What I'd like to see for the zero_blocks change io_manager is:
> 
> (a) if we try to zero a range past the end of the file, we should just
> truncate the file to set i_size.  Similarly, if this is a regular
> file, we should try to use PUNCH_HOLE.  We already try to keep a raw
> file system image file to be sparse, so I don't see any real problems
> with this.

Done.  For files, it'll truncate if the file needs to be extended, and then
punch out the zero range.  If punch isn't supported, it'll try zero-range
as a last resort.
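
A minimal Linux-only sketch of that sequence for a regular file, assuming
byte offsets rather than block numbers.  The helper name is hypothetical
and this is not the unix_io.c code; the sketch also falls through on any
fallocate() failure (not just EOPNOTSUPP) and adds a final plain-write
fallback of its own so that it works on filesystems with neither
primitive.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <linux/falloc.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: make [offset, offset + len) read back as zeroes. */
static int sketch_zero_file_range(int fd, off_t offset, off_t len)
{
	struct stat st;
	char zeroes[4096];
	off_t done = 0;

	assert(offset >= 0 && len > 0);
	if (fstat(fd, &st) < 0)
		return -1;

	/* Step 1: extend the file if the range reaches past EOF. */
	if (st.st_size < offset + len && ftruncate(fd, offset + len) < 0)
		return -1;

	/* Step 2: punch a hole, which also keeps the image sparse. */
	if (fallocate(fd, FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE,
		      offset, len) == 0)
		return 0;

	/* Step 3: try zero-range if punching did not work. */
#ifdef FALLOC_FL_ZERO_RANGE
	if (fallocate(fd, FALLOC_FL_ZERO_RANGE, offset, len) == 0)
		return 0;
#endif

	/* Step 4 (added for this sketch): write zeroes the slow way. */
	memset(zeroes, 0, sizeof(zeroes));
	while (done < len) {
		size_t chunk = sizeof(zeroes);
		ssize_t n;

		if ((off_t)chunk > len - done)
			chunk = (size_t)(len - done);
		n = pwrite(fd, zeroes, chunk, offset + done);
		if (n < 0 && errno == EINTR)
			continue;
		if (n <= 0)
			return -1;
		done += n;
	}
	return 0;
}
```

Whichever step succeeds, a subsequent read of the range should return
zeroes and the file size should cover the end of the range.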

> (b) for a block device, if IO_FLAG_DIRECT_IO is set, it should be safe
> to try to use BLKZEROOUT.  If not, we can use
> posix_fadvise(POSIX_FADV_DONTNEED) and verify that this correctly zaps
> the relevant parts of the buffer cache.  If it doesn't do the right
> thing, we can use BLKFLSBUF, which will zap the entire buffer cache
> for the device.  Which is pretty heavy weight, but I really think it
> only makes sense to use zeroout for zeroing the inode table and the
> journal file.

I agree that it makes sense not to zero-out single blocks on bdevs.

> Even if we patch the kernel to make BLKZEROOUT to automatically do
> this, we can't count on it, and in particular if it turns out we have
> to use BLKFLSBUF, we're not going to want to use this for zero'ing a
> single 4k block.  It doesn't happen that often, and I don't think
> there will be much if any measurable difference in performance if we
> use WRITE SAME vs. WRITE for a small region.
> 
> Does this make sense?
> 
> 					- Ted
> 
> P.S.  Once we do this, when using mke2fs on a file, we should really
> use punch_hole and disable lazy_itable_init, to save I/O bandwidth on
> VM's running on cloud systems.

Ok.  I think current mke2fs does this if device discard is turned on.

Curiously, it'll still zero the itable even if itable_zeroed == 1.

--D

^ permalink raw reply	[flat|nested] 67+ messages in thread

end of thread, other threads:[~2014-10-20 23:38 UTC | newest]

Thread overview: 67+ messages
2014-09-13 22:11 [PATCH 00/34] e2fsprogs Summer 2014 patchbomb, part 6 Darrick J. Wong
2014-09-13 22:11 ` [PATCH 01/34] e2fsck: offer to clear overlapping extents Darrick J. Wong
2014-09-19  1:45   ` Theodore Ts'o
2014-09-13 22:11 ` [PATCH 02/34] e2fsck: fix sliding the directory block down on bigalloc Darrick J. Wong
2014-09-19  1:45   ` Theodore Ts'o
2014-09-13 22:11 ` [PATCH 03/34] misc: zero s_jnl_blocks when adding journal online or removing external journal Darrick J. Wong
2014-09-19  1:45   ` Theodore Ts'o
2014-09-13 22:11 ` [PATCH 04/34] libext2fs: ext2fs_new_block2() should call alloc_block hook Darrick J. Wong
2014-09-13 22:11 ` [PATCH 05/34] debugfs: manage needs_recover feature when messing with the journal Darrick J. Wong
2014-09-19  6:01   ` Theodore Ts'o
2014-09-13 22:11 ` [PATCH 06/34] debugfs: add LIBINTL to debugfs link command Darrick J. Wong
2014-09-19  4:46   ` Theodore Ts'o
2014-10-17 21:07     ` Darrick J. Wong
2014-10-18 16:10       ` Theodore Ts'o
2014-09-13 22:11 ` [PATCH 07/34] ext2fs: add readahead method to improve scanning Darrick J. Wong
2014-09-19 16:15   ` Theodore Ts'o
2014-09-13 22:12 ` [PATCH 08/34] libext2fs/e2fsck: provide routines to read-ahead metadata Darrick J. Wong
2014-09-13 22:12 ` [PATCH 09/34] e2fsck: read-ahead metadata during passes 1, 2, and 4 Darrick J. Wong
2014-09-13 22:12 ` [PATCH 10/34] dumpe2fs: provide a machine-readable group-only mode Darrick J. Wong
2014-09-19 16:17   ` Theodore Ts'o
2014-09-13 22:12 ` [PATCH 11/34] dumpe2fs: output cleanup Darrick J. Wong
2014-09-19 16:22   ` Theodore Ts'o
2014-09-19 20:00     ` Darrick J. Wong
2014-10-13 18:04       ` Darrick J. Wong
2014-09-13 22:12 ` [PATCH 12/34] misc: move check_plausibility into a separate file Darrick J. Wong
2014-09-19 22:16   ` Theodore Ts'o
2014-09-13 22:12 ` [PATCH 13/34] misc: add plausibility checks to debugfs/tune2fs/dumpe2fs/e2fsck Darrick J. Wong
2014-09-19 23:00   ` Theodore Ts'o
2014-09-13 22:12 ` [PATCH 14/34] misc: use libmagic when libblkid can't identify something Darrick J. Wong
2014-09-21  5:29   ` Theodore Ts'o
2014-09-13 22:12 ` [PATCH 15/34] libext2fs: support BLKZEROOUT/FALLOC_FL_ZERO_RANGE in ext2fs_zero_blocks Darrick J. Wong
2014-09-22  2:51   ` Theodore Ts'o
2014-09-29 18:58     ` Darrick J. Wong
2014-10-14  2:58   ` Darrick J. Wong
2014-10-18 16:32   ` Theodore Ts'o
2014-10-20 23:37     ` Darrick J. Wong
2014-09-13 22:12 ` [PATCH 16/34] libext2fs/e2fsck: refactor everyone who writes zero blocks to disk Darrick J. Wong
2014-10-13 10:09   ` Theodore Ts'o
2014-10-13 17:09     ` Darrick J. Wong
2014-09-13 22:13 ` [PATCH 17/34] libext2fs: support allocating uninit blocks in bmap2() Darrick J. Wong
2014-10-13 14:35   ` Theodore Ts'o
2014-10-13 16:56     ` Darrick J. Wong
2014-10-13 18:34       ` Darrick J. Wong
2014-09-13 22:13 ` [PATCH 18/34] libext2fs: file IO routines should handle uninit blocks Darrick J. Wong
2014-09-13 22:13 ` [PATCH 19/34] resize2fs: convert fs to and from 64bit mode Darrick J. Wong
2014-09-14 17:34   ` TR Reardon
2014-09-14 17:50     ` Darrick J. Wong
2014-09-13 22:13 ` [PATCH 20/34] resize2fs: adjust reserved_gdt_blocks when changing group descriptor size Darrick J. Wong
2014-09-13 22:13 ` [PATCH 21/34] tests: test resize2fs 32->64 and 64->32bit conversion code Darrick J. Wong
2014-09-13 22:13 ` [PATCH 22/34] libext2fs: find inode goal when allocating blocks Darrick J. Wong
2014-09-13 22:13 ` [PATCH 23/34] libext2fs: find/alloc a range of empty blocks Darrick J. Wong
2014-09-13 22:13 ` [PATCH 24/34] libext2fs: add new hooks to support large allocations Darrick J. Wong
2014-09-13 22:14 ` [PATCH 25/34] libext2fs: implement fallocate Darrick J. Wong
2014-09-13 22:14 ` [PATCH 26/34] libext2fs: use fallocate for creating journals and hugefiles Darrick J. Wong
2014-09-13 22:14 ` [PATCH 27/34] debugfs: implement fallocate Darrick J. Wong
2014-09-13 22:14 ` [PATCH 28/34] tests: test debugfs punch command Darrick J. Wong
2014-09-19 16:26   ` Theodore Ts'o
2014-09-19 20:01     ` Darrick J. Wong
2014-09-13 22:14 ` [PATCH 30/34] fuse2fs: translate ACL structures Darrick J. Wong
2014-09-13 22:14 ` [PATCH 31/34] fuse2fs: handle 64-bit dates correctly Darrick J. Wong
2014-09-13 22:14 ` [PATCH 32/34] fuse2fs: implement fallocate Darrick J. Wong
2014-09-13 22:15 ` [PATCH 34/34] tests: enable using fuse2fs with metadata checksum test Darrick J. Wong
2014-09-14 17:19 ` [PATCH 35/34] e2fsck: free bh when descriptor block checksum fails Darrick J. Wong
2014-09-14 19:11   ` Eric Sandeen
2014-09-19  1:46     ` Theodore Ts'o
2014-09-18 19:09 ` [PATCH 36/34] misc: fix Coverity complaints Darrick J. Wong
2014-09-19  1:47   ` Theodore Ts'o
