* mkfs'ing a 48-bit fs... or not.
From: Eric Sandeen @ 2011-10-03 21:55 UTC
To: ext4 development

Has anyone tried mke2fs at its limits?  The latest git tree seems to fail
in several ways.
(Richard Jones reported the initial failure)

# truncate --size 1152921504606846976 reallybigfile
# mke2fs -t ext4 reallybigfile

first,

Warning: the fs_type huge is not defined in mke2fs.conf

(when types "big" and "huge" got added, they never got a mke2fs.conf update?)

Then, I got:

reallybigfile: Not enough space to build proposed filesystem while setting up superblock

because:

        fs->group_desc_count = (blk_t) ext2fs_div64_ceil(
                ext2fs_blocks_count(super) - super->s_first_data_block,
                EXT2_BLOCKS_PER_GROUP(super));
        if (fs->group_desc_count == 0) {
                retval = EXT2_ET_TOOSMALL;

The div64_ceil returns > 2^32 (2^33, actually), and the cast to blk_t
(which should be dgrp_t?) turns that into a 0.

Trying it with "-O bigalloc" (which should be automatic at this size,
I think?) just goes away for a very long time, I'm not sure what it's
thinking about, or if it's in a loop somewhere (looking now).

I also came across this in ext2fs_initialize() in the bigalloc case:

        if (super->s_clusters_per_group > EXT2_MAX_CLUSTERS_PER_GROUP(super))
                super->s_blocks_per_group = EXT2_MAX_CLUSTERS_PER_GROUP(super);
        super->s_blocks_per_group = EXT2FS_C2B(fs,
                                               super->s_clusters_per_group);

which seems to be incorrect; I doubt that you meant to set
s_blocks_per_group under a conditional, and then unconditionally set it
immediately after.  I assume that should be super->s_clusters_per_group
in the first case?  I'll send a patch, assuming so.

TBH I've kind of lost the thread on bigalloc, so just putting this out
there for now while I look into things a bit more.

-Eric

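The truncation described in this message can be reproduced outside of e2fsprogs. The following is a minimal standalone sketch, not the libext2fs code: div64_ceil(), group_count64() and group_count32() are hypothetical stand-ins, the first data block is assumed to be 0 (as it is for 4k blocks), and uint32_t stands in for the 32-bit blk_t/dgrp_t types.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for ext2fs_div64_ceil(): 64-bit ceiling division. */
static uint64_t div64_ceil(uint64_t a, uint64_t b)
{
    assert(b != 0);
    return a / b + (a % b != 0);
}

/* Number of block groups, kept as a full 64-bit value. */
static uint64_t group_count64(uint64_t fs_bytes, uint32_t block_size,
                              uint32_t blocks_per_group)
{
    uint64_t blocks = fs_bytes / block_size;   /* first data block assumed 0 */
    return div64_ceil(blocks, blocks_per_group);
}

/* The same count after a cast to a 32-bit type: the high bits are dropped. */
static uint32_t group_count32(uint64_t fs_bytes, uint32_t block_size,
                              uint32_t blocks_per_group)
{
    return (uint32_t) group_count64(fs_bytes, block_size, blocks_per_group);
}
```

For the 2^60-byte file from the report, with 4k blocks and 32768 blocks per group, group_count64() gives 2^33 and group_count32() gives 0, which is the value that triggers the EXT2_ET_TOOSMALL path.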
* Re: mkfs'ing a 48-bit fs... or not.
From: Ted Ts'o @ 2011-10-04  4:00 UTC
To: Eric Sandeen
Cc: ext4 development

On Mon, Oct 03, 2011 at 04:55:11PM -0500, Eric Sandeen wrote:
> Has anyone tried mke2fs at its limits?  The latest git tree seems to fail in several ways.
> (Richard Jones reported the initial failure)
>
> # truncate --size 1152921504606846976 reallybigfile
> # mke2fs -t ext4 reallybigfile
>
> first,
>
> Warning: the fs_type huge is not defined in mke2fs.conf
>
> (when types "big" and "huge" got added, they never got a mke2fs.conf update?)

It used to be that an undefined file system type didn't flag an error.
It now does, so we should have definitions for them in mke2fs.conf.

> reallybigfile: Not enough space to build proposed filesystem while setting up superblock
>
> because:
>
>         fs->group_desc_count = (blk_t) ext2fs_div64_ceil(
>                 ext2fs_blocks_count(super) - super->s_first_data_block,
>                 EXT2_BLOCKS_PER_GROUP(super));
>         if (fs->group_desc_count == 0) {
>                 retval = EXT2_ET_TOOSMALL;
>
> The div64_ceil returns > 2^32 (2^33, actually), and the cast to blk_t
> (which should be dgrp_t?) turns that into a 0.

Yep, that should be dgrp_t.  Oops.

> Trying it with "-O bigalloc" (which should be automatic at this size,
> I think?) just goes away for a very long time, I'm not sure what it's
> thinking about, or if it's in a loop somewhere (looking now).

Well, we probably do want to engage bigalloc automatically, at some
point (I want to wait until bigalloc is in commonly used kernels, at
least for community distro's).  I'm not sure what the best cluster
size to pick by default should be, though.  16k?  64k?

                                        - Ted

* [PATCH 1/2] Add "big" and "huge" types to mke2fs.conf
From: Theodore Ts'o @ 2011-10-04  4:26 UTC
To: Ext4 Developers List
Cc: sandeen, Theodore Ts'o

mke2fs attempts to use the "big" and "huge" types, and now that mke2fs
will complain if there are file system types which are undefined,
let's add definitions for them.

Thanks to Richard Jones for reporting this problem.

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
---
 misc/mke2fs-hurd.conf |    6 ++++++
 misc/mke2fs.conf      |    6 ++++++
 2 files changed, 12 insertions(+), 0 deletions(-)

diff --git a/misc/mke2fs-hurd.conf b/misc/mke2fs-hurd.conf
index 52ed7e5..4f0527d 100644
--- a/misc/mke2fs-hurd.conf
+++ b/misc/mke2fs-hurd.conf
@@ -21,6 +21,12 @@
         floppy = {
                 inode_ratio = 8192
         }
+        big = {
+                inode_ratio = 32768
+        }
+        huge = {
+                inode_ratio = 65536
+        }
         news = {
                 inode_ratio = 4096
         }
diff --git a/misc/mke2fs.conf b/misc/mke2fs.conf
index 775e046..0871f77 100644
--- a/misc/mke2fs.conf
+++ b/misc/mke2fs.conf
@@ -30,6 +30,12 @@
                 inode_size = 128
                 inode_ratio = 8192
         }
+        big = {
+                inode_ratio = 32768
+        }
+        huge = {
+                inode_ratio = 65536
+        }
         news = {
                 inode_ratio = 4096
         }
-- 
1.7.4.1.22.gec8e1.dirty

* [PATCH 2/2] libext2fs: fix bad cast which causes problems for file systems > 512EB
From: Theodore Ts'o @ 2011-10-04  4:27 UTC
To: Ext4 Developers List
Cc: sandeen, Theodore Ts'o

If the number of block groups exceeds 2**32, a bad cast would lead to
a bogus "Not enough space to build proposed filesystem while setting
up superblock" failure.

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
---
 lib/ext2fs/initialize.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/lib/ext2fs/initialize.c b/lib/ext2fs/initialize.c
index 2875f97..b050a0a 100644
--- a/lib/ext2fs/initialize.c
+++ b/lib/ext2fs/initialize.c
@@ -248,7 +248,7 @@ errcode_t ext2fs_initialize(const char *name, int flags,
         }
 
 retry:
-        fs->group_desc_count = (blk_t) ext2fs_div64_ceil(
+        fs->group_desc_count = (dgrp_t) ext2fs_div64_ceil(
                 ext2fs_blocks_count(super) - super->s_first_data_block,
                 EXT2_BLOCKS_PER_GROUP(super));
         if (fs->group_desc_count == 0) {
-- 
1.7.4.1.22.gec8e1.dirty

* Re: [PATCH 2/2] libext2fs: fix bad cast which causes problems for file systems > 512EB
From: Eric Sandeen @ 2011-10-04 11:47 UTC
To: Theodore Ts'o
Cc: Ext4 Developers List

On 10/3/11 11:27 PM, Theodore Ts'o wrote:
> If the number of block groups exceeds 2**32, a bad cast would lead to
> a bogus "Not enough space to build proposed filesystem while setting
> up superblock" failure.

It's the proper cast now, but I don't think it fixes the problem, since they
are both __u32...

But in any case, for the actual change at least:

Reviewed-by: Eric Sandeen <sandeen@redhat.com>

> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
> ---
>  lib/ext2fs/initialize.c |    2 +-
>  1 files changed, 1 insertions(+), 1 deletions(-)
>
> diff --git a/lib/ext2fs/initialize.c b/lib/ext2fs/initialize.c
> index 2875f97..b050a0a 100644
> --- a/lib/ext2fs/initialize.c
> +++ b/lib/ext2fs/initialize.c
> @@ -248,7 +248,7 @@ errcode_t ext2fs_initialize(const char *name, int flags,
>          }
>
>  retry:
> -        fs->group_desc_count = (blk_t) ext2fs_div64_ceil(
> +        fs->group_desc_count = (dgrp_t) ext2fs_div64_ceil(
>                  ext2fs_blocks_count(super) - super->s_first_data_block,
>                  EXT2_BLOCKS_PER_GROUP(super));
>          if (fs->group_desc_count == 0) {

* Re: [PATCH 2/2] libext2fs: fix bad cast which causes problems for file systems > 512EB
From: Ted Ts'o @ 2011-10-04 18:05 UTC
To: Eric Sandeen
Cc: Richard W.M. Jones, Ext4 Developers List

On Tue, Oct 04, 2011 at 06:47:12AM -0500, Eric Sandeen wrote:
> On 10/3/11 11:27 PM, Theodore Ts'o wrote:
> > If the number of block groups exceeds 2**32, a bad cast would lead to
> > a bogus "Not enough space to build proposed filesystem while setting
> > up superblock" failure.
>
> It's the proper cast now, but I don't think it fixes the problem, since they
> are both __u32...

Hmm, yes.

And to be quite honest I'm not sure it's worth fixing.  2**32 block
groups gets us up to 2**59 bytes assuming 4k blocks.  The theoretical
maximum given the current extent tree format is 2**60 assuming 4k
blocks.  So changing dgrp_t to be 64-bits just to get that last power
of two (i.e., from 512PB to a full EB) doesn't seem worth it.  Simply
using a bigalloc cluster size of 8k would make the problem go away
(and arguably we'd probably want a large cluster size if someone
wanted to create a file system that big anyway).

So maybe we should just check to see if the required number of block
groups is greater than 2**32, and if so, give an error.

                                        - Ted

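The check suggested at the end of this message could look roughly like the sketch below. It is an illustration under stated assumptions, not a proposed libext2fs patch: checked_group_count() and CLUSTERS_PER_GROUP are hypothetical names, each group is assumed to cover 32768 clusters (one 4k bitmap block), and a cluster size equal to the 4k block size stands for the non-bigalloc case.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed: one 4k bitmap block per group, so 32768 clusters per group. */
#define CLUSTERS_PER_GROUP 32768u

/*
 * Hypothetical helper: compute the number of block groups needed for
 * fs_bytes and refuse any count that does not fit in a 32-bit counter.
 * Returns 0 and stores the count in *out on success, -1 otherwise.
 */
static int checked_group_count(uint64_t fs_bytes, uint32_t cluster_size,
                               uint32_t *out)
{
    assert(cluster_size != 0 && out != 0);

    uint64_t group_bytes = (uint64_t) CLUSTERS_PER_GROUP * cluster_size;
    uint64_t groups = fs_bytes / group_bytes + (fs_bytes % group_bytes != 0);

    if (groups > UINT32_MAX)
        return -1;              /* too many groups: report an error */
    *out = (uint32_t) groups;
    return 0;
}
```

With these assumptions, 2^60 bytes at a 4k cluster size needs 2^33 groups and is rejected, while a 16k cluster size needs 2^31 groups and is accepted. Under this strict comparison an 8k cluster size at exactly 2^60 bytes yields 2^32 groups, one more than a 32-bit counter can hold, so it is rejected as well.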
* Re: [PATCH 2/2] libext2fs: fix bad cast which causes problems for file systems > 512EB
From: Eric Sandeen @ 2011-10-04 18:15 UTC
To: Ted Ts'o
Cc: Richard W.M. Jones, Ext4 Developers List

On 10/4/11 1:05 PM, Ted Ts'o wrote:
> On Tue, Oct 04, 2011 at 06:47:12AM -0500, Eric Sandeen wrote:
>> On 10/3/11 11:27 PM, Theodore Ts'o wrote:
>>> If the number of block groups exceeds 2**32, a bad cast would lead to
>>> a bogus "Not enough space to build proposed filesystem while setting
>>> up superblock" failure.
>>
>> It's the proper cast now, but I don't think it fixes the problem, since they
>> are both __u32...
>
> Hmm, yes.
>
> And to be quite honest I'm not sure it's worth fixing.  2**32 block
> groups gets us up to 2**59 bytes assuming 4k blocks.  The theoretical
> maximum given the current extent tree format is 2**60 assuming 4k
> blocks.  So changing dgrp_t to be 64-bits just to get that last power
> of two (i.e., from 512PB to a full EB) doesn't seem worth it.  Simply
> using a bigalloc cluster size of 8k would make the problem go away
> (and arguably we'd probably want a large cluster size if someone
> wanted to create a file system that big anyway).
>
> So maybe we should just check to see if the required number of block
> groups is greater than 2**32, and if so, give an error.
>
>                                         - Ted
>

As long as we have a consistent, predictable, well-designed and
well-understood maximum (theoretical) size for the fs, I'm all
for documenting & enforcing it.

TBH I'm still trying to get all the moving parts together in
my head, between meta_bg & bigalloc & whatnot, at these sizes.
The initialization functions are looking pretty ad-hoc to me
right now.  :)

-Eric

* Re: mkfs'ing a 48-bit fs... or not.
From: Andreas Dilger @ 2011-10-04  5:31 UTC
To: Ted Ts'o
Cc: Eric Sandeen, ext4 development

On 2011-10-03, at 10:00 PM, Ted Ts'o <tytso@mit.edu> wrote:
> On Mon, Oct 03, 2011 at 04:55:11PM -0500, Eric Sandeen wrote:
>> Has anyone tried mke2fs at its limits?  The latest git tree seems to fail in several ways.
>> (Richard Jones reported the initial failure)
>>
>> # truncate --size 1152921504606846976 reallybigfile
>> # mke2fs -t ext4 reallybigfile
>>
>> first,
>>
>> Warning: the fs_type huge is not defined in mke2fs.conf
>>
>> (when types "big" and "huge" got added, they never got a mke2fs.conf update?)
>
> It used to be that an undefined file system type didn't flag an error.
> It now does, so we should have definitions for them in mke2fs.conf.
>
>> reallybigfile: Not enough space to build proposed filesystem while setting up superblock

Isn't there also a problem with the number of block group descriptor
blocks in the first group, if meta_bg is not used?  With 64-byte group
descriptors per 128MB group this is 1024 bytes of descriptors for 2GB
of blocks, or 128MB of descriptors for 256TB of blocks.

At this point group 0 is full of primary block descriptors and group 1
is full of backup descriptors, and we are out of luck to make a larger
filesystem.  That is only 2^48 bytes, not 2^48 blocks (2^60 bytes), so
it means meta_bg needs to get into more testing, and online resize with
flex_bg needs to move forward.

>> because:
>>
>>         fs->group_desc_count = (blk_t) ext2fs_div64_ceil(
>>                 ext2fs_blocks_count(super) - super->s_first_data_block,
>>                 EXT2_BLOCKS_PER_GROUP(super));
>>         if (fs->group_desc_count == 0) {
>>                 retval = EXT2_ET_TOOSMALL;
>>
>> The div64_ceil returns > 2^32 (2^33, actually), and the cast to blk_t
>> (which should be dgrp_t?) turns that into a 0.
>
> Yep, that should be dgrp_t.  Oops.
>
>> Trying it with "-O bigalloc" (which should be automatic at this size,
>> I think?) just goes away for a very long time, I'm not sure what it's
>> thinking about, or if it's in a loop somewhere (looking now).
>
> Well, we probably do want to engage bigalloc automatically, at some
> point (I want to wait until bigalloc is in commonly used kernels, at
> least for community distro's).  I'm not sure what the best cluster
> size to pick by default should be, though.  16k?  64k?
>
>                    - Ted
> --
> To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

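The descriptor arithmetic in this message can be checked with a few lines of C. This is only a worked calculation under the figures quoted above (64-byte descriptors, 128MB groups, no meta_bg); the function and macro names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define GROUP_BYTES (128ULL << 20)  /* 128MB per group: 32768 blocks of 4k */
#define DESC_BYTES  64ULL           /* 64-byte group descriptor */

/* Bytes of group descriptors needed to describe a filesystem of fs_bytes. */
static uint64_t descriptor_bytes(uint64_t fs_bytes)
{
    uint64_t groups = fs_bytes / GROUP_BYTES + (fs_bytes % GROUP_BYTES != 0);
    assert(groups <= UINT64_MAX / DESC_BYTES);  /* guard the multiplication */
    return groups * DESC_BYTES;
}

/*
 * Largest filesystem whose descriptor table still fits inside a single
 * group, i.e. the point at which group 0 holds nothing but descriptors.
 */
static uint64_t max_fs_bytes_without_meta_bg(void)
{
    return (GROUP_BYTES / DESC_BYTES) * GROUP_BYTES;
}
```

descriptor_bytes() gives 1024 for a 2GB filesystem and 128MB for a 256TB one, and max_fs_bytes_without_meta_bg() evaluates to 2^48 bytes, matching the limit stated in the message.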
* Re: mkfs'ing a 48-bit fs... or not.
From: Eric Sandeen @ 2011-10-04  4:03 UTC
To: Eric Sandeen
Cc: ext4 development

On 10/3/11 4:55 PM, Eric Sandeen wrote:
> Has anyone tried mke2fs at its limits?  The latest git tree seems to fail in several ways.
> (Richard Jones reported the initial failure)
>
> # truncate --size 1152921504606846976 reallybigfile
> # mke2fs -t ext4 reallybigfile

...

> Trying it with "-O bigalloc" (which should be automatic at this size,
> I think?) just goes away for a very long time, I'm not sure what it's
> thinking about, or if it's in a loop somewhere (looking now).

It comes up with too many inodes, then tries to reduce the count,
but the "waste not want not" logic bumps it back up... ipg eventually
goes "below" 0 but it's unsigned so it goes on in this loop forever.

Some of this is my fault... I put that retry logic in years ago.  :(

I'll see what I can do to fix it up.

-Eric

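The unsigned wrap-around described here is easy to demonstrate in isolation. The sketch below is not the mke2fs retry logic; reduce_unguarded() and reduce_guarded() are hypothetical helpers that only show why a decrement on an unsigned inodes-per-group value can never go "below" 0, and one way a caller could detect that case instead of looping.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Unguarded: subtracting past zero wraps around modulo 2^32. */
static uint32_t reduce_unguarded(uint32_t ipg, uint32_t step)
{
    return ipg - step;      /* becomes a huge value when step > ipg */
}

/*
 * Guarded: refuse to reduce to zero or past it, so the caller can stop
 * retrying. Returns 0 on success, -1 if no further reduction is possible.
 */
static int reduce_guarded(uint32_t *ipg, uint32_t step)
{
    assert(ipg != NULL);
    if (step >= *ipg)
        return -1;
    *ipg -= step;
    return 0;
}
```

A retry loop built on the unguarded form sees a very large count after the wrap and keeps trying to shrink it; the guarded form gives the loop an explicit exit condition.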
* Re: mkfs'ing a 48-bit fs... or not.
From: Ted Ts'o @ 2011-10-04  4:28 UTC
To: Eric Sandeen
Cc: Eric Sandeen, ext4 development

On Mon, Oct 03, 2011 at 11:03:40PM -0500, Eric Sandeen wrote:
> It comes up with too many inodes, then tries to reduce the count,
> but the "waste not want not" logic bumps it back up... ipg eventually
> goes "below" 0 but it's unsigned so it goes on in this loop forever.

Oh, this is because of the fact that we can't have more than 2**32
inodes, right?  Doh!

> Some of this is my fault... I put that retry logic in years ago.  :(
>
> I'll see what I can do to fix it up.

Many thanks.  I've fixed the other issues you've pointed out.  Check
out the next branch on github...

                                        - Ted

* Re: mkfs'ing a 48-bit fs... or not.
From: Richard W.M. Jones @ 2011-10-04  7:06 UTC
Cc: ext4 development

Thanks Eric.  Here is the original thread (see also the replies).

https://lists.fedoraproject.org/pipermail/devel/2011-October/157618.html

In theory I could test this up to ~ 2**63, but it requires a number of
bugfixes and changes in qemu.  Obviously that size is ridiculous :-)
but it may reveal bugs that wouldn't be found by ordinary testing.

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
virt-p2v converts physical machines to virtual machines.  Boot with a
live CD or over the network (PXE) and turn machines into Xen guests.
http://et.redhat.com/~rjones/virt-p2v
