From: Daniel Phillips <phillips@bonn-fries.net>
To: Andreas Dilger <adilger@clusterfs.com>, davidm@hpl.hp.com
Cc: Peter Chubb <peter@chubb.wattle.id.au>,
Jeremy Andrews <jeremy@kerneltrap.org>,
linux-kernel@vger.kernel.org, ext2-devel@lists.sourceforge.net
Subject: Re: [PATCH] remove 2TB block device limit
Date: Thu, 16 May 2002 22:22:31 +0200
Message-ID: <E178Rlf-0008Tj-00@starship>
In-Reply-To: <15579.16423.930012.986750@wombat.chubb.wattle.id.au> <15580.24766.424170.333718@napali.hpl.hp.com> <20020515221733.GG12975@turbolinux.com>
On Thursday 16 May 2002 00:17, Andreas Dilger wrote:
> On May 10, 2002 17:07 -0700, David Mosberger wrote:
> > On Fri, 10 May 2002 17:46:23 -0600, Andreas Dilger <adilger@clusterfs.com> said:
> > Andreas> For 64-bit systems like Alpha, it is relatively easy to use
> > Andreas> 8kB blocks for ext3. It has been discouraged because such
> > Andreas> a filesystem is non-portable to other (smaller page-sized)
> > Andreas> filesystems. Maybe this rationale should be re-examined -
> > Andreas> I could probably whip up a configure option for e2fsprogs
> > Andreas> to allow 8kB blocks in a few hours.
> >
> > If you do this, please consider allowing a block size up to 64KB.
> > The ia64 kernel offers a choice of 4, 8, 16, and 64KB page size.
>
> Well, taking a look at the ext2 code, there is a slight problem when
> trying to use block sizes > 8kB. This is in the group descriptors,
> where they only store a 16-bit count of free blocks and inodes for
> the group. Since the maximum number of blocks/inodes is 8*blocksize
> (the number of bits that can fit into a single block) you overflow
> these fields if you have more than 64k (8*8k) blocks in a group.
>
> Even 8kB blocks would theoretically overflow these fields, but you
> can't yet have a group _totally_ empty (there are always two bitmaps
> and at least one inode table block), so it would always have less
> than 65535 blocks free. Now I realize that this isn't true of the
> inode table in theory, but you normally also have less than the maximum
> number of inodes per group - need to check for that.
>
> This could be worked around temporarily by limiting the size of each
> group to at most 65535 free blocks/inodes. The permanent solution is
> to probably add an extra byte for each of these two fields to allow up
> to 16M blocks/inodes per group, which gives us a max block size of 2MB.
>
> This could be a compat ext2 feature, since at worst if we didn't take
> the high byte into account on a block free it could overflow this field
> and we wouldn't be able to allocate from this group until more blocks
> are freed. We couldn't underflow because the allocator would stop when
> the free block/inode count hit zero for that group, even if there were
> really more free blocks available.
>
> So, for now I think I'll stick to a maximum of 8kB blocks, and maybe
> we can slip in support for the high byte of the free blocks/inodes
> count when Ted adds in support for metagroups.
Hi Andreas,
Imposing an absolute upper limit of 2**16 blocks per group makes the most
sense for now, and may always make the most sense. Even with a cap on the
blocks per group, group size still scales directly with block size. We
don't want it to scale quadratically, as it would if blocks per group kept
growing as 8*blocksize. If it did, then with 64KB blocks a data block could
end up 32 GB away from its inode, still in the same group. This
effectively destroys the utility of block groups as a means of reducing
seek latency.
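
To make the arithmetic concrete, here is a rough sketch (not kernel or
e2fsprogs code; the function and constant names are made up for
illustration). It assumes, as in the quoted text, that an uncapped group
holds 8*blocksize blocks, i.e. one block's worth of bitmap bits, and it
compares the resulting group span with and without a 2**16 cap. It also
reports whether the uncapped block count fits in a 16-bit field.

```python
BITS_PER_BYTE = 8
GROUP_CAP = 2**16  # the proposed absolute cap on blocks per group


def bitmap_limit(block_size):
    """Blocks that a single-block bitmap can track: 8 * block_size."""
    return BITS_PER_BYTE * block_size


def group_bytes(block_size, capped):
    """Bytes spanned by one block group, with or without the 2**16 cap."""
    blocks = bitmap_limit(block_size)
    if capped:
        blocks = min(blocks, GROUP_CAP)
    return blocks * block_size


if __name__ == "__main__":
    for kb in (1, 4, 8, 16, 64):
        bs = kb * 1024
        uncapped_mb = group_bytes(bs, capped=False) // 2**20
        capped_mb = group_bytes(bs, capped=True) // 2**20
        # A 16-bit counter holds at most 0xFFFF, so 8 * 8192 = 65536
        # already exceeds it if the group were completely free.
        fits = bitmap_limit(bs) <= 0xFFFF
        print(f"{kb:>2} KB blocks: uncapped {uncapped_mb:>6} MB, "
              f"capped {capped_mb:>5} MB, fits 16-bit count: {fits}")
```

Without the cap, the group span grows with the square of the block size
(8 MB at 1KB blocks, 32 GB at 64KB blocks). With the cap it grows only
linearly once the cap takes effect, reaching 4 GB at 64KB blocks.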
--
Daniel