* s_first_meta_bg treatment incompatibility between kernel and e2fsprogs
@ 2009-11-11 12:09 Damien Guibouret
2009-11-15 4:20 ` Theodore Tso
0 siblings, 1 reply; 5+ messages in thread
From: Damien Guibouret @ 2009-11-11 12:09 UTC (permalink / raw)
To: linux-ext4
Hello,
I have taken a look at META_BG feature and think there is some
incoherency between kernel and e2fsprogs about s_first_meta_bg handling.
When considering initialisation of unitialised block bitmaps for groups
before first meta group one:
- kernel considers that descriptors blocks occupy
EXT4_SB(sb)->s_gdb_count blocks (see ext4_bg_num_gdb_nometa at
balloc.c:764 that is indirectly called from ext4_init_block_bitmap),
s_gdb_count being the number of blocks to store all descriptors
(computed from super.c).
- e2fsprogs considers that descriptors blocks occupy s_first_meta_bg
(see ext2fs_reserve_super_and_bgd at alloc_sb.c:55-68).
The difference of behaviour is wrong as e2fsck will certainly complain
there is a bitmap marked unused when it should be if the bitmap was
initialised by kernel and number of descriptors blocks is lower than
s_first_meta_bg or reverse if number of descriptors blocks is higher
than s_first_meta_bg.
So, the kernel behaviour seems to be wrong in the META_BG case when
s_first_meta_bg is not 0: ext4_bg_num_gdb returns either 1 (META_BG
feature present and group being one of the meta group) or s_gdb_count
(META_BG feature not present or group being one before first meta
group). As s_gdb_count is number of blocks for all groups, I think it
should returns either 1 (META_BG present and group being one of the meta
group) or s_first_meta_bg (META_BG present and group being one before
first meta group) or s_gdb_count (META_BG not set).
I see also that the resize2fs does not handle the s_first_meta_bg flag
in case a filesystem is shrunk such as number of descriptor blocks goes
below s_first_meta_bg. To what I looked, it does not seem to be a
problem (apart from the problem in kernel described above), but I did
not perform a complete check about that. At least there is some blocks
still allocated when there is no more need for that (but e2fsck does not
complain as it uses also the s_first_meta_bg value). I do not know if it
is desired behaviour. In case the s_first_meta_bg is lowered and blocks
freed, it will certainly be better to add a check into e2fsck to check
that s_first_meta_bg is coherent with number of descriptor blocks
(s_first_meta_bg <= fs->desc_blocks).
If you want to perform some tests on that, I modified tune2fs to allow
setting the META_BG flag on a filesystem that does not have it with
setting s_first_meta_bg to the current number of blocks for descriptors
(it is how I understand META_BG/s_first_meta_group should be used).
Regards,
Damien
*** misc/tune2fs.old 2009-11-11 12:20:33.698192912 +0100
--- misc/tune2fs.c 2009-11-11 11:38:20.265333248 +0100
***************
*** 121,126 ****
--- 121,127 ----
EXT2_FEATURE_COMPAT_DIR_INDEX,
/* Incompat */
EXT2_FEATURE_INCOMPAT_FILETYPE |
+ EXT2_FEATURE_INCOMPAT_META_BG |
EXT3_FEATURE_INCOMPAT_EXTENTS |
EXT4_FEATURE_INCOMPAT_FLEX_BG,
/* R/O compat */
***************
*** 418,423 ****
--- 419,440 ----
}
}
+ if (FEATURE_ON(E2P_FEATURE_INCOMPAT, EXT2_FEATURE_INCOMPAT_META_BG)) {
+ if (mount_flags & EXT2_MF_MOUNTED) {
+ fputs(_("The meta_bg feature may only be "
+ "set when the filesystem is\n"
+ "unmounted.\n"), stderr);
+ exit(1);
+ }
+ if (sb->s_feature_compat & EXT2_FEATURE_COMPAT_RESIZE_INODE) {
+ fputs(_("The meta_bg feature cannot be "
+ "set when the resize_inode is\n"
+ "set.\n"), stderr);
+ exit(1);
+ }
+ sb->s_first_meta_bg = fs->desc_blocks;
+ }
+
if (FEATURE_OFF(E2P_FEATURE_RO_INCOMPAT,
EXT4_FEATURE_RO_COMPAT_HUGE_FILE)) {
if ((mount_flags & EXT2_MF_MOUNTED) &&
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: s_first_meta_bg treatment incompatibility between kernel and e2fsprogs
2009-11-11 12:09 s_first_meta_bg treatment incompatibility between kernel and e2fsprogs Damien Guibouret
@ 2009-11-15 4:20 ` Theodore Tso
2009-11-15 10:28 ` Damien Guibouret
0 siblings, 1 reply; 5+ messages in thread
From: Theodore Tso @ 2009-11-15 4:20 UTC (permalink / raw)
To: Damien Guibouret; +Cc: linux-ext4
On Wed, Nov 11, 2009 at 01:09:00PM +0100, Damien Guibouret wrote:
> Hello,
>
> I have taken a look at META_BG feature and think there is some
> incoherency between kernel and e2fsprogs about s_first_meta_bg handling.
>
> When considering initialisation of unitialised block bitmaps for groups
> before first meta group one:
> - kernel considers that descriptors blocks occupy
> EXT4_SB(sb)->s_gdb_count blocks (see ext4_bg_num_gdb_nometa at
> balloc.c:764 that is indirectly called from ext4_init_block_bitmap),
> s_gdb_count being the number of blocks to store all descriptors
> (computed from super.c).
> - e2fsprogs considers that descriptors blocks occupy s_first_meta_bg
> (see ext2fs_reserve_super_and_bgd at alloc_sb.c:55-68).
Yup, you're right. Ouch. That is a kernel bug, and it means that if
we resize a filesystem to the point where we need to use meta_bg
(because we've run out of blocks to reserve), if there are
uninitialized block bitmaps, kernels that don't have a fix will
misbehave by reserving too many file system metadata blocks. This
will waste bit of disk space, which fsck will fix.
(s_first_meta_bg by definition is always less than or equal to
s_gdb_count.)
I think this patch should fix things up.
- Ted
commit b33c339814f97fc48a843f45f6068f84bc735141
Author: Theodore Ts'o <tytso@mit.edu>
Date: Sat Nov 14 23:20:30 2009 -0500
ext4: Fix uninit block bitmap initialization when s_meta_first_bg is non-zero
The number of old-style block group descriptor blocks is
s_meta_first_bg when the meta_bg feature flag is set.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
diff --git a/fs/ext4/balloc.c b/fs/ext4/balloc.c
index 1d04189..f3032c9 100644
--- a/fs/ext4/balloc.c
+++ b/fs/ext4/balloc.c
@@ -761,7 +761,13 @@ static unsigned long ext4_bg_num_gdb_meta(struct super_block *sb,
static unsigned long ext4_bg_num_gdb_nometa(struct super_block *sb,
ext4_group_t group)
{
- return ext4_bg_has_super(sb, group) ? EXT4_SB(sb)->s_gdb_count : 0;
+ if (!ext4_bg_has_super(sb, group))
+ return 0;
+
+ if (EXT4_HAS_INCOMPAT_FEATURE(sb,EXT4_FEATURE_INCOMPAT_META_BG))
+ return le32_to_cpu(EXT4_SB(sb)->s_es->s_first_meta_bg);
+ else
+ return EXT4_SB(sb)->s_gdb_count;
}
/**
^ permalink raw reply related [flat|nested] 5+ messages in thread
* Re: s_first_meta_bg treatment incompatibility between kernel and e2fsprogs
2009-11-15 4:20 ` Theodore Tso
@ 2009-11-15 10:28 ` Damien Guibouret
2009-11-15 19:23 ` Theodore Tso
0 siblings, 1 reply; 5+ messages in thread
From: Damien Guibouret @ 2009-11-15 10:28 UTC (permalink / raw)
To: Theodore Tso; +Cc: linux-ext4
Theodore Tso wrote:
> On Wed, Nov 11, 2009 at 01:09:00PM +0100, Damien Guibouret wrote:
>
> Yup, you're right. Ouch. That is a kernel bug, and it means that if
> we resize a filesystem to the point where we need to use meta_bg
> (because we've run out of blocks to reserve), if there are
> uninitialized block bitmaps, kernels that don't have a fix will
> misbehave by reserving too many file system metadata blocks. This
> will waste bit of disk space, which fsck will fix.
>
> (s_first_meta_bg by definition is always less than or equal to
> s_gdb_count.)
>
> I think this patch should fix things up.
>
> - Ted
>
> commit b33c339814f97fc48a843f45f6068f84bc735141
> Author: Theodore Ts'o <tytso@mit.edu>
> Date: Sat Nov 14 23:20:30 2009 -0500
>
> ext4: Fix uninit block bitmap initialization when s_meta_first_bg is non-zero
>
> The number of old-style block group descriptor blocks is
> s_meta_first_bg when the meta_bg feature flag is set.
>
> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
>
> diff --git a/fs/ext4/balloc.c b/fs/ext4/balloc.c
> index 1d04189..f3032c9 100644
> --- a/fs/ext4/balloc.c
> +++ b/fs/ext4/balloc.c
> @@ -761,7 +761,13 @@ static unsigned long ext4_bg_num_gdb_meta(struct super_block *sb,
> static unsigned long ext4_bg_num_gdb_nometa(struct super_block *sb,
> ext4_group_t group)
> {
> - return ext4_bg_has_super(sb, group) ? EXT4_SB(sb)->s_gdb_count : 0;
> + if (!ext4_bg_has_super(sb, group))
> + return 0;
> +
> + if (EXT4_HAS_INCOMPAT_FEATURE(sb,EXT4_FEATURE_INCOMPAT_META_BG))
> + return le32_to_cpu(EXT4_SB(sb)->s_es->s_first_meta_bg);
> + else
> + return EXT4_SB(sb)->s_gdb_count;
> }
>
> /**
>
>
Hello,
I've open a kernel bug since:
http://bugzilla.kernel.org/show_bug.cgi?id=14601
with a proposed patch (little different from yours but it is matter of
taste :)
And I think there is some other places where kernel should be fixed when
it uses s_gdb_count (but here my knowledge of the sources are not deep
enough to be sure on what shall be performed).
Regards,
Damien
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: s_first_meta_bg treatment incompatibility between kernel and e2fsprogs
2009-11-15 10:28 ` Damien Guibouret
@ 2009-11-15 19:23 ` Theodore Tso
2009-11-16 15:51 ` Damien Guibouret
0 siblings, 1 reply; 5+ messages in thread
From: Theodore Tso @ 2009-11-15 19:23 UTC (permalink / raw)
To: Damien Guibouret; +Cc: linux-ext4
On Sun, Nov 15, 2009 at 11:28:13AM +0100, Damien Guibouret wrote:
> I've open a kernel bug since:
> http://bugzilla.kernel.org/show_bug.cgi?id=14601
> with a proposed patch (little different from yours but it is matter of
> taste :)
For future references, patches are less likely to slip through the
cracks if they are sent to the linux-ext4 mailing list as opposed to
having a BZ bug opened. (Yeah, I know, that's unusual). The reason
for that is that patches are tracked via patchwork, here:
http://patchwork.ozlabs.org/project/linux-ext4
Basically, anything that looks like a patch which is sent to
linux-ext4 gets snagged by patchwork, and it's a good place to look
for stuff that hasn't yet been merged. In some cases there are good
reasons why a patch has been kept out, and in other cases patches have
been merged or definitely rejected and I don't get to getting that
status updated in patchwork, but overall I've found it to work very
well.
As far as the matter of taste issue is concerned, I think we already
have too many static functions with a single caller, and it actually
makes the code harder to understand. So adding yet another simple
static function I think is a bad thing, not a good thing.
> And I think there is some other places where kernel should be fixed when
> it uses s_gdb_count (but here my knowledge of the sources are not deep
> enough to be sure on what shall be performed).
I've looked through the other areas, and the one place where I see a
problem is in the block validity checks in ext4_iget() for the
extended attribute block and in block_validity.c. The former can and
should be fixed to use the latter.
Here's the fix that I plan to be using. Comments, anyone?
- Ted
ext4: fix block validity checks so they work correctly with meta_bg
The block validity checks used by ext4_data_block_valid() wasn't
correctly written to check file systems with the meta_bg feature. Fix
this.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
---
fs/ext4/block_validity.c | 2 +-
fs/ext4/inode.c | 5 +----
2 files changed, 2 insertions(+), 5 deletions(-)
diff --git a/fs/ext4/block_validity.c b/fs/ext4/block_validity.c
index 50784ef..dc79b75 100644
--- a/fs/ext4/block_validity.c
+++ b/fs/ext4/block_validity.c
@@ -160,7 +160,7 @@ int ext4_setup_system_zone(struct super_block *sb)
if (ext4_bg_has_super(sb, i) &&
((i < 5) || ((i % flex_size) == 0)))
add_system_zone(sbi, ext4_group_first_block_no(sb, i),
- sbi->s_gdb_count + 1);
+ ext4_bg_num_gdb(sb, i) + 1);
gdp = ext4_get_group_desc(sb, i, NULL);
ret = add_system_zone(sbi, ext4_block_bitmap(sb, gdp), 1);
if (ret)
diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index b5cdb88..c62ca93 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -4886,10 +4886,7 @@ struct inode *ext4_iget(struct super_block *sb, unsigned long ino)
ret = 0;
if (ei->i_file_acl &&
- ((ei->i_file_acl <
- (le32_to_cpu(EXT4_SB(sb)->s_es->s_first_data_block) +
- EXT4_SB(sb)->s_gdb_count)) ||
- (ei->i_file_acl >= ext4_blocks_count(EXT4_SB(sb)->s_es)))) {
+ !ext4_data_block_valid(EXT4_SB(sb), ei->i_file_acl, 1)) {
ext4_error(sb, __func__,
"bad extended attribute block %llu in inode #%lu",
ei->i_file_acl, inode->i_ino);
^ permalink raw reply related [flat|nested] 5+ messages in thread
* Re: s_first_meta_bg treatment incompatibility between kernel and e2fsprogs
2009-11-15 19:23 ` Theodore Tso
@ 2009-11-16 15:51 ` Damien Guibouret
0 siblings, 0 replies; 5+ messages in thread
From: Damien Guibouret @ 2009-11-16 15:51 UTC (permalink / raw)
To: Theodore Tso; +Cc: linux-ext4
Hello,
Theodore Tso wrote:
> On Sun, Nov 15, 2009 at 11:28:13AM +0100, Damien Guibouret wrote:
[...]
> As far as the matter of taste issue is concerned, I think we already
> have too many static functions with a single caller, and it actually
> makes the code harder to understand. So adding yet another simple
> static function I think is a bad thing, not a good thing.
>
It was just to mimic the existing function, but I agree with you.
The other difference is that it shall be applied on ext3 also.
>
>>And I think there is some other places where kernel should be fixed when
>>it uses s_gdb_count (but here my knowledge of the sources are not deep
>>enough to be sure on what shall be performed).
>
>
> I've looked through the other areas, and the one place where I see a
> problem is in the block validity checks in ext4_iget() for the
> extended attribute block and in block_validity.c. The former can and
> should be fixed to use the latter.
>
> Here's the fix that I plan to be using. Comments, anyone?
>
For the first one (on block_validity.c), as far as I understand,
presence of superblock and descriptors blocks in a group are no more
related in case of meta_bg group, so shouldn't be the code divided into
2 distincts part: one to treat super block, second to treat descriptor
blocks (I do not understand the ((i < 5) || ((i % flex_size) == 0) part
into the test, so add it if it is trully needed), something as:
ext4_fsblk_t firstSystemBlock = ext4_group_first_block_no(sb, i);
unsigned long nbDescBlocks;
if (ext4_bg_has_super(sb, i)) {
add_system_zone(sbi, firstSystemBlock,
1);
firstSystemBlock++;
}
nbDescBlocks = ext4_bg_num_gdb(sb, i);
if (nbDescBlocks != 0)
add_system_zone(sbi, firstSystemBlock,
nbDescBlocks);
Regards,
Damien
> - Ted
>
> ext4: fix block validity checks so they work correctly with meta_bg
>
> The block validity checks used by ext4_data_block_valid() wasn't
> correctly written to check file systems with the meta_bg feature. Fix
> this.
>
> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
> ---
> fs/ext4/block_validity.c | 2 +-
> fs/ext4/inode.c | 5 +----
> 2 files changed, 2 insertions(+), 5 deletions(-)
>
> diff --git a/fs/ext4/block_validity.c b/fs/ext4/block_validity.c
> index 50784ef..dc79b75 100644
> --- a/fs/ext4/block_validity.c
> +++ b/fs/ext4/block_validity.c
> @@ -160,7 +160,7 @@ int ext4_setup_system_zone(struct super_block *sb)
> if (ext4_bg_has_super(sb, i) &&
> ((i < 5) || ((i % flex_size) == 0)))
> add_system_zone(sbi, ext4_group_first_block_no(sb, i),
> - sbi->s_gdb_count + 1);
> + ext4_bg_num_gdb(sb, i) + 1);
> gdp = ext4_get_group_desc(sb, i, NULL);
> ret = add_system_zone(sbi, ext4_block_bitmap(sb, gdp), 1);
> if (ret)
> diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
> index b5cdb88..c62ca93 100644
> --- a/fs/ext4/inode.c
> +++ b/fs/ext4/inode.c
> @@ -4886,10 +4886,7 @@ struct inode *ext4_iget(struct super_block *sb, unsigned long ino)
>
> ret = 0;
> if (ei->i_file_acl &&
> - ((ei->i_file_acl <
> - (le32_to_cpu(EXT4_SB(sb)->s_es->s_first_data_block) +
> - EXT4_SB(sb)->s_gdb_count)) ||
> - (ei->i_file_acl >= ext4_blocks_count(EXT4_SB(sb)->s_es)))) {
> + !ext4_data_block_valid(EXT4_SB(sb), ei->i_file_acl, 1)) {
> ext4_error(sb, __func__,
> "bad extended attribute block %llu in inode #%lu",
> ei->i_file_acl, inode->i_ino);
>
>
^ permalink raw reply [flat|nested] 5+ messages in thread
end of thread, other threads:[~2009-11-16 15:50 UTC | newest]
Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2009-11-11 12:09 s_first_meta_bg treatment incompatibility between kernel and e2fsprogs Damien Guibouret
2009-11-15 4:20 ` Theodore Tso
2009-11-15 10:28 ` Damien Guibouret
2009-11-15 19:23 ` Theodore Tso
2009-11-16 15:51 ` Damien Guibouret
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.