* [PATCH v5 0/3] ext4: fix a inode checksum error
@ 2021-09-01 2:09 Zhang Yi
2021-09-01 2:09 ` [PATCH v5 1/3] ext4: factor out ext4_fill_raw_inode() Zhang Yi
` (4 more replies)
0 siblings, 5 replies; 9+ messages in thread
From: Zhang Yi @ 2021-09-01 2:09 UTC (permalink / raw)
To: linux-ext4; +Cc: tytso, adilger.kernel, jack, yi.zhang, yukuai3
We find a checksum error and a inode corruption problem while doing
stress test, this 3 patches address to fix them. The first two patches
are prepare to do the fix, the last patch fix these two issue.
- Checksum error
EXT4-fs error (device sda): ext4_lookup:1784: inode #131074: comm cat: iget: checksum invalid
- Inode corruption
e2fsck 1.46.0 (29-Jan-2020)
Pass 1: Checking inodes, blocks, and sizes
Pass 2: Checking directory structure
Entry 'foo' in / (2) has deleted/unused inode 17. Clear<y>? yes
Pass 3: Checking directory connectivity
Pass 4: Checking reference counts
Pass 5: Checking group summary information
Inode bitmap differences: -17
Fix<y>? yes
Free inodes count wrong for group #0 (32750, counted=32751).
Fix<y>? yes
Free inodes count wrong (32750, counted=32751).
Fix<y>? yes
Changes since v4:
- Drop first three already applied patches.
- Remove 'in_mem' parameter passing __ext4_get_inode_loc() in the last
patch.
Changes since v3:
- Postpone initialization to ext4_do_update_inode() may cause zeroout
newly set xattr entry. So switch to do initialization in
__ext4_get_inode_loc().
Changes since v2:
- Instead of using WARN_ON_ONCE to prevent ext4_do_update_inode()
return before filling the inode buffer, keep the error and postpone
the report after the updating in the third patch.
- Fix some language mistacks in the last patch.
Changes since v1:
- Add a patch to prevent ext4_do_update_inode() return before filling
the inode buffer.
- Do not use BH_New flag to indicate the empty buffer, postpone the
zero and uptodate logic into ext4_do_update_inode() before filling
the inode buffer.
Thanks,
Yi.
Zhang Yi (3):
ext4: factor out ext4_fill_raw_inode()
ext4: move ext4_fill_raw_inode() related functions
ext4: prevent getting empty inode buffer
fs/ext4/inode.c | 316 +++++++++++++++++++++++++-----------------------
1 file changed, 165 insertions(+), 151 deletions(-)
--
2.31.1
^ permalink raw reply [flat|nested] 9+ messages in thread
* [PATCH v5 1/3] ext4: factor out ext4_fill_raw_inode()
2021-09-01 2:09 [PATCH v5 0/3] ext4: fix a inode checksum error Zhang Yi
@ 2021-09-01 2:09 ` Zhang Yi
2021-09-20 13:57 ` Jan Kara
2021-09-01 2:09 ` [PATCH v5 2/3] ext4: move ext4_fill_raw_inode() related functions Zhang Yi
` (3 subsequent siblings)
4 siblings, 1 reply; 9+ messages in thread
From: Zhang Yi @ 2021-09-01 2:09 UTC (permalink / raw)
To: linux-ext4; +Cc: tytso, adilger.kernel, jack, yi.zhang, yukuai3
Factor out ext4_fill_raw_inode() from ext4_do_update_inode(), which is
use to fill the in-mem inode contents into the inode table buffer, in
preparation for initializing the exclusive inode buffer without reading
the block in __ext4_get_inode_loc().
Signed-off-by: Zhang Yi <yi.zhang@huawei.com>
---
fs/ext4/inode.c | 85 +++++++++++++++++++++++++++----------------------
1 file changed, 47 insertions(+), 38 deletions(-)
diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index 8323d3e8f393..c7186460c14d 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -4902,9 +4902,8 @@ struct inode *__ext4_iget(struct super_block *sb, unsigned long ino,
return ERR_PTR(ret);
}
-static int ext4_inode_blocks_set(handle_t *handle,
- struct ext4_inode *raw_inode,
- struct ext4_inode_info *ei)
+static int ext4_inode_blocks_set(struct ext4_inode *raw_inode,
+ struct ext4_inode_info *ei)
{
struct inode *inode = &(ei->vfs_inode);
u64 i_blocks = READ_ONCE(inode->i_blocks);
@@ -5007,37 +5006,16 @@ static void ext4_update_other_inodes_time(struct super_block *sb,
rcu_read_unlock();
}
-/*
- * Post the struct inode info into an on-disk inode location in the
- * buffer-cache. This gobbles the caller's reference to the
- * buffer_head in the inode location struct.
- *
- * The caller must have write access to iloc->bh.
- */
-static int ext4_do_update_inode(handle_t *handle,
- struct inode *inode,
- struct ext4_iloc *iloc)
+static int ext4_fill_raw_inode(struct inode *inode, struct ext4_inode *raw_inode)
{
- struct ext4_inode *raw_inode = ext4_raw_inode(iloc);
struct ext4_inode_info *ei = EXT4_I(inode);
- struct buffer_head *bh = iloc->bh;
- struct super_block *sb = inode->i_sb;
- int err = 0, block;
- int need_datasync = 0, set_large_file = 0;
uid_t i_uid;
gid_t i_gid;
projid_t i_projid;
+ int block;
+ int err;
- spin_lock(&ei->i_raw_lock);
-
- /*
- * For fields not tracked in the in-memory inode, initialise them
- * to zero for new inodes.
- */
- if (ext4_test_inode_state(inode, EXT4_STATE_NEW))
- memset(raw_inode, 0, EXT4_SB(inode->i_sb)->s_inode_size);
-
- err = ext4_inode_blocks_set(handle, raw_inode, ei);
+ err = ext4_inode_blocks_set(raw_inode, ei);
raw_inode->i_mode = cpu_to_le16(inode->i_mode);
i_uid = i_uid_read(inode);
@@ -5079,16 +5057,8 @@ static int ext4_do_update_inode(handle_t *handle,
raw_inode->i_file_acl_high =
cpu_to_le16(ei->i_file_acl >> 32);
raw_inode->i_file_acl_lo = cpu_to_le32(ei->i_file_acl);
- if (READ_ONCE(ei->i_disksize) != ext4_isize(inode->i_sb, raw_inode)) {
- ext4_isize_set(raw_inode, ei->i_disksize);
- need_datasync = 1;
- }
- if (ei->i_disksize > 0x7fffffffULL) {
- if (!ext4_has_feature_large_file(sb) ||
- EXT4_SB(sb)->s_es->s_rev_level ==
- cpu_to_le32(EXT4_GOOD_OLD_REV))
- set_large_file = 1;
- }
+ ext4_isize_set(raw_inode, ei->i_disksize);
+
raw_inode->i_generation = cpu_to_le32(inode->i_generation);
if (S_ISCHR(inode->i_mode) || S_ISBLK(inode->i_mode)) {
if (old_valid_dev(inode->i_rdev)) {
@@ -5128,6 +5098,45 @@ static int ext4_do_update_inode(handle_t *handle,
raw_inode->i_projid = cpu_to_le32(i_projid);
ext4_inode_csum_set(inode, raw_inode, ei);
+ return err;
+}
+
+/*
+ * Post the struct inode info into an on-disk inode location in the
+ * buffer-cache. This gobbles the caller's reference to the
+ * buffer_head in the inode location struct.
+ *
+ * The caller must have write access to iloc->bh.
+ */
+static int ext4_do_update_inode(handle_t *handle,
+ struct inode *inode,
+ struct ext4_iloc *iloc)
+{
+ struct ext4_inode *raw_inode = ext4_raw_inode(iloc);
+ struct ext4_inode_info *ei = EXT4_I(inode);
+ struct buffer_head *bh = iloc->bh;
+ struct super_block *sb = inode->i_sb;
+ int err;
+ int need_datasync = 0, set_large_file = 0;
+
+ spin_lock(&ei->i_raw_lock);
+
+ /*
+ * For fields not tracked in the in-memory inode, initialise them
+ * to zero for new inodes.
+ */
+ if (ext4_test_inode_state(inode, EXT4_STATE_NEW))
+ memset(raw_inode, 0, EXT4_SB(inode->i_sb)->s_inode_size);
+
+ if (READ_ONCE(ei->i_disksize) != ext4_isize(inode->i_sb, raw_inode))
+ need_datasync = 1;
+ if (ei->i_disksize > 0x7fffffffULL) {
+ if (!ext4_has_feature_large_file(sb) ||
+ EXT4_SB(sb)->s_es->s_rev_level == cpu_to_le32(EXT4_GOOD_OLD_REV))
+ set_large_file = 1;
+ }
+
+ err = ext4_fill_raw_inode(inode, raw_inode);
spin_unlock(&ei->i_raw_lock);
if (err) {
EXT4_ERROR_INODE(inode, "corrupted inode contents");
--
2.31.1
^ permalink raw reply related [flat|nested] 9+ messages in thread
* [PATCH v5 2/3] ext4: move ext4_fill_raw_inode() related functions
2021-09-01 2:09 [PATCH v5 0/3] ext4: fix a inode checksum error Zhang Yi
2021-09-01 2:09 ` [PATCH v5 1/3] ext4: factor out ext4_fill_raw_inode() Zhang Yi
@ 2021-09-01 2:09 ` Zhang Yi
2021-09-20 13:57 ` Jan Kara
2021-09-01 2:09 ` [PATCH v5 3/3] ext4: prevent getting empty inode buffer Zhang Yi
` (2 subsequent siblings)
4 siblings, 1 reply; 9+ messages in thread
From: Zhang Yi @ 2021-09-01 2:09 UTC (permalink / raw)
To: linux-ext4; +Cc: tytso, adilger.kernel, jack, yi.zhang, yukuai3
In preparation for calling ext4_fill_raw_inode() in
__ext4_get_inode_loc(), move three related functions before
__ext4_get_inode_loc(), no logical change.
Signed-off-by: Zhang Yi <yi.zhang@huawei.com>
---
fs/ext4/inode.c | 293 ++++++++++++++++++++++++------------------------
1 file changed, 147 insertions(+), 146 deletions(-)
diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index c7186460c14d..3c36e701e30e 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -4292,6 +4292,153 @@ int ext4_truncate(struct inode *inode)
return err;
}
+static inline u64 ext4_inode_peek_iversion(const struct inode *inode)
+{
+ if (unlikely(EXT4_I(inode)->i_flags & EXT4_EA_INODE_FL))
+ return inode_peek_iversion_raw(inode);
+ else
+ return inode_peek_iversion(inode);
+}
+
+static int ext4_inode_blocks_set(struct ext4_inode *raw_inode,
+ struct ext4_inode_info *ei)
+{
+ struct inode *inode = &(ei->vfs_inode);
+ u64 i_blocks = READ_ONCE(inode->i_blocks);
+ struct super_block *sb = inode->i_sb;
+
+ if (i_blocks <= ~0U) {
+ /*
+ * i_blocks can be represented in a 32 bit variable
+ * as multiple of 512 bytes
+ */
+ raw_inode->i_blocks_lo = cpu_to_le32(i_blocks);
+ raw_inode->i_blocks_high = 0;
+ ext4_clear_inode_flag(inode, EXT4_INODE_HUGE_FILE);
+ return 0;
+ }
+
+ /*
+ * This should never happen since sb->s_maxbytes should not have
+ * allowed this, sb->s_maxbytes was set according to the huge_file
+ * feature in ext4_fill_super().
+ */
+ if (!ext4_has_feature_huge_file(sb))
+ return -EFSCORRUPTED;
+
+ if (i_blocks <= 0xffffffffffffULL) {
+ /*
+ * i_blocks can be represented in a 48 bit variable
+ * as multiple of 512 bytes
+ */
+ raw_inode->i_blocks_lo = cpu_to_le32(i_blocks);
+ raw_inode->i_blocks_high = cpu_to_le16(i_blocks >> 32);
+ ext4_clear_inode_flag(inode, EXT4_INODE_HUGE_FILE);
+ } else {
+ ext4_set_inode_flag(inode, EXT4_INODE_HUGE_FILE);
+ /* i_block is stored in file system block size */
+ i_blocks = i_blocks >> (inode->i_blkbits - 9);
+ raw_inode->i_blocks_lo = cpu_to_le32(i_blocks);
+ raw_inode->i_blocks_high = cpu_to_le16(i_blocks >> 32);
+ }
+ return 0;
+}
+
+static int ext4_fill_raw_inode(struct inode *inode, struct ext4_inode *raw_inode)
+{
+ struct ext4_inode_info *ei = EXT4_I(inode);
+ uid_t i_uid;
+ gid_t i_gid;
+ projid_t i_projid;
+ int block;
+ int err;
+
+ err = ext4_inode_blocks_set(raw_inode, ei);
+
+ raw_inode->i_mode = cpu_to_le16(inode->i_mode);
+ i_uid = i_uid_read(inode);
+ i_gid = i_gid_read(inode);
+ i_projid = from_kprojid(&init_user_ns, ei->i_projid);
+ if (!(test_opt(inode->i_sb, NO_UID32))) {
+ raw_inode->i_uid_low = cpu_to_le16(low_16_bits(i_uid));
+ raw_inode->i_gid_low = cpu_to_le16(low_16_bits(i_gid));
+ /*
+ * Fix up interoperability with old kernels. Otherwise,
+ * old inodes get re-used with the upper 16 bits of the
+ * uid/gid intact.
+ */
+ if (ei->i_dtime && list_empty(&ei->i_orphan)) {
+ raw_inode->i_uid_high = 0;
+ raw_inode->i_gid_high = 0;
+ } else {
+ raw_inode->i_uid_high =
+ cpu_to_le16(high_16_bits(i_uid));
+ raw_inode->i_gid_high =
+ cpu_to_le16(high_16_bits(i_gid));
+ }
+ } else {
+ raw_inode->i_uid_low = cpu_to_le16(fs_high2lowuid(i_uid));
+ raw_inode->i_gid_low = cpu_to_le16(fs_high2lowgid(i_gid));
+ raw_inode->i_uid_high = 0;
+ raw_inode->i_gid_high = 0;
+ }
+ raw_inode->i_links_count = cpu_to_le16(inode->i_nlink);
+
+ EXT4_INODE_SET_XTIME(i_ctime, inode, raw_inode);
+ EXT4_INODE_SET_XTIME(i_mtime, inode, raw_inode);
+ EXT4_INODE_SET_XTIME(i_atime, inode, raw_inode);
+ EXT4_EINODE_SET_XTIME(i_crtime, ei, raw_inode);
+
+ raw_inode->i_dtime = cpu_to_le32(ei->i_dtime);
+ raw_inode->i_flags = cpu_to_le32(ei->i_flags & 0xFFFFFFFF);
+ if (likely(!test_opt2(inode->i_sb, HURD_COMPAT)))
+ raw_inode->i_file_acl_high =
+ cpu_to_le16(ei->i_file_acl >> 32);
+ raw_inode->i_file_acl_lo = cpu_to_le32(ei->i_file_acl);
+ ext4_isize_set(raw_inode, ei->i_disksize);
+
+ raw_inode->i_generation = cpu_to_le32(inode->i_generation);
+ if (S_ISCHR(inode->i_mode) || S_ISBLK(inode->i_mode)) {
+ if (old_valid_dev(inode->i_rdev)) {
+ raw_inode->i_block[0] =
+ cpu_to_le32(old_encode_dev(inode->i_rdev));
+ raw_inode->i_block[1] = 0;
+ } else {
+ raw_inode->i_block[0] = 0;
+ raw_inode->i_block[1] =
+ cpu_to_le32(new_encode_dev(inode->i_rdev));
+ raw_inode->i_block[2] = 0;
+ }
+ } else if (!ext4_has_inline_data(inode)) {
+ for (block = 0; block < EXT4_N_BLOCKS; block++)
+ raw_inode->i_block[block] = ei->i_data[block];
+ }
+
+ if (likely(!test_opt2(inode->i_sb, HURD_COMPAT))) {
+ u64 ivers = ext4_inode_peek_iversion(inode);
+
+ raw_inode->i_disk_version = cpu_to_le32(ivers);
+ if (ei->i_extra_isize) {
+ if (EXT4_FITS_IN_INODE(raw_inode, ei, i_version_hi))
+ raw_inode->i_version_hi =
+ cpu_to_le32(ivers >> 32);
+ raw_inode->i_extra_isize =
+ cpu_to_le16(ei->i_extra_isize);
+ }
+ }
+
+ if (i_projid != EXT4_DEF_PROJID &&
+ !ext4_has_feature_project(inode->i_sb))
+ err = err ?: -EFSCORRUPTED;
+
+ if (EXT4_INODE_SIZE(inode->i_sb) > EXT4_GOOD_OLD_INODE_SIZE &&
+ EXT4_FITS_IN_INODE(raw_inode, ei, i_projid))
+ raw_inode->i_projid = cpu_to_le32(i_projid);
+
+ ext4_inode_csum_set(inode, raw_inode, ei);
+ return err;
+}
+
/*
* ext4_get_inode_loc returns with an extra refcount against the inode's
* underlying buffer_head on success. If 'in_mem' is true, we have all
@@ -4580,13 +4727,6 @@ static inline void ext4_inode_set_iversion_queried(struct inode *inode, u64 val)
else
inode_set_iversion_queried(inode, val);
}
-static inline u64 ext4_inode_peek_iversion(const struct inode *inode)
-{
- if (unlikely(EXT4_I(inode)->i_flags & EXT4_EA_INODE_FL))
- return inode_peek_iversion_raw(inode);
- else
- return inode_peek_iversion(inode);
-}
struct inode *__ext4_iget(struct super_block *sb, unsigned long ino,
ext4_iget_flags flags, const char *function,
@@ -4902,50 +5042,6 @@ struct inode *__ext4_iget(struct super_block *sb, unsigned long ino,
return ERR_PTR(ret);
}
-static int ext4_inode_blocks_set(struct ext4_inode *raw_inode,
- struct ext4_inode_info *ei)
-{
- struct inode *inode = &(ei->vfs_inode);
- u64 i_blocks = READ_ONCE(inode->i_blocks);
- struct super_block *sb = inode->i_sb;
-
- if (i_blocks <= ~0U) {
- /*
- * i_blocks can be represented in a 32 bit variable
- * as multiple of 512 bytes
- */
- raw_inode->i_blocks_lo = cpu_to_le32(i_blocks);
- raw_inode->i_blocks_high = 0;
- ext4_clear_inode_flag(inode, EXT4_INODE_HUGE_FILE);
- return 0;
- }
-
- /*
- * This should never happen since sb->s_maxbytes should not have
- * allowed this, sb->s_maxbytes was set according to the huge_file
- * feature in ext4_fill_super().
- */
- if (!ext4_has_feature_huge_file(sb))
- return -EFSCORRUPTED;
-
- if (i_blocks <= 0xffffffffffffULL) {
- /*
- * i_blocks can be represented in a 48 bit variable
- * as multiple of 512 bytes
- */
- raw_inode->i_blocks_lo = cpu_to_le32(i_blocks);
- raw_inode->i_blocks_high = cpu_to_le16(i_blocks >> 32);
- ext4_clear_inode_flag(inode, EXT4_INODE_HUGE_FILE);
- } else {
- ext4_set_inode_flag(inode, EXT4_INODE_HUGE_FILE);
- /* i_block is stored in file system block size */
- i_blocks = i_blocks >> (inode->i_blkbits - 9);
- raw_inode->i_blocks_lo = cpu_to_le32(i_blocks);
- raw_inode->i_blocks_high = cpu_to_le16(i_blocks >> 32);
- }
- return 0;
-}
-
static void __ext4_update_other_inode_time(struct super_block *sb,
unsigned long orig_ino,
unsigned long ino,
@@ -5006,101 +5102,6 @@ static void ext4_update_other_inodes_time(struct super_block *sb,
rcu_read_unlock();
}
-static int ext4_fill_raw_inode(struct inode *inode, struct ext4_inode *raw_inode)
-{
- struct ext4_inode_info *ei = EXT4_I(inode);
- uid_t i_uid;
- gid_t i_gid;
- projid_t i_projid;
- int block;
- int err;
-
- err = ext4_inode_blocks_set(raw_inode, ei);
-
- raw_inode->i_mode = cpu_to_le16(inode->i_mode);
- i_uid = i_uid_read(inode);
- i_gid = i_gid_read(inode);
- i_projid = from_kprojid(&init_user_ns, ei->i_projid);
- if (!(test_opt(inode->i_sb, NO_UID32))) {
- raw_inode->i_uid_low = cpu_to_le16(low_16_bits(i_uid));
- raw_inode->i_gid_low = cpu_to_le16(low_16_bits(i_gid));
- /*
- * Fix up interoperability with old kernels. Otherwise,
- * old inodes get re-used with the upper 16 bits of the
- * uid/gid intact.
- */
- if (ei->i_dtime && list_empty(&ei->i_orphan)) {
- raw_inode->i_uid_high = 0;
- raw_inode->i_gid_high = 0;
- } else {
- raw_inode->i_uid_high =
- cpu_to_le16(high_16_bits(i_uid));
- raw_inode->i_gid_high =
- cpu_to_le16(high_16_bits(i_gid));
- }
- } else {
- raw_inode->i_uid_low = cpu_to_le16(fs_high2lowuid(i_uid));
- raw_inode->i_gid_low = cpu_to_le16(fs_high2lowgid(i_gid));
- raw_inode->i_uid_high = 0;
- raw_inode->i_gid_high = 0;
- }
- raw_inode->i_links_count = cpu_to_le16(inode->i_nlink);
-
- EXT4_INODE_SET_XTIME(i_ctime, inode, raw_inode);
- EXT4_INODE_SET_XTIME(i_mtime, inode, raw_inode);
- EXT4_INODE_SET_XTIME(i_atime, inode, raw_inode);
- EXT4_EINODE_SET_XTIME(i_crtime, ei, raw_inode);
-
- raw_inode->i_dtime = cpu_to_le32(ei->i_dtime);
- raw_inode->i_flags = cpu_to_le32(ei->i_flags & 0xFFFFFFFF);
- if (likely(!test_opt2(inode->i_sb, HURD_COMPAT)))
- raw_inode->i_file_acl_high =
- cpu_to_le16(ei->i_file_acl >> 32);
- raw_inode->i_file_acl_lo = cpu_to_le32(ei->i_file_acl);
- ext4_isize_set(raw_inode, ei->i_disksize);
-
- raw_inode->i_generation = cpu_to_le32(inode->i_generation);
- if (S_ISCHR(inode->i_mode) || S_ISBLK(inode->i_mode)) {
- if (old_valid_dev(inode->i_rdev)) {
- raw_inode->i_block[0] =
- cpu_to_le32(old_encode_dev(inode->i_rdev));
- raw_inode->i_block[1] = 0;
- } else {
- raw_inode->i_block[0] = 0;
- raw_inode->i_block[1] =
- cpu_to_le32(new_encode_dev(inode->i_rdev));
- raw_inode->i_block[2] = 0;
- }
- } else if (!ext4_has_inline_data(inode)) {
- for (block = 0; block < EXT4_N_BLOCKS; block++)
- raw_inode->i_block[block] = ei->i_data[block];
- }
-
- if (likely(!test_opt2(inode->i_sb, HURD_COMPAT))) {
- u64 ivers = ext4_inode_peek_iversion(inode);
-
- raw_inode->i_disk_version = cpu_to_le32(ivers);
- if (ei->i_extra_isize) {
- if (EXT4_FITS_IN_INODE(raw_inode, ei, i_version_hi))
- raw_inode->i_version_hi =
- cpu_to_le32(ivers >> 32);
- raw_inode->i_extra_isize =
- cpu_to_le16(ei->i_extra_isize);
- }
- }
-
- if (i_projid != EXT4_DEF_PROJID &&
- !ext4_has_feature_project(inode->i_sb))
- err = err ?: -EFSCORRUPTED;
-
- if (EXT4_INODE_SIZE(inode->i_sb) > EXT4_GOOD_OLD_INODE_SIZE &&
- EXT4_FITS_IN_INODE(raw_inode, ei, i_projid))
- raw_inode->i_projid = cpu_to_le32(i_projid);
-
- ext4_inode_csum_set(inode, raw_inode, ei);
- return err;
-}
-
/*
* Post the struct inode info into an on-disk inode location in the
* buffer-cache. This gobbles the caller's reference to the
--
2.31.1
^ permalink raw reply related [flat|nested] 9+ messages in thread
* [PATCH v5 3/3] ext4: prevent getting empty inode buffer
2021-09-01 2:09 [PATCH v5 0/3] ext4: fix a inode checksum error Zhang Yi
2021-09-01 2:09 ` [PATCH v5 1/3] ext4: factor out ext4_fill_raw_inode() Zhang Yi
2021-09-01 2:09 ` [PATCH v5 2/3] ext4: move ext4_fill_raw_inode() related functions Zhang Yi
@ 2021-09-01 2:09 ` Zhang Yi
2021-09-20 13:58 ` Jan Kara
2021-10-14 10:57 ` [PATCH v5 0/3] ext4: fix a inode checksum error Zhang Yi
2021-10-28 14:47 ` Theodore Ts'o
4 siblings, 1 reply; 9+ messages in thread
From: Zhang Yi @ 2021-09-01 2:09 UTC (permalink / raw)
To: linux-ext4; +Cc: tytso, adilger.kernel, jack, yi.zhang, yukuai3
In ext4_get_inode_loc(), we may skip IO and get an zero && uptodate
inode buffer when the inode monopolize an inode block for performance
reason. For most cases, ext4_mark_iloc_dirty() will fill the inode
buffer to make it fine, but we could miss this call if something bad
happened. Finally, __ext4_get_inode_loc_noinmem() may probably get an
empty inode buffer and trigger ext4 error.
For example, if we remove a nonexistent xattr on inode A,
ext4_xattr_set_handle() will return ENODATA before invoking
ext4_mark_iloc_dirty(), it will left an uptodate but zero buffer. We
will get checksum error message in ext4_iget() when getting inode again.
EXT4-fs error (device sda): ext4_lookup:1784: inode #131074: comm cat: iget: checksum invalid
Even worse, if we allocate another inode B at the same inode block, it
will corrupt the inode A on disk when write back inode B.
So this patch initialize the inode buffer by filling the in-mem inode
contents if we skip read I/O, ensure that the buffer is really uptodate.
Signed-off-by: Zhang Yi <yi.zhang@huawei.com>
---
fs/ext4/inode.c | 24 ++++++++++++++----------
1 file changed, 14 insertions(+), 10 deletions(-)
diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index 3c36e701e30e..a8388ec91f9f 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -4441,12 +4441,12 @@ static int ext4_fill_raw_inode(struct inode *inode, struct ext4_inode *raw_inode
/*
* ext4_get_inode_loc returns with an extra refcount against the inode's
- * underlying buffer_head on success. If 'in_mem' is true, we have all
- * data in memory that is needed to recreate the on-disk version of this
- * inode.
+ * underlying buffer_head on success. If we pass 'inode' and it does not
+ * have in-inode xattr, we have all inode data in memory that is needed
+ * to recreate the on-disk version of this inode.
*/
static int __ext4_get_inode_loc(struct super_block *sb, unsigned long ino,
- struct ext4_iloc *iloc, int in_mem,
+ struct inode *inode, struct ext4_iloc *iloc,
ext4_fsblk_t *ret_block)
{
struct ext4_group_desc *gdp;
@@ -4486,7 +4486,7 @@ static int __ext4_get_inode_loc(struct super_block *sb, unsigned long ino,
* is the only valid inode in the block, we need not read the
* block.
*/
- if (in_mem) {
+ if (inode && !ext4_test_inode_state(inode, EXT4_STATE_XATTR)) {
struct buffer_head *bitmap_bh;
int i, start;
@@ -4514,8 +4514,13 @@ static int __ext4_get_inode_loc(struct super_block *sb, unsigned long ino,
}
brelse(bitmap_bh);
if (i == start + inodes_per_block) {
+ struct ext4_inode *raw_inode =
+ (struct ext4_inode *) (bh->b_data + iloc->offset);
+
/* all other inodes are free, so skip I/O */
memset(bh->b_data, 0, bh->b_size);
+ if (!ext4_test_inode_state(inode, EXT4_STATE_NEW))
+ ext4_fill_raw_inode(inode, raw_inode);
set_buffer_uptodate(bh);
unlock_buffer(bh);
goto has_buffer;
@@ -4576,7 +4581,7 @@ static int __ext4_get_inode_loc_noinmem(struct inode *inode,
ext4_fsblk_t err_blk;
int ret;
- ret = __ext4_get_inode_loc(inode->i_sb, inode->i_ino, iloc, 0,
+ ret = __ext4_get_inode_loc(inode->i_sb, inode->i_ino, NULL, iloc,
&err_blk);
if (ret == -EIO)
@@ -4591,9 +4596,8 @@ int ext4_get_inode_loc(struct inode *inode, struct ext4_iloc *iloc)
ext4_fsblk_t err_blk;
int ret;
- /* We have all inode data except xattrs in memory here. */
- ret = __ext4_get_inode_loc(inode->i_sb, inode->i_ino, iloc,
- !ext4_test_inode_state(inode, EXT4_STATE_XATTR), &err_blk);
+ ret = __ext4_get_inode_loc(inode->i_sb, inode->i_ino, inode, iloc,
+ &err_blk);
if (ret == -EIO)
ext4_error_inode_block(inode, err_blk, EIO,
@@ -4606,7 +4610,7 @@ int ext4_get_inode_loc(struct inode *inode, struct ext4_iloc *iloc)
int ext4_get_fc_inode_loc(struct super_block *sb, unsigned long ino,
struct ext4_iloc *iloc)
{
- return __ext4_get_inode_loc(sb, ino, iloc, 0, NULL);
+ return __ext4_get_inode_loc(sb, ino, NULL, iloc, NULL);
}
static bool ext4_should_enable_dax(struct inode *inode)
--
2.31.1
^ permalink raw reply related [flat|nested] 9+ messages in thread
* Re: [PATCH v5 1/3] ext4: factor out ext4_fill_raw_inode()
2021-09-01 2:09 ` [PATCH v5 1/3] ext4: factor out ext4_fill_raw_inode() Zhang Yi
@ 2021-09-20 13:57 ` Jan Kara
0 siblings, 0 replies; 9+ messages in thread
From: Jan Kara @ 2021-09-20 13:57 UTC (permalink / raw)
To: Zhang Yi; +Cc: linux-ext4, tytso, adilger.kernel, jack, yukuai3
On Wed 01-09-21 10:09:53, Zhang Yi wrote:
> Factor out ext4_fill_raw_inode() from ext4_do_update_inode(), which is
> use to fill the in-mem inode contents into the inode table buffer, in
> preparation for initializing the exclusive inode buffer without reading
> the block in __ext4_get_inode_loc().
>
> Signed-off-by: Zhang Yi <yi.zhang@huawei.com>
Looks good. Feel free to add:
Reviewed-by: Jan Kara <jack@suse.cz>
Honza
> ---
> fs/ext4/inode.c | 85 +++++++++++++++++++++++++++----------------------
> 1 file changed, 47 insertions(+), 38 deletions(-)
>
> diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
> index 8323d3e8f393..c7186460c14d 100644
> --- a/fs/ext4/inode.c
> +++ b/fs/ext4/inode.c
> @@ -4902,9 +4902,8 @@ struct inode *__ext4_iget(struct super_block *sb, unsigned long ino,
> return ERR_PTR(ret);
> }
>
> -static int ext4_inode_blocks_set(handle_t *handle,
> - struct ext4_inode *raw_inode,
> - struct ext4_inode_info *ei)
> +static int ext4_inode_blocks_set(struct ext4_inode *raw_inode,
> + struct ext4_inode_info *ei)
> {
> struct inode *inode = &(ei->vfs_inode);
> u64 i_blocks = READ_ONCE(inode->i_blocks);
> @@ -5007,37 +5006,16 @@ static void ext4_update_other_inodes_time(struct super_block *sb,
> rcu_read_unlock();
> }
>
> -/*
> - * Post the struct inode info into an on-disk inode location in the
> - * buffer-cache. This gobbles the caller's reference to the
> - * buffer_head in the inode location struct.
> - *
> - * The caller must have write access to iloc->bh.
> - */
> -static int ext4_do_update_inode(handle_t *handle,
> - struct inode *inode,
> - struct ext4_iloc *iloc)
> +static int ext4_fill_raw_inode(struct inode *inode, struct ext4_inode *raw_inode)
> {
> - struct ext4_inode *raw_inode = ext4_raw_inode(iloc);
> struct ext4_inode_info *ei = EXT4_I(inode);
> - struct buffer_head *bh = iloc->bh;
> - struct super_block *sb = inode->i_sb;
> - int err = 0, block;
> - int need_datasync = 0, set_large_file = 0;
> uid_t i_uid;
> gid_t i_gid;
> projid_t i_projid;
> + int block;
> + int err;
>
> - spin_lock(&ei->i_raw_lock);
> -
> - /*
> - * For fields not tracked in the in-memory inode, initialise them
> - * to zero for new inodes.
> - */
> - if (ext4_test_inode_state(inode, EXT4_STATE_NEW))
> - memset(raw_inode, 0, EXT4_SB(inode->i_sb)->s_inode_size);
> -
> - err = ext4_inode_blocks_set(handle, raw_inode, ei);
> + err = ext4_inode_blocks_set(raw_inode, ei);
>
> raw_inode->i_mode = cpu_to_le16(inode->i_mode);
> i_uid = i_uid_read(inode);
> @@ -5079,16 +5057,8 @@ static int ext4_do_update_inode(handle_t *handle,
> raw_inode->i_file_acl_high =
> cpu_to_le16(ei->i_file_acl >> 32);
> raw_inode->i_file_acl_lo = cpu_to_le32(ei->i_file_acl);
> - if (READ_ONCE(ei->i_disksize) != ext4_isize(inode->i_sb, raw_inode)) {
> - ext4_isize_set(raw_inode, ei->i_disksize);
> - need_datasync = 1;
> - }
> - if (ei->i_disksize > 0x7fffffffULL) {
> - if (!ext4_has_feature_large_file(sb) ||
> - EXT4_SB(sb)->s_es->s_rev_level ==
> - cpu_to_le32(EXT4_GOOD_OLD_REV))
> - set_large_file = 1;
> - }
> + ext4_isize_set(raw_inode, ei->i_disksize);
> +
> raw_inode->i_generation = cpu_to_le32(inode->i_generation);
> if (S_ISCHR(inode->i_mode) || S_ISBLK(inode->i_mode)) {
> if (old_valid_dev(inode->i_rdev)) {
> @@ -5128,6 +5098,45 @@ static int ext4_do_update_inode(handle_t *handle,
> raw_inode->i_projid = cpu_to_le32(i_projid);
>
> ext4_inode_csum_set(inode, raw_inode, ei);
> + return err;
> +}
> +
> +/*
> + * Post the struct inode info into an on-disk inode location in the
> + * buffer-cache. This gobbles the caller's reference to the
> + * buffer_head in the inode location struct.
> + *
> + * The caller must have write access to iloc->bh.
> + */
> +static int ext4_do_update_inode(handle_t *handle,
> + struct inode *inode,
> + struct ext4_iloc *iloc)
> +{
> + struct ext4_inode *raw_inode = ext4_raw_inode(iloc);
> + struct ext4_inode_info *ei = EXT4_I(inode);
> + struct buffer_head *bh = iloc->bh;
> + struct super_block *sb = inode->i_sb;
> + int err;
> + int need_datasync = 0, set_large_file = 0;
> +
> + spin_lock(&ei->i_raw_lock);
> +
> + /*
> + * For fields not tracked in the in-memory inode, initialise them
> + * to zero for new inodes.
> + */
> + if (ext4_test_inode_state(inode, EXT4_STATE_NEW))
> + memset(raw_inode, 0, EXT4_SB(inode->i_sb)->s_inode_size);
> +
> + if (READ_ONCE(ei->i_disksize) != ext4_isize(inode->i_sb, raw_inode))
> + need_datasync = 1;
> + if (ei->i_disksize > 0x7fffffffULL) {
> + if (!ext4_has_feature_large_file(sb) ||
> + EXT4_SB(sb)->s_es->s_rev_level == cpu_to_le32(EXT4_GOOD_OLD_REV))
> + set_large_file = 1;
> + }
> +
> + err = ext4_fill_raw_inode(inode, raw_inode);
> spin_unlock(&ei->i_raw_lock);
> if (err) {
> EXT4_ERROR_INODE(inode, "corrupted inode contents");
> --
> 2.31.1
>
--
Jan Kara <jack@suse.com>
SUSE Labs, CR
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH v5 2/3] ext4: move ext4_fill_raw_inode() related functions
2021-09-01 2:09 ` [PATCH v5 2/3] ext4: move ext4_fill_raw_inode() related functions Zhang Yi
@ 2021-09-20 13:57 ` Jan Kara
0 siblings, 0 replies; 9+ messages in thread
From: Jan Kara @ 2021-09-20 13:57 UTC (permalink / raw)
To: Zhang Yi; +Cc: linux-ext4, tytso, adilger.kernel, jack, yukuai3
On Wed 01-09-21 10:09:54, Zhang Yi wrote:
> In preparation for calling ext4_fill_raw_inode() in
> __ext4_get_inode_loc(), move three related functions before
> __ext4_get_inode_loc(), no logical change.
>
> Signed-off-by: Zhang Yi <yi.zhang@huawei.com>
Looks good. Feel free to add:
Reviewed-by: Jan Kara <jack@suse.cz>
Honza
> ---
> fs/ext4/inode.c | 293 ++++++++++++++++++++++++------------------------
> 1 file changed, 147 insertions(+), 146 deletions(-)
>
> diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
> index c7186460c14d..3c36e701e30e 100644
> --- a/fs/ext4/inode.c
> +++ b/fs/ext4/inode.c
> @@ -4292,6 +4292,153 @@ int ext4_truncate(struct inode *inode)
> return err;
> }
>
> +static inline u64 ext4_inode_peek_iversion(const struct inode *inode)
> +{
> + if (unlikely(EXT4_I(inode)->i_flags & EXT4_EA_INODE_FL))
> + return inode_peek_iversion_raw(inode);
> + else
> + return inode_peek_iversion(inode);
> +}
> +
> +static int ext4_inode_blocks_set(struct ext4_inode *raw_inode,
> + struct ext4_inode_info *ei)
> +{
> + struct inode *inode = &(ei->vfs_inode);
> + u64 i_blocks = READ_ONCE(inode->i_blocks);
> + struct super_block *sb = inode->i_sb;
> +
> + if (i_blocks <= ~0U) {
> + /*
> + * i_blocks can be represented in a 32 bit variable
> + * as multiple of 512 bytes
> + */
> + raw_inode->i_blocks_lo = cpu_to_le32(i_blocks);
> + raw_inode->i_blocks_high = 0;
> + ext4_clear_inode_flag(inode, EXT4_INODE_HUGE_FILE);
> + return 0;
> + }
> +
> + /*
> + * This should never happen since sb->s_maxbytes should not have
> + * allowed this, sb->s_maxbytes was set according to the huge_file
> + * feature in ext4_fill_super().
> + */
> + if (!ext4_has_feature_huge_file(sb))
> + return -EFSCORRUPTED;
> +
> + if (i_blocks <= 0xffffffffffffULL) {
> + /*
> + * i_blocks can be represented in a 48 bit variable
> + * as multiple of 512 bytes
> + */
> + raw_inode->i_blocks_lo = cpu_to_le32(i_blocks);
> + raw_inode->i_blocks_high = cpu_to_le16(i_blocks >> 32);
> + ext4_clear_inode_flag(inode, EXT4_INODE_HUGE_FILE);
> + } else {
> + ext4_set_inode_flag(inode, EXT4_INODE_HUGE_FILE);
> + /* i_block is stored in file system block size */
> + i_blocks = i_blocks >> (inode->i_blkbits - 9);
> + raw_inode->i_blocks_lo = cpu_to_le32(i_blocks);
> + raw_inode->i_blocks_high = cpu_to_le16(i_blocks >> 32);
> + }
> + return 0;
> +}
> +
> +static int ext4_fill_raw_inode(struct inode *inode, struct ext4_inode *raw_inode)
> +{
> + struct ext4_inode_info *ei = EXT4_I(inode);
> + uid_t i_uid;
> + gid_t i_gid;
> + projid_t i_projid;
> + int block;
> + int err;
> +
> + err = ext4_inode_blocks_set(raw_inode, ei);
> +
> + raw_inode->i_mode = cpu_to_le16(inode->i_mode);
> + i_uid = i_uid_read(inode);
> + i_gid = i_gid_read(inode);
> + i_projid = from_kprojid(&init_user_ns, ei->i_projid);
> + if (!(test_opt(inode->i_sb, NO_UID32))) {
> + raw_inode->i_uid_low = cpu_to_le16(low_16_bits(i_uid));
> + raw_inode->i_gid_low = cpu_to_le16(low_16_bits(i_gid));
> + /*
> + * Fix up interoperability with old kernels. Otherwise,
> + * old inodes get re-used with the upper 16 bits of the
> + * uid/gid intact.
> + */
> + if (ei->i_dtime && list_empty(&ei->i_orphan)) {
> + raw_inode->i_uid_high = 0;
> + raw_inode->i_gid_high = 0;
> + } else {
> + raw_inode->i_uid_high =
> + cpu_to_le16(high_16_bits(i_uid));
> + raw_inode->i_gid_high =
> + cpu_to_le16(high_16_bits(i_gid));
> + }
> + } else {
> + raw_inode->i_uid_low = cpu_to_le16(fs_high2lowuid(i_uid));
> + raw_inode->i_gid_low = cpu_to_le16(fs_high2lowgid(i_gid));
> + raw_inode->i_uid_high = 0;
> + raw_inode->i_gid_high = 0;
> + }
> + raw_inode->i_links_count = cpu_to_le16(inode->i_nlink);
> +
> + EXT4_INODE_SET_XTIME(i_ctime, inode, raw_inode);
> + EXT4_INODE_SET_XTIME(i_mtime, inode, raw_inode);
> + EXT4_INODE_SET_XTIME(i_atime, inode, raw_inode);
> + EXT4_EINODE_SET_XTIME(i_crtime, ei, raw_inode);
> +
> + raw_inode->i_dtime = cpu_to_le32(ei->i_dtime);
> + raw_inode->i_flags = cpu_to_le32(ei->i_flags & 0xFFFFFFFF);
> + if (likely(!test_opt2(inode->i_sb, HURD_COMPAT)))
> + raw_inode->i_file_acl_high =
> + cpu_to_le16(ei->i_file_acl >> 32);
> + raw_inode->i_file_acl_lo = cpu_to_le32(ei->i_file_acl);
> + ext4_isize_set(raw_inode, ei->i_disksize);
> +
> + raw_inode->i_generation = cpu_to_le32(inode->i_generation);
> + if (S_ISCHR(inode->i_mode) || S_ISBLK(inode->i_mode)) {
> + if (old_valid_dev(inode->i_rdev)) {
> + raw_inode->i_block[0] =
> + cpu_to_le32(old_encode_dev(inode->i_rdev));
> + raw_inode->i_block[1] = 0;
> + } else {
> + raw_inode->i_block[0] = 0;
> + raw_inode->i_block[1] =
> + cpu_to_le32(new_encode_dev(inode->i_rdev));
> + raw_inode->i_block[2] = 0;
> + }
> + } else if (!ext4_has_inline_data(inode)) {
> + for (block = 0; block < EXT4_N_BLOCKS; block++)
> + raw_inode->i_block[block] = ei->i_data[block];
> + }
> +
> + if (likely(!test_opt2(inode->i_sb, HURD_COMPAT))) {
> + u64 ivers = ext4_inode_peek_iversion(inode);
> +
> + raw_inode->i_disk_version = cpu_to_le32(ivers);
> + if (ei->i_extra_isize) {
> + if (EXT4_FITS_IN_INODE(raw_inode, ei, i_version_hi))
> + raw_inode->i_version_hi =
> + cpu_to_le32(ivers >> 32);
> + raw_inode->i_extra_isize =
> + cpu_to_le16(ei->i_extra_isize);
> + }
> + }
> +
> + if (i_projid != EXT4_DEF_PROJID &&
> + !ext4_has_feature_project(inode->i_sb))
> + err = err ?: -EFSCORRUPTED;
> +
> + if (EXT4_INODE_SIZE(inode->i_sb) > EXT4_GOOD_OLD_INODE_SIZE &&
> + EXT4_FITS_IN_INODE(raw_inode, ei, i_projid))
> + raw_inode->i_projid = cpu_to_le32(i_projid);
> +
> + ext4_inode_csum_set(inode, raw_inode, ei);
> + return err;
> +}
> +
> /*
> * ext4_get_inode_loc returns with an extra refcount against the inode's
> * underlying buffer_head on success. If 'in_mem' is true, we have all
> @@ -4580,13 +4727,6 @@ static inline void ext4_inode_set_iversion_queried(struct inode *inode, u64 val)
> else
> inode_set_iversion_queried(inode, val);
> }
> -static inline u64 ext4_inode_peek_iversion(const struct inode *inode)
> -{
> - if (unlikely(EXT4_I(inode)->i_flags & EXT4_EA_INODE_FL))
> - return inode_peek_iversion_raw(inode);
> - else
> - return inode_peek_iversion(inode);
> -}
>
> struct inode *__ext4_iget(struct super_block *sb, unsigned long ino,
> ext4_iget_flags flags, const char *function,
> @@ -4902,50 +5042,6 @@ struct inode *__ext4_iget(struct super_block *sb, unsigned long ino,
> return ERR_PTR(ret);
> }
>
> -static int ext4_inode_blocks_set(struct ext4_inode *raw_inode,
> - struct ext4_inode_info *ei)
> -{
> - struct inode *inode = &(ei->vfs_inode);
> - u64 i_blocks = READ_ONCE(inode->i_blocks);
> - struct super_block *sb = inode->i_sb;
> -
> - if (i_blocks <= ~0U) {
> - /*
> - * i_blocks can be represented in a 32 bit variable
> - * as multiple of 512 bytes
> - */
> - raw_inode->i_blocks_lo = cpu_to_le32(i_blocks);
> - raw_inode->i_blocks_high = 0;
> - ext4_clear_inode_flag(inode, EXT4_INODE_HUGE_FILE);
> - return 0;
> - }
> -
> - /*
> - * This should never happen since sb->s_maxbytes should not have
> - * allowed this, sb->s_maxbytes was set according to the huge_file
> - * feature in ext4_fill_super().
> - */
> - if (!ext4_has_feature_huge_file(sb))
> - return -EFSCORRUPTED;
> -
> - if (i_blocks <= 0xffffffffffffULL) {
> - /*
> - * i_blocks can be represented in a 48 bit variable
> - * as multiple of 512 bytes
> - */
> - raw_inode->i_blocks_lo = cpu_to_le32(i_blocks);
> - raw_inode->i_blocks_high = cpu_to_le16(i_blocks >> 32);
> - ext4_clear_inode_flag(inode, EXT4_INODE_HUGE_FILE);
> - } else {
> - ext4_set_inode_flag(inode, EXT4_INODE_HUGE_FILE);
> - /* i_block is stored in file system block size */
> - i_blocks = i_blocks >> (inode->i_blkbits - 9);
> - raw_inode->i_blocks_lo = cpu_to_le32(i_blocks);
> - raw_inode->i_blocks_high = cpu_to_le16(i_blocks >> 32);
> - }
> - return 0;
> -}
> -
> static void __ext4_update_other_inode_time(struct super_block *sb,
> unsigned long orig_ino,
> unsigned long ino,
> @@ -5006,101 +5102,6 @@ static void ext4_update_other_inodes_time(struct super_block *sb,
> rcu_read_unlock();
> }
>
> -static int ext4_fill_raw_inode(struct inode *inode, struct ext4_inode *raw_inode)
> -{
> - struct ext4_inode_info *ei = EXT4_I(inode);
> - uid_t i_uid;
> - gid_t i_gid;
> - projid_t i_projid;
> - int block;
> - int err;
> -
> - err = ext4_inode_blocks_set(raw_inode, ei);
> -
> - raw_inode->i_mode = cpu_to_le16(inode->i_mode);
> - i_uid = i_uid_read(inode);
> - i_gid = i_gid_read(inode);
> - i_projid = from_kprojid(&init_user_ns, ei->i_projid);
> - if (!(test_opt(inode->i_sb, NO_UID32))) {
> - raw_inode->i_uid_low = cpu_to_le16(low_16_bits(i_uid));
> - raw_inode->i_gid_low = cpu_to_le16(low_16_bits(i_gid));
> - /*
> - * Fix up interoperability with old kernels. Otherwise,
> - * old inodes get re-used with the upper 16 bits of the
> - * uid/gid intact.
> - */
> - if (ei->i_dtime && list_empty(&ei->i_orphan)) {
> - raw_inode->i_uid_high = 0;
> - raw_inode->i_gid_high = 0;
> - } else {
> - raw_inode->i_uid_high =
> - cpu_to_le16(high_16_bits(i_uid));
> - raw_inode->i_gid_high =
> - cpu_to_le16(high_16_bits(i_gid));
> - }
> - } else {
> - raw_inode->i_uid_low = cpu_to_le16(fs_high2lowuid(i_uid));
> - raw_inode->i_gid_low = cpu_to_le16(fs_high2lowgid(i_gid));
> - raw_inode->i_uid_high = 0;
> - raw_inode->i_gid_high = 0;
> - }
> - raw_inode->i_links_count = cpu_to_le16(inode->i_nlink);
> -
> - EXT4_INODE_SET_XTIME(i_ctime, inode, raw_inode);
> - EXT4_INODE_SET_XTIME(i_mtime, inode, raw_inode);
> - EXT4_INODE_SET_XTIME(i_atime, inode, raw_inode);
> - EXT4_EINODE_SET_XTIME(i_crtime, ei, raw_inode);
> -
> - raw_inode->i_dtime = cpu_to_le32(ei->i_dtime);
> - raw_inode->i_flags = cpu_to_le32(ei->i_flags & 0xFFFFFFFF);
> - if (likely(!test_opt2(inode->i_sb, HURD_COMPAT)))
> - raw_inode->i_file_acl_high =
> - cpu_to_le16(ei->i_file_acl >> 32);
> - raw_inode->i_file_acl_lo = cpu_to_le32(ei->i_file_acl);
> - ext4_isize_set(raw_inode, ei->i_disksize);
> -
> - raw_inode->i_generation = cpu_to_le32(inode->i_generation);
> - if (S_ISCHR(inode->i_mode) || S_ISBLK(inode->i_mode)) {
> - if (old_valid_dev(inode->i_rdev)) {
> - raw_inode->i_block[0] =
> - cpu_to_le32(old_encode_dev(inode->i_rdev));
> - raw_inode->i_block[1] = 0;
> - } else {
> - raw_inode->i_block[0] = 0;
> - raw_inode->i_block[1] =
> - cpu_to_le32(new_encode_dev(inode->i_rdev));
> - raw_inode->i_block[2] = 0;
> - }
> - } else if (!ext4_has_inline_data(inode)) {
> - for (block = 0; block < EXT4_N_BLOCKS; block++)
> - raw_inode->i_block[block] = ei->i_data[block];
> - }
> -
> - if (likely(!test_opt2(inode->i_sb, HURD_COMPAT))) {
> - u64 ivers = ext4_inode_peek_iversion(inode);
> -
> - raw_inode->i_disk_version = cpu_to_le32(ivers);
> - if (ei->i_extra_isize) {
> - if (EXT4_FITS_IN_INODE(raw_inode, ei, i_version_hi))
> - raw_inode->i_version_hi =
> - cpu_to_le32(ivers >> 32);
> - raw_inode->i_extra_isize =
> - cpu_to_le16(ei->i_extra_isize);
> - }
> - }
> -
> - if (i_projid != EXT4_DEF_PROJID &&
> - !ext4_has_feature_project(inode->i_sb))
> - err = err ?: -EFSCORRUPTED;
> -
> - if (EXT4_INODE_SIZE(inode->i_sb) > EXT4_GOOD_OLD_INODE_SIZE &&
> - EXT4_FITS_IN_INODE(raw_inode, ei, i_projid))
> - raw_inode->i_projid = cpu_to_le32(i_projid);
> -
> - ext4_inode_csum_set(inode, raw_inode, ei);
> - return err;
> -}
> -
> /*
> * Post the struct inode info into an on-disk inode location in the
> * buffer-cache. This gobbles the caller's reference to the
> --
> 2.31.1
>
--
Jan Kara <jack@suse.com>
SUSE Labs, CR
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH v5 3/3] ext4: prevent getting empty inode buffer
2021-09-01 2:09 ` [PATCH v5 3/3] ext4: prevent getting empty inode buffer Zhang Yi
@ 2021-09-20 13:58 ` Jan Kara
0 siblings, 0 replies; 9+ messages in thread
From: Jan Kara @ 2021-09-20 13:58 UTC (permalink / raw)
To: Zhang Yi; +Cc: linux-ext4, tytso, adilger.kernel, jack, yukuai3
On Wed 01-09-21 10:09:55, Zhang Yi wrote:
> In ext4_get_inode_loc(), we may skip IO and get an zero && uptodate
> inode buffer when the inode monopolize an inode block for performance
> reason. For most cases, ext4_mark_iloc_dirty() will fill the inode
> buffer to make it fine, but we could miss this call if something bad
> happened. Finally, __ext4_get_inode_loc_noinmem() may probably get an
> empty inode buffer and trigger ext4 error.
>
> For example, if we remove a nonexistent xattr on inode A,
> ext4_xattr_set_handle() will return ENODATA before invoking
> ext4_mark_iloc_dirty(), it will left an uptodate but zero buffer. We
> will get checksum error message in ext4_iget() when getting inode again.
>
> EXT4-fs error (device sda): ext4_lookup:1784: inode #131074: comm cat: iget: checksum invalid
>
> Even worse, if we allocate another inode B at the same inode block, it
> will corrupt the inode A on disk when write back inode B.
>
> So this patch initialize the inode buffer by filling the in-mem inode
> contents if we skip read I/O, ensure that the buffer is really uptodate.
>
> Signed-off-by: Zhang Yi <yi.zhang@huawei.com>
Looks good. Feel free to add:
Reviewed-by: Jan Kara <jack@suse.cz>
Honza
> ---
> fs/ext4/inode.c | 24 ++++++++++++++----------
> 1 file changed, 14 insertions(+), 10 deletions(-)
>
> diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
> index 3c36e701e30e..a8388ec91f9f 100644
> --- a/fs/ext4/inode.c
> +++ b/fs/ext4/inode.c
> @@ -4441,12 +4441,12 @@ static int ext4_fill_raw_inode(struct inode *inode, struct ext4_inode *raw_inode
>
> /*
> * ext4_get_inode_loc returns with an extra refcount against the inode's
> - * underlying buffer_head on success. If 'in_mem' is true, we have all
> - * data in memory that is needed to recreate the on-disk version of this
> - * inode.
> + * underlying buffer_head on success. If we pass 'inode' and it does not
> + * have in-inode xattr, we have all inode data in memory that is needed
> + * to recreate the on-disk version of this inode.
> */
> static int __ext4_get_inode_loc(struct super_block *sb, unsigned long ino,
> - struct ext4_iloc *iloc, int in_mem,
> + struct inode *inode, struct ext4_iloc *iloc,
> ext4_fsblk_t *ret_block)
> {
> struct ext4_group_desc *gdp;
> @@ -4486,7 +4486,7 @@ static int __ext4_get_inode_loc(struct super_block *sb, unsigned long ino,
> * is the only valid inode in the block, we need not read the
> * block.
> */
> - if (in_mem) {
> + if (inode && !ext4_test_inode_state(inode, EXT4_STATE_XATTR)) {
> struct buffer_head *bitmap_bh;
> int i, start;
>
> @@ -4514,8 +4514,13 @@ static int __ext4_get_inode_loc(struct super_block *sb, unsigned long ino,
> }
> brelse(bitmap_bh);
> if (i == start + inodes_per_block) {
> + struct ext4_inode *raw_inode =
> + (struct ext4_inode *) (bh->b_data + iloc->offset);
> +
> /* all other inodes are free, so skip I/O */
> memset(bh->b_data, 0, bh->b_size);
> + if (!ext4_test_inode_state(inode, EXT4_STATE_NEW))
> + ext4_fill_raw_inode(inode, raw_inode);
> set_buffer_uptodate(bh);
> unlock_buffer(bh);
> goto has_buffer;
> @@ -4576,7 +4581,7 @@ static int __ext4_get_inode_loc_noinmem(struct inode *inode,
> ext4_fsblk_t err_blk;
> int ret;
>
> - ret = __ext4_get_inode_loc(inode->i_sb, inode->i_ino, iloc, 0,
> + ret = __ext4_get_inode_loc(inode->i_sb, inode->i_ino, NULL, iloc,
> &err_blk);
>
> if (ret == -EIO)
> @@ -4591,9 +4596,8 @@ int ext4_get_inode_loc(struct inode *inode, struct ext4_iloc *iloc)
> ext4_fsblk_t err_blk;
> int ret;
>
> - /* We have all inode data except xattrs in memory here. */
> - ret = __ext4_get_inode_loc(inode->i_sb, inode->i_ino, iloc,
> - !ext4_test_inode_state(inode, EXT4_STATE_XATTR), &err_blk);
> + ret = __ext4_get_inode_loc(inode->i_sb, inode->i_ino, inode, iloc,
> + &err_blk);
>
> if (ret == -EIO)
> ext4_error_inode_block(inode, err_blk, EIO,
> @@ -4606,7 +4610,7 @@ int ext4_get_inode_loc(struct inode *inode, struct ext4_iloc *iloc)
> int ext4_get_fc_inode_loc(struct super_block *sb, unsigned long ino,
> struct ext4_iloc *iloc)
> {
> - return __ext4_get_inode_loc(sb, ino, iloc, 0, NULL);
> + return __ext4_get_inode_loc(sb, ino, NULL, iloc, NULL);
> }
>
> static bool ext4_should_enable_dax(struct inode *inode)
> --
> 2.31.1
>
--
Jan Kara <jack@suse.com>
SUSE Labs, CR
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH v5 0/3] ext4: fix a inode checksum error
2021-09-01 2:09 [PATCH v5 0/3] ext4: fix a inode checksum error Zhang Yi
` (2 preceding siblings ...)
2021-09-01 2:09 ` [PATCH v5 3/3] ext4: prevent getting empty inode buffer Zhang Yi
@ 2021-10-14 10:57 ` Zhang Yi
2021-10-28 14:47 ` Theodore Ts'o
4 siblings, 0 replies; 9+ messages in thread
From: Zhang Yi @ 2021-10-14 10:57 UTC (permalink / raw)
To: tytso; +Cc: adilger.kernel, jack, yukuai3, linux-ext4
Hi, Ted. Do you consider to merge this remaining patch set?
Thanks,
Yi.
On 2021/9/1 10:09, Zhang Yi wrote:
> We find a checksum error and a inode corruption problem while doing
> stress test, this 3 patches address to fix them. The first two patches
> are prepare to do the fix, the last patch fix these two issue.
>
> - Checksum error
>
> EXT4-fs error (device sda): ext4_lookup:1784: inode #131074: comm cat: iget: checksum invalid
>
> - Inode corruption
>
> e2fsck 1.46.0 (29-Jan-2020)
> Pass 1: Checking inodes, blocks, and sizes
> Pass 2: Checking directory structure
> Entry 'foo' in / (2) has deleted/unused inode 17. Clear<y>? yes
> Pass 3: Checking directory connectivity
> Pass 4: Checking reference counts
> Pass 5: Checking group summary information
> Inode bitmap differences: -17
> Fix<y>? yes
> Free inodes count wrong for group #0 (32750, counted=32751).
> Fix<y>? yes
> Free inodes count wrong (32750, counted=32751).
> Fix<y>? yes
>
> Changes since v4:
> - Drop first three already applied patches.
> - Remove 'in_mem' parameter passing __ext4_get_inode_loc() in the last
> patch.
>
> Changes since v3:
> - Postpone initialization to ext4_do_update_inode() may cause zeroout
> newly set xattr entry. So switch to do initialization in
> __ext4_get_inode_loc().
>
> Changes since v2:
> - Instead of using WARN_ON_ONCE to prevent ext4_do_update_inode()
> return before filling the inode buffer, keep the error and postpone
> the report after the updating in the third patch.
> - Fix some language mistacks in the last patch.
>
> Changes since v1:
> - Add a patch to prevent ext4_do_update_inode() return before filling
> the inode buffer.
> - Do not use BH_New flag to indicate the empty buffer, postpone the
> zero and uptodate logic into ext4_do_update_inode() before filling
> the inode buffer.
>
> Thanks,
> Yi.
>
> Zhang Yi (3):
> ext4: factor out ext4_fill_raw_inode()
> ext4: move ext4_fill_raw_inode() related functions
> ext4: prevent getting empty inode buffer
>
> fs/ext4/inode.c | 316 +++++++++++++++++++++++++-----------------------
> 1 file changed, 165 insertions(+), 151 deletions(-)
>
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH v5 0/3] ext4: fix a inode checksum error
2021-09-01 2:09 [PATCH v5 0/3] ext4: fix a inode checksum error Zhang Yi
` (3 preceding siblings ...)
2021-10-14 10:57 ` [PATCH v5 0/3] ext4: fix a inode checksum error Zhang Yi
@ 2021-10-28 14:47 ` Theodore Ts'o
4 siblings, 0 replies; 9+ messages in thread
From: Theodore Ts'o @ 2021-10-28 14:47 UTC (permalink / raw)
To: linux-ext4, Zhang Yi; +Cc: Theodore Ts'o, jack, adilger.kernel, yukuai3
On Wed, 1 Sep 2021 10:09:52 +0800, Zhang Yi wrote:
> We find a checksum error and a inode corruption problem while doing
> stress test, this 3 patches address to fix them. The first two patches
> are prepare to do the fix, the last patch fix these two issue.
>
> - Checksum error
>
> EXT4-fs error (device sda): ext4_lookup:1784: inode #131074: comm cat: iget: checksum invalid
>
> [...]
Applied, thanks!
[1/3] ext4: factor out ext4_fill_raw_inode()
commit: 6a6af6b5ee4363e29fc86552ddfd94c5000d845d
[2/3] ext4: move ext4_fill_raw_inode() related functions
commit: 4bc2b511975a6fe5ce9f2b2dcc84b152354d1f90
[3/3] ext4: prevent getting empty inode buffer
commit: fb5becb151d54652ae613e34157f1aadb8447ed8
Best regards,
--
Theodore Ts'o <tytso@mit.edu>
^ permalink raw reply [flat|nested] 9+ messages in thread
end of thread, other threads:[~2021-10-28 14:47 UTC | newest]
Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-09-01 2:09 [PATCH v5 0/3] ext4: fix a inode checksum error Zhang Yi
2021-09-01 2:09 ` [PATCH v5 1/3] ext4: factor out ext4_fill_raw_inode() Zhang Yi
2021-09-20 13:57 ` Jan Kara
2021-09-01 2:09 ` [PATCH v5 2/3] ext4: move ext4_fill_raw_inode() related functions Zhang Yi
2021-09-20 13:57 ` Jan Kara
2021-09-01 2:09 ` [PATCH v5 3/3] ext4: prevent getting empty inode buffer Zhang Yi
2021-09-20 13:58 ` Jan Kara
2021-10-14 10:57 ` [PATCH v5 0/3] ext4: fix a inode checksum error Zhang Yi
2021-10-28 14:47 ` Theodore Ts'o
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.