lustre-devel-lustre.org archive mirror
* [lustre-devel] LDISKFS-fs error: osd_iget: special inode unallocated, Remounting filesystem read-only
@ 2023-11-28 19:47 Cyrus Ramavarapu via lustre-devel
  2023-11-28 20:03 ` Andreas Dilger via lustre-devel
  0 siblings, 1 reply; 4+ messages in thread
From: Cyrus Ramavarapu via lustre-devel @ 2023-11-28 19:47 UTC (permalink / raw)
  To: lustre-devel

Hello,
 
I have recently started seeing sanity-lfsck failures in tests 18g, 23b, and 23c on Ubuntu 20.04 (kernel 5.15.0-1051-azure). The MDT filesystem goes read-only, which either prevents LFSCK from starting or causes its operations to fail. In all cases, the logs on the MDS show the following:
 
Nov 20 20:06:59 e72f0907-59ba-4ffd-9528-2e3ad47050e4-mdsmgs-a0-vm kernel: LDISKFS-fs error (device dm-0): osd_iget:500: inode #195: comm mdt03_003: iget: special inode unallocated
Nov 20 20:06:59 e72f0907-59ba-4ffd-9528-2e3ad47050e4-mdsmgs-a0-vm kernel: Aborting journal on device dm-0-8.
Nov 20 20:06:59 e72f0907-59ba-4ffd-9528-2e3ad47050e4-mdsmgs-a0-vm kernel: LustreError: 29024:0:(osd_handler.c:1787:osd_trans_commit_cb()) transaction @0x00000000a2d278af commit error: 2
Nov 20 20:06:59 e72f0907-59ba-4ffd-9528-2e3ad47050e4-mdsmgs-a0-vm kernel: LDISKFS-fs (dm-0): Remounting filesystem read-only
 
LFSCK operations, if they start at all, fail with error code 117 (EFSCORRUPTED):
 
00000020:00000001:8.0:1700166322.660540:0:43212:0:(lu_object.c:908:lu_object_find_at()) Process leaving (rc=18446744073709551499 : -117 : ffffffffffffff8b)
00100000:00000001:8.0:1700166322.660541:0:43212:0:(lfsck_layout.c:3241:lfsck_layout_scan_orphan_one()) Process leaving via out (rc=18446744073709551499 : -117 : 0xffffffffffffff8b)
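
The three values in each "Process leaving" line are the same return code shown as unsigned decimal, signed decimal, and hex. A minimal sketch of that conversion, for readers checking the arithmetic; the helper names rc_as_signed and print_rc are hypothetical and not part of Lustre:

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Convert a return code that was logged as an unsigned 64-bit value back
 * to its signed form without relying on implementation-defined casts. */
static int64_t rc_as_signed(uint64_t logged_rc)
{
	if (logged_rc <= (uint64_t)INT64_MAX)
		return (int64_t)logged_rc;
	/* Two's complement: value = -(~logged_rc) - 1 */
	return -(int64_t)(~logged_rc) - 1;
}

/* Print the same three views that the debug log line shows. */
static void print_rc(uint64_t logged_rc)
{
	printf("rc=%" PRIu64 " : %" PRId64 " : %" PRIx64 "\n",
	       logged_rc, rc_as_signed(logged_rc), logged_rc);
}
```

Calling print_rc(18446744073709551499ULL) prints "rc=18446744073709551499 : -117 : ffffffffffffff8b", matching the log, since 2^64 - 117 = 18446744073709551499.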
 
In both cases, the error comes from an ldiskfs_iget operation which passes the LDISKFS_IGET_SPECIAL flag to __ext4_iget. A recent ext4 patch started checking for this flag and will return EFSCORRUPTED if the inode is unallocated (https://lkml.kernel.org/stable/20230320145452.175177331@linuxfoundation.org/ ).
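
As a rough illustration of the behavior described above, here is a simplified userspace model. It is not the kernel code. The MODEL_* names and the flag value are hypothetical. 117 is the EFSCORRUPTED value seen in the logs. Returning ESTALE (116 on Linux) for the non-special case is an assumption based on my reading of upstream ext4:

```c
#include <stdbool.h>

#define MODEL_ESTALE        116     /* assumed result for ordinary lookups */
#define MODEL_EFSCORRUPTED  117     /* error code reported in the thread */
#define MODEL_IGET_SPECIAL  0x0001  /* hypothetical flag bit for this model */

/* Model of the check: looking up an unallocated inode is only treated as
 * filesystem corruption when the caller claimed the inode is "special". */
static int model_iget_check(unsigned int flags, bool allocated)
{
	if (allocated)
		return 0;
	if (flags & MODEL_IGET_SPECIAL)
		return -MODEL_EFSCORRUPTED; /* kernel also logs an fs error */
	return -MODEL_ESTALE;
}
```

In this model, a caller that always sets the special flag turns every lookup of an unallocated inode into a corruption error, which is the pattern reported for inode #195.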
 
LU-13166 (https://review.whamcloud.com/c/fs/lustre-release/+/37421 ) made ldiskfs_iget calls always pass LDISKFS_IGET_SPECIAL, which feels too broad to me in the context of the upstream ext4 change. At the moment I am investigating removing the LDISKFS_IGET_SPECIAL flag from ldiskfs_iget to see how it impacts the LFSCK tests and to determine whether a more targeted change can satisfy the intent of LU-13166.
 
Any suggestions or thoughts on how to approach this problem would be greatly appreciated. Additional logs or debugging information can be provided if needed.
 
Thank you and best,
Cyrus Ramavarapu
_______________________________________________
lustre-devel mailing list
lustre-devel@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-devel-lustre.org


* Re: [lustre-devel] LDISKFS-fs error: osd_iget: special inode unallocated, Remounting filesystem read-only
  2023-11-28 19:47 [lustre-devel] LDISKFS-fs error: osd_iget: special inode unallocated, Remounting filesystem read-only Cyrus Ramavarapu via lustre-devel
@ 2023-11-28 20:03 ` Andreas Dilger via lustre-devel
  2023-12-04 14:19   ` Cyrus Ramavarapu via lustre-devel
  0 siblings, 1 reply; 4+ messages in thread
From: Andreas Dilger via lustre-devel @ 2023-11-28 20:03 UTC (permalink / raw)
  To: Cyrus Ramavarapu; +Cc: lustre-devel

I would suggest patching the ext4 iget() to print the requested inode number, and lu_object_find_at() to print the FID.

I suspect that the fix would be to make lu_object_find_at() just handle the -EFSCORRUPTED error like -ENOENT, and consider the FID bad.
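
A minimal sketch of that kind of error mapping, assuming a hypothetical helper applied to the lookup result. The name model_map_lookup_error is invented for illustration and is not an existing Lustre function:

```c
#include <errno.h>

#define MODEL_EFSCORRUPTED 117 /* value reported in the thread's logs */

/* Treat "inode is corrupted/unallocated" like "object does not exist",
 * so the caller considers the FID bad instead of failing hard.
 * All other return codes pass through unchanged. */
static int model_map_lookup_error(int rc)
{
	if (rc == -MODEL_EFSCORRUPTED)
		return -ENOENT;
	return rc;
}
```

This only changes what the caller sees; it does not stop the underlying iget() from marking the filesystem as having an error.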

The iget() error needs to be avoided as well (ideally with flags instead of a patch), so the bad inode lookup doesn't cause the filesystem to go read-only.  

AFAIK, it is "legal" for knfsd to do inode lookups with bad inode numbers, so possibly we need to filter "special" inode numbers in osd-ldiskfs, except root, to avoid the error?  Knowing which inode number is being accessed would help here.

Cheers, Andreas


* Re: [lustre-devel] LDISKFS-fs error: osd_iget: special inode unallocated, Remounting filesystem read-only
  2023-11-28 20:03 ` Andreas Dilger via lustre-devel
@ 2023-12-04 14:19   ` Cyrus Ramavarapu via lustre-devel
  2023-12-04 18:57     ` Cyrus Ramavarapu via lustre-devel
  0 siblings, 1 reply; 4+ messages in thread
From: Cyrus Ramavarapu via lustre-devel @ 2023-12-04 14:19 UTC (permalink / raw)
  To: lustre-devel

Hi Andreas,

Thank you very much for your suggestions. I definitely agree that we need to filter the LDISKFS inodes depending on whether they are reserved, and selectively pass in the IGET_SPECIAL flag. I produced the small patch below, which is currently passing my tests and undergoing additional testing. I will work on getting a JIRA ticket opened and submitting this patch through the proper channels.

At the moment, I don't believe we need additional handling for -EFSCORRUPTED in lu_object_find_at(), since the inode loading path within this function propagates any error to callers; however, it is quite possible I am missing something, and I will investigate further.

Best,
Cyrus

From d0a514d52bee6105eabd67aeeb440bba08822a35 Mon Sep 17 00:00:00 2001
From: Cyrus Ramavarapu <cramavarapu@microsoft.com>
Date: Tue, 28 Nov 2023 16:44:13 +0000
Subject: [PATCH] Filter inodes during ldiskfs_iget to selectively apply the LDISKFS_IGET_SPECIAL flag only when a reserved inode is being retrieved.
 Reserved inodes are defined as inodes with a number less than LDISKFS_FIRST_INO(sb).

---
 lustre/osd-ldiskfs/osd_internal.h | 11 ++++++++---
 1 file changed, 8 insertions(+), 3 deletions(-)

diff --git a/lustre/osd-ldiskfs/osd_internal.h b/lustre/osd-ldiskfs/osd_internal.h
index ac0169cbde..1e5b529709 100644
--- a/lustre/osd-ldiskfs/osd_internal.h
+++ b/lustre/osd-ldiskfs/osd_internal.h
@@ -949,9 +949,14 @@ static inline void i_projid_write(struct inode *inode, __u32 projid)
 #endif

 #ifdef HAVE_LDISKFS_IGET_WITH_FLAGS
-# define osd_ldiskfs_iget(sb, ino) \
-               ldiskfs_iget((sb), (ino), \
-                            LDISKFS_IGET_HANDLE | LDISKFS_IGET_SPECIAL)
+static inline struct inode *osd_ldiskfs_iget(struct super_block *sb,
+                                             unsigned long ino) {
+       ldiskfs_iget_flags flags = LDISKFS_IGET_HANDLE;
+
+       if (ino < LDISKFS_FIRST_INO(sb))
+               flags |= LDISKFS_IGET_SPECIAL;
+       return ldiskfs_iget(sb, ino, flags);
+}
 #else
 # define osd_ldiskfs_iget(sb, ino) ldiskfs_iget((sb), (ino))
 #endif
--
2.25.1
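
For readers without a kernel tree at hand, the flag selection in the patch above can be modeled in userspace roughly as follows. The struct, the MODEL_* flag values, and the first_ino value of 11 in the usage note are assumptions for illustration only; the real code uses LDISKFS_FIRST_INO(sb) and the ldiskfs flag definitions:

```c
#include <assert.h>

#define MODEL_IGET_SPECIAL 0x0001 /* hypothetical values for this model */
#define MODEL_IGET_HANDLE  0x0002

/* Stand-in for the one superblock field the patch consults. */
struct model_sb {
	unsigned long first_ino; /* first non-reserved inode number */
};

/* Mirror of the patched logic: always pass HANDLE, and add SPECIAL only
 * for inode numbers below the first non-reserved inode. */
static unsigned int model_iget_flags(const struct model_sb *sb,
				     unsigned long ino)
{
	unsigned int flags = MODEL_IGET_HANDLE;

	assert(sb);
	if (ino < sb->first_ino)
		flags |= MODEL_IGET_SPECIAL;
	return flags;
}
```

With first_ino assumed to be 11, inode #195 from the log above would be looked up without the special flag, while a reserved inode such as 8 would still get it.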


* Re: [lustre-devel] LDISKFS-fs error: osd_iget: special inode unallocated, Remounting filesystem read-only
  2023-12-04 14:19   ` Cyrus Ramavarapu via lustre-devel
@ 2023-12-04 18:57     ` Cyrus Ramavarapu via lustre-devel
  0 siblings, 0 replies; 4+ messages in thread
From: Cyrus Ramavarapu via lustre-devel @ 2023-12-04 18:57 UTC (permalink / raw)
  To: lustre-devel

There is a bug in the patch I posted earlier in this thread: it mishandles project quotas, because prj_quota_inum is greater than s_first_ino. I will post an update when I have a working patch.

Best,
Cyrus Ramavarapu
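
To illustrate the problem described in this message, here is a small userspace model. It is only a sketch and not the eventual fix, which the thread does not show. The struct, both predicate names, and the inode numbers in the usage note are hypothetical:

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-in for the two superblock fields mentioned in the thread. */
struct model_sb {
	unsigned long first_ino;      /* models s_first_ino */
	unsigned long prj_quota_inum; /* models prj_quota_inum, 0 if unused */
};

/* Test used by the earlier patch: reserved means "below first_ino". */
static bool model_reserved_by_number(const struct model_sb *sb,
				     unsigned long ino)
{
	assert(sb);
	return ino < sb->first_ino;
}

/* One possible extension (an assumption, not the posted fix): also treat
 * the project quota inode as special, even though its number is higher. */
static bool model_reserved_or_prj_quota(const struct model_sb *sb,
					unsigned long ino)
{
	assert(sb);
	return ino < sb->first_ino ||
	       (sb->prj_quota_inum != 0 && ino == sb->prj_quota_inum);
}
```

With first_ino = 11 and a project quota inode assumed to be 20, the number-only test returns false for inode 20, so the earlier patch would not pass the special flag for it, while the extended test returns true.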

