All of lore.kernel.org
 help / color / mirror / Atom feed
From: shrikanth hegde <sshegde@linux.vnet.ibm.com>
To: djwong@kernel.org, dchinner@redhat.com
Cc: linux-xfs@vger.kernel.org,
	Srikar Dronamraju <srikar@linux.vnet.ibm.com>,
	ojaswin@linux.ibm.com,
	shrikanth hegde <sshegde@linux.vnet.ibm.com>
Subject: xfs: system fails to boot up due to Internal error xfs_trans_cancel
Date: Fri, 17 Feb 2023 16:45:12 +0530	[thread overview]
Message-ID: <e5004868-4a03-93e5-5077-e7ed0e533996@linux.vnet.ibm.com> (raw)

We are observing panic on boot upon loading the latest stable tree(v6.2-rc4) in 
one of our systems. System fails to come up. System was booting well 
with v5.17, v5.19 kernel. We started seeing this issue when loading v6.0 kernel.

Panic Log is below.
[  333.390539] ------------[ cut here ]------------
[  333.390552] WARNING: CPU: 56 PID: 12450 at fs/xfs/xfs_inode.c:1839 xfs_iunlink_lookup+0x58/0x80 [xfs]
[  333.390615] Modules linked in: nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 ip_set nf_tables nfnetlink rfkill sunrpc pseries_rng xts vmx_crypto xfs libcrc32c sd_mod sg ibmvscsi ibmveth scsi_transport_srp nvme nvme_core t10_pi crc64_rocksoft crc64 dm_mirror dm_region_hash dm_log dm_mod
[  333.390645] CPU: 56 PID: 12450 Comm: rm Not tainted 6.2.0-rc4ssh+ #4
[  333.390649] Hardware name: IBM,9080-HEX POWER10 (raw) 0x800200 0xf000006 of:IBM,FW1010.22 (NH1010_122) hv:phyp pSeries
[  333.390652] NIP:  c0080000004bfa80 LR: c0080000004bfa4c CTR: c000000000ea28d0
[  333.390655] REGS: c0000000442bb8c0 TRAP: 0700   Not tainted  (6.2.0-rc4ssh+)
[  333.390658] MSR:  8000000000029033 <SF,EE,ME,IR,DR,RI,LE>  CR: 24002842  XER: 00000000
[  333.390666] CFAR: c0080000004bfa54 IRQMASK: 0
[  333.390666] GPR00: c00000003b69c0c8 c0000000442bbb60 c008000000568300 0000000000000000
[  333.390666] GPR04: 00000000002ec44d 0000000000000000 0000000000000000 c000000004b27d78
[  333.390666] GPR08: 0000000000000000 c000000004b27e28 0000000000000000 fffffffffffffffd
[  333.390666] GPR12: 0000000000000040 c000004afecc5880 0000000106620918 0000000000000001
[  333.390666] GPR16: 000000010bd36e10 0000000106620dc8 0000000106620e58 0000000106620e90
[  333.390666] GPR20: 0000000106620e30 c0000000880ba938 0000000000200000 00000000002ec44d
[  333.390666] GPR24: 000000000008170d 000000000000000d c0000000519f4800 00000000002ec44d
[  333.390666] GPR28: c0000000880ba800 c00000003b69c000 c0000000833edd20 000000000008170d
[  333.390702] NIP [c0080000004bfa80] xfs_iunlink_lookup+0x58/0x80 [xfs]
[  333.390756] LR [c0080000004bfa4c] xfs_iunlink_lookup+0x24/0x80 [xfs]
[  333.390810] Call Trace:
[  333.390811] [c0000000442bbb60] [c0000000833edd20] 0xc0000000833edd20 (unreliable)
[  333.390816] [c0000000442bbb80] [c0080000004c0094] xfs_iunlink+0x1bc/0x280 [xfs]
[  333.390869] [c0000000442bbc00] [c0080000004c3f84] xfs_remove+0x1dc/0x310 [xfs]
[  333.390922] [c0000000442bbc70] [c0080000004be180] xfs_vn_unlink+0x68/0xf0 [xfs]
[  333.390975] [c0000000442bbcd0] [c000000000576b24] vfs_unlink+0x1b4/0x3d0
[  333.390981] [c0000000442bbd20] [c00000000057e5d8] do_unlinkat+0x2b8/0x390
[  333.390985] [c0000000442bbde0] [c00000000057e708] sys_unlinkat+0x58/0xb0
[  333.390989] [c0000000442bbe10] [c0000000000335d0] system_call_exception+0x150/0x3b0
[  333.390994] [c0000000442bbe50] [c00000000000c554] system_call_common+0xf4/0x258
[  333.390999] --- interrupt: c00 at 0x7fffa47230a0
[  333.391001] NIP:  00007fffa47230a0 LR: 00000001066138ac CTR: 0000000000000000
[  333.391004] REGS: c0000000442bbe80 TRAP: 0c00   Not tainted  (6.2.0-rc4ssh+)
[  333.391007] MSR:  800000000280f033 <SF,VEC,VSX,EE,PR,FP,ME,IR,DR,RI,LE>  CR: 22002202  XER: 00000000
[  333.391016] IRQMASK: 0
[  333.391016] GPR00: 0000000000000124 00007fffdb9330b0 00007fffa4807300 0000000000000008
[  333.391016] GPR04: 000000010bd36f18 0000000000000000 0000000000000000 0000000000000003
[  333.391016] GPR08: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
[  333.391016] GPR12: 0000000000000000 00007fffa48ba340 0000000106620918 0000000000000001
[  333.391016] GPR16: 000000010bd36e10 0000000106620dc8 0000000106620e58 0000000106620e90
[  333.391016] GPR20: 0000000106620e30 0000000106620e00 0000000106620c40 0000000000000002
[  333.391016] GPR24: 0000000106620c38 00000001066208d8 0000000000000000 0000000106620d20
[  333.391016] GPR28: 00007fffdb933408 000000010bd24cec 00007fffdb933408 000000010bd36e10
[  333.391050] NIP [00007fffa47230a0] 0x7fffa47230a0
[  333.391052] LR [00000001066138ac] 0x1066138ac
[  333.391054] --- interrupt: c00
[  333.391056] Code: 2c230000 4182002c e9230020 2fa90000 419e0020 38210020 e8010010 7c0803a6 4e800020 60000000 60000000 60000000 <0fe00000> 60000000 60000000 60000000
[  333.391069] ---[ end trace 0000000000000000 ]---
[  333.391072] XFS (dm-0): Internal error xfs_trans_cancel at line 1097 of file fs/xfs/xfs_trans.c.  Caller xfs_remove+0x1a0/0x310 [xfs]
[  333.391128] CPU: 56 PID: 12450 Comm: rm Tainted: G        W          6.2.0-rc4ssh+ #4
[  333.391131] Hardware name: IBM,9080-HEX POWER10 (raw) 0x800200 0xf000006 of:IBM,FW1010.22 (NH1010_122) hv:phyp pSeries
[  333.391135] Call Trace:
[  333.391136] [c0000000442bbb10] [c000000000e84f4c] dump_stack_lvl+0x70/0xa4 (unreliable)
[  333.391142] [c0000000442bbb50] [c0080000004a6a84] xfs_error_report+0x5c/0x80 [xfs]
[  333.391194] [c0000000442bbbb0] [c0080000004d67b0] xfs_trans_cancel+0x178/0x1b0 [xfs]
[  333.391249] [c0000000442bbc00] [c0080000004c3f48] xfs_remove+0x1a0/0x310 [xfs]
[  333.391302] [c0000000442bbc70] [c0080000004be180] xfs_vn_unlink+0x68/0xf0 [xfs]
[  333.391355] [c0000000442bbcd0] [c000000000576b24] vfs_unlink+0x1b4/0x3d0
[  333.391359] [c0000000442bbd20] [c00000000057e5d8] do_unlinkat+0x2b8/0x390
[  333.391363] [c0000000442bbde0] [c00000000057e708] sys_unlinkat+0x58/0xb0
[  333.391367] [c0000000442bbe10] [c0000000000335d0] system_call_exception+0x150/0x3b0
[  333.391371] [c0000000442bbe50] [c00000000000c554] system_call_common+0xf4/0x258
[  333.391376] --- interrupt: c00 at 0x7fffa47230a0
[  333.391378] NIP:  00007fffa47230a0 LR: 00000001066138ac CTR: 0000000000000000
[  333.391381] REGS: c0000000442bbe80 TRAP: 0c00   Tainted: G        W           (6.2.0-rc4ssh+)
[  333.391385] MSR:  800000000280f033 <SF,VEC,VSX,EE,PR,FP,ME,IR,DR,RI,LE>  CR: 22002202  XER: 00000000
[  333.391393] IRQMASK: 0
[  333.391393] GPR00: 0000000000000124 00007fffdb9330b0 00007fffa4807300 0000000000000008
[  333.391393] GPR04: 000000010bd36f18 0000000000000000 0000000000000000 0000000000000003
[  333.391393] GPR08: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
[  333.391393] GPR12: 0000000000000000 00007fffa48ba340 0000000106620918 0000000000000001
[  333.391393] GPR16: 000000010bd36e10 0000000106620dc8 0000000106620e58 0000000106620e90
[  333.391393] GPR20: 0000000106620e30 0000000106620e00 0000000106620c40 0000000000000002
[  333.391393] GPR24: 0000000106620c38 00000001066208d8 0000000000000000 0000000106620d20
[  333.391393] GPR28: 00007fffdb933408 000000010bd24cec 00007fffdb933408 000000010bd36e10
[  333.391427] NIP [00007fffa47230a0] 0x7fffa47230a0
[  333.391429] LR [00000001066138ac] 0x1066138ac
[  333.391431] --- interrupt: c00
[  333.394067] XFS (dm-0): Corruption of in-memory data (0x8) detected at xfs_trans_cancel+0x190/0x1b0 [xfs] (fs/xfs/xfs_trans.c:1098).  Shutting down filesystem.
[  333.394125] XFS (dm-0): Please unmount the filesystem and rectify the problem(s)



we did a git bisect between 5.17 and 6.0. Bisect points to commit 04755d2e5821 
as the bad commit.
Short description of commit:
commit 04755d2e5821b3afbaadd09fe5df58d04de36484 (refs/bisect/bad)
Author: Dave Chinner <dchinner@redhat.com>
Date:   Thu Jul 14 11:42:39 2022 +1000

    xfs: refactor xlog_recover_process_iunlinks()


Git bisect log:
git bisect start
# good: [26291c54e111ff6ba87a164d85d4a4e134b7315c] Linux 5.17-rc2
git bisect good 26291c54e111ff6ba87a164d85d4a4e134b7315c
# bad: [4fe89d07dcc2804c8b562f6c7896a45643d34b2f] Linux 6.0
git bisect bad 4fe89d07dcc2804c8b562f6c7896a45643d34b2f
# good: [d7227785e384d4422b3ca189aa5bf19f462337cc] Merge tag 'sound-5.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
git bisect good d7227785e384d4422b3ca189aa5bf19f462337cc
# good: [526942b8134cc34d25d27f95dfff98b8ce2f6fcd] Merge tag 'ata-5.20-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/libata
git bisect good 526942b8134cc34d25d27f95dfff98b8ce2f6fcd
# good: [328141e51e6fc79d21168bfd4e356dddc2ec7491] Merge tag 'mmc-v5.20' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc
git bisect good 328141e51e6fc79d21168bfd4e356dddc2ec7491
# bad: [eb555cb5b794f4e12a9897f3d46d5a72104cd4a7] Merge tag '5.20-rc-ksmbd-server-fixes' of git://git.samba.org/ksmbd
git bisect bad eb555cb5b794f4e12a9897f3d46d5a72104cd4a7
# bad: [f20c95b46b8fa3ad34b3ea2e134337f88591468b] Merge tag 'tpmdd-next-v5.20' of git://git.kernel.org/pub/scm/linux/kernel/git/jarkko/linux-tpmdd
git bisect bad f20c95b46b8fa3ad34b3ea2e134337f88591468b
# bad: [fad235ed4338749a66ddf32971d4042b9ef47f44] Merge tag 'arm-late-6.0' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc
git bisect bad fad235ed4338749a66ddf32971d4042b9ef47f44
# good: [e495274793ea602415d050452088a496abcd9e6c] Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma
git bisect good e495274793ea602415d050452088a496abcd9e6c
# good: [9daee913dc8d15eb65e0ff560803ab1c28bb480b] Merge tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4
git bisect good 9daee913dc8d15eb65e0ff560803ab1c28bb480b
# bad: [29b1d469f3f6842ee4115f0b21f018fc44176468] Merge tag 'trace-rtla-v5.20' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace
git bisect bad 29b1d469f3f6842ee4115f0b21f018fc44176468
# good: [932b42c66cb5d0ca9800b128415b4ad6b1952b3e] xfs: replace XFS_IFORK_Q with a proper predicate function
git bisect good 932b42c66cb5d0ca9800b128415b4ad6b1952b3e
# bad: [35c5a09f5346e690df7ff2c9075853e340ee10b3] Merge tag 'xfs-buf-lockless-lookup-5.20' of git://git.kernel.org/pub/scm/linux/kernel/git/dgc/linux-xfs into xfs-5.20-mergeB
git bisect bad 35c5a09f5346e690df7ff2c9075853e340ee10b3
# bad: [fad743d7cd8bd92d03c09e71f29eace860f50415] xfs: add log item precommit operation
git bisect bad fad743d7cd8bd92d03c09e71f29eace860f50415
# bad: [04755d2e5821b3afbaadd09fe5df58d04de36484] xfs: refactor xlog_recover_process_iunlinks()
git bisect bad 04755d2e5821b3afbaadd09fe5df58d04de36484
# good: [a4454cd69c66bf3e3bbda352b049732f836fc6b2] xfs: factor the xfs_iunlink functions
git bisect good a4454cd69c66bf3e3bbda352b049732f836fc6b2
Bisecting: 0 revisions left to test after this (roughly 0 steps)
[4fcc94d653270fcc7800dbaf3b11f78cb462b293] xfs: track the iunlink list pointer in the xfs_inode


Please reach out, in case any more details are needed. sent with very limited
knowledge of xfs system. these logs are from 5.19 kernel.

# xfs_info /home
meta-data=/dev/nvme0n1p1         isize=512    agcount=4, agsize=13107200 blks
         =                       sectsz=4096  attr=2, projid32bit=1
         =                       crc=1        finobt=1, sparse=1, rmapbt=0
         =                       reflink=1    bigtime=0 inobtcount=0
data     =                       bsize=4096   blocks=52428800, imaxpct=25
         =                       sunit=0      swidth=0 blks
naming   =version 2              bsize=4096   ascii-ci=0, ftype=1
log      =internal log           bsize=4096   blocks=25600, version=2
         =                       sectsz=4096  sunit=1 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0

# xfs_info -V
xfs_info version 5.0.0

# uname -a
5.19.0-rc2

             reply	other threads:[~2023-02-17 11:15 UTC|newest]

Thread overview: 28+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-02-17 11:15 shrikanth hegde [this message]
2023-02-17 11:25 ` xfs: system fails to boot up due to Internal error xfs_trans_cancel shrikanth hegde
2023-02-17 11:30 ` shrikanth hegde
2023-02-17 15:03 ` Linux regression tracking #adding (Thorsten Leemhuis)
2023-02-17 16:53 ` Darrick J. Wong
2023-02-17 20:25   ` Dave Chinner
2023-02-18  7:17   ` shrikanth hegde
2023-02-22 16:41     ` Darrick J. Wong
2023-02-24  8:04       ` shrikanth hegde
2023-02-24 21:18         ` Darrick J. Wong
2023-03-09 14:26       ` Ritesh Harjani
2023-03-09 17:27         ` Darrick J. Wong
2023-03-16  4:46           ` Ritesh Harjani
2023-03-16  5:20             ` Darrick J. Wong
2023-03-17 20:44               ` Darrick J. Wong
2023-03-18 16:50                 ` Ritesh Harjani
2023-03-18 19:20                   ` Darrick J. Wong
2023-03-20  5:20                     ` Ritesh Harjani
2023-04-17 11:16                       ` Linux regression tracking (Thorsten Leemhuis)
2023-04-18  4:56                         ` Darrick J. Wong
2023-04-21 13:04                           ` Linux regression tracking (Thorsten Leemhuis)
2023-06-05 13:27                           ` Thorsten Leemhuis
2023-06-05 21:57                             ` Darrick J. Wong
2023-06-06  2:46                               ` Dave Chinner
2023-06-06  3:22                                 ` Darrick J. Wong
2023-06-06 11:23                                 ` Thorsten Leemhuis
2023-03-10  0:29         ` Dave Chinner
2023-03-16  4:48           ` Ritesh Harjani

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=e5004868-4a03-93e5-5077-e7ed0e533996@linux.vnet.ibm.com \
    --to=sshegde@linux.vnet.ibm.com \
    --cc=dchinner@redhat.com \
    --cc=djwong@kernel.org \
    --cc=linux-xfs@vger.kernel.org \
    --cc=ojaswin@linux.ibm.com \
    --cc=srikar@linux.vnet.ibm.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.