All of lore.kernel.org
 help / color / mirror / Atom feed
* XFS new panic after new merges
@ 2021-09-09  3:04 Murphy Zhou
  2021-09-09 12:17 ` Murphy Zhou
  0 siblings, 1 reply; 3+ messages in thread
From: Murphy Zhou @ 2021-09-09  3:04 UTC (permalink / raw)
  To: linux-xfs; +Cc: zlang

Hi,

After this batch of updates in Linus tree,

eceae1e Merge tag 'configfs-5.15' of git://git.infradead.org/users/hch/configfs
265113f Merge tag 'dlm-5.15' of git://git.kernel.org/pub/scm/linux/kernel/git/teigland/linux-dlm
111c1aa Merge tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4
815409a Merge tag 'ovl-update-5.15' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs
412106c Merge tag 'erofs-for-5.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs
89594c7 Merge tag 'fscache-next-20210829' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs
90c90cd Merge tag 'xfs-5.15-merge-6' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux
57c78a2 Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
bcfeebb Merge branch 'exit-cleanups-for-v5.15' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace
4898370 Merge branch 'siginfo-si_trapno-for-v5.15' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace
65266a7 Merge remote-tracking branch 'tip/sched/arm64' into for-next/core
61dc131 Merge tag 'iomap-5.15-merge-2' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux.git


xfstests on xfs start to panic some times when running xfs/006 or xfs/264.

It's reproducible but not always. Sometimes on some servers 100 loops of
these tests pass. Sometimes it panics on the first run of xfs/006.

Due to it's not always reproducible, bisecting is not an option. And it's
hard to say it can not be reproduced on any condition. So far it can be
reproduced via these mkfs options:

crc=1 finobt=1,sparse=1,rmapbt=0 reflink=1,bigtime=0,inobtcount=0
or
crc=1 finobt=1,sparse=1,rmapbt=1 reflink=1,bigtime=0,inobtcount=0
or
crc=1 finobt=1,sparse=1,rmapbt=0 reflink=1,bigtime=1,inobtcount=1
or
crc=1 finobt=1,sparse=1,rmapbt=1 reflink=1,bigtime=1,inobtcount=1

Thanks zorro's help on debugging.

Attaching some debug info, I can send more if you need.

Thanks,
Murphy

# xfs_info output

meta-data=/dev/sda5              isize=512    agcount=16, agsize=245696 blks
         =                       sectsz=512   attr=2, projid32bit=1
         =                       crc=1        finobt=1, sparse=1, rmapbt=0
         =                       reflink=1    bigtime=0 inobtcount=0
data     =                       bsize=4096   blocks=3931136, imaxpct=25
         =                       sunit=64     swidth=192 blks
naming   =version 2              bsize=4096   ascii-ci=0, ftype=1
log      =internal log           bsize=4096   blocks=5184, version=2
         =                       sectsz=512   sunit=64 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0
meta-data=/dev/sda4              isize=512    agcount=16, agsize=245696 blks
         =                       sectsz=512   attr=2, projid32bit=1
         =                       crc=1        finobt=1, sparse=1, rmapbt=0
         =                       reflink=1    bigtime=0 inobtcount=0
data     =                       bsize=4096   blocks=3931136, imaxpct=25
         =                       sunit=64     swidth=192 blks
naming   =version 2              bsize=4096   ascii-ci=0, ftype=1
log      =internal log           bsize=4096   blocks=5184, version=2
         =                       sectsz=512   sunit=64 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0

# panic console output

[  514.174583] run fstests xfs/006 at 2021-09-08 05:07:23 
[  515.075816] device-mapper: core: CONFIG_IMA_DISABLE_HTABLE is disabled. Duplicate IMA measurements will not be recorded in the IMA log. 
[  515.139517] device-mapper: uevent: version 1.0.3 
[  515.160591] device-mapper: ioctl: 4.45.0-ioctl (2021-03-22) initialised: dm-devel@redhat.com 
[  515.595944] XFS (dm-0): Mounting V5 Filesystem 
[  515.642351] XFS (dm-0): Ending clean mount 
[  515.667439] xfs filesystem being mounted at /mnt/xfstests/scratch supports timestamps until 2038 (0x7fffffff) 
[  515.899072] XFS (dm-0): xlog_verify_grant_tail: space > BBTOB(tail_blocks) 
[  516.439917] Buffer I/O error on dev dm-0, logical block 31457152, async page read 
[  516.476888] Buffer I/O error on dev dm-0, logical block 31457153, async page read 
[  516.513741] Buffer I/O error on dev dm-0, logical block 31457154, async page read 
[  516.550635] Buffer I/O error on dev dm-0, logical block 31457155, async page read 
[  516.584744] Buffer I/O error on dev dm-0, logical block 31457156, async page read 
[  516.618025] Buffer I/O error on dev dm-0, logical block 31457157, async page read 
[  516.652346] Buffer I/O error on dev dm-0, logical block 31457158, async page read 
[  516.686287] Buffer I/O error on dev dm-0, logical block 31457159, async page read 
[  516.817762] XFS (dm-0): Unmounting Filesystem 
[  516.870615] XFS (dm-0): log I/O error -5 
[  516.894791] BUG: unable to handle page fault for address: ffffffffb804b420 
[  516.925708] #PF: supervisor write access in kernel mode 
[  516.949221] #PF: error_code(0x0002) - not-present page 
[  516.973270] PGD 825015067 P4D 825015067 PUD 825016063 PMD 10b528063 PTE 800ffff7da5b4062 
[  517.014986] Oops: 0002 [#1] SMP PTI 
[  517.030764] CPU: 29 PID: 983 Comm: kworker/29:1H Kdump: loaded Tainted: G          I       5.14.0+ #1 
[  517.072364] Hardware name: HP ProLiant DL380p Gen8, BIOS P70 07/01/2015 
[  517.102164] Workqueue: xfs-log/dm-0 xlog_ioend_work [xfs] 
[  517.126689] RIP: 0010:native_queued_spin_lock_slowpath.part.0+0x15f/0x190 
[  517.157762] Code: f3 90 48 8b 31 48 85 f6 74 f6 eb d9 c1 ee 12 83 e0 03 83 ee 01 48 c1 e0 05 48 63 f6 48 05 00 94 02 00 48 03 04 f5 80 5a 40 b7 <48> 89 08 8b 41 08 85 c0 75 09 f3 90 8b 41 08 85 c0 74 f7 48 8b 31 
[  517.242726] RSP: 0018:ffffa30ec522fb90 EFLAGS: 00010282 
[  517.266249] RAX: ffffffffb804b420 RBX: 0000000100009000 RCX: ffff91f2af969400 
[  517.298608] RDX: ffff91ebc12b18c0 RSI: 000000000000135b RDI: ffff91ebc12b18c0 
[  517.330983] RBP: ffff91ef112af4a0 R08: 0000000000780000 R09: ffff91eec0d34b28 
[  517.363119] R10: 00000000000001ad R11: 000000000086a1d9 R12: ffff91ebc12b1880 
[  517.395630] R13: ffff91ebc12b18c0 R14: 0000000000000008 R15: ffff91ef111ae000 
[  517.427924] FS:  0000000000000000(0000) GS:ffff91f2af940000(0000) knlGS:0000000000000000 
[  517.464344] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033 
[  517.491411] CR2: ffffffffb804b420 CR3: 0000000480814003 CR4: 00000000001706e0 
[  517.526491] Call Trace: 
[  517.538343]  _raw_spin_lock+0x1a/0x20 
[  517.556233]  xfs_trans_ail_delete+0x29/0x100 [xfs] 
[  517.577945]  xfs_buf_item_done+0x24/0x30 [xfs] 
[  517.598086]  xfs_buf_ioend+0x72/0x1d0 [xfs] 
[  517.616961]  xfs_trans_committed_bulk+0x18e/0x300 [xfs] 
[  517.640970]  ? select_idle_sibling+0x3f4/0x4d0 
[  517.661445]  ? enqueue_task+0x4b/0x140 
[  517.678393]  ? check_preempt_curr+0x2f/0x70 
[  517.697236]  ? ttwu_do_wakeup+0x17/0x150 
[  517.714940]  ? try_to_wake_up+0x1cd/0x4d0 
[  517.732945]  ? __wake_up_common_lock+0x8a/0xc0 
[  517.752906]  ? __raw_callee_save___native_queued_spin_unlock+0x11/0x1e 
[  517.782397]  xlog_cil_committed+0x15d/0x190 [xfs] 
[  517.803609]  xlog_cil_process_committed+0x57/0x80 [xfs] 
[  517.827306]  xlog_state_shutdown_callbacks+0xd0/0xf0 [xfs] 
[  517.852174]  xlog_force_shutdown+0xe5/0x160 [xfs] 
[  517.873427]  xfs_do_force_shutdown+0x51/0x130 [xfs] 
[  517.895689]  xlog_ioend_work+0x60/0xb0 [xfs] 
[  517.914951]  process_one_work+0x1eb/0x390 
[  517.932968]  worker_thread+0x53/0x3d0 
[  517.949055]  ? process_one_work+0x390/0x390 
[  517.967971]  kthread+0x10f/0x130 
[  517.982565]  ? set_kthread_struct+0x40/0x40 
[  518.003240]  ret_from_fork+0x22/0x30 
[  518.021094] Modules linked in: dm_mod tls rfkill iTCO_wdt iTCO_vendor_support intel_rapl_msr intel_rapl_common sb_edac x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm mgag200 i2c_algo_bit irqbypass drm_kms_helper ipmi_ssif rapl intel_cstate syscopyarea intel_uncore sysfillrect acpi_ipmi sunrpc pcspkr sysimgblt fb_sys_fops lpc_ich cec hpilo ipmi_si ipmi_devintf ioatdma ipmi_msghandler acpi_power_meter dca drm fuse xfs libcrc32c sr_mod cdrom ata_generic sd_mod t10_pi sg ata_piix crct10dif_pclmul crc32_pclmul crc32c_intel libata ghash_clmulni_intel serio_raw tg3 hpsa hpwdt scsi_transport_sas 
[  518.265394] CR2: ffffffffb804b420 

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: XFS new panic after new merges
  2021-09-09  3:04 XFS new panic after new merges Murphy Zhou
@ 2021-09-09 12:17 ` Murphy Zhou
  2021-09-13 21:15   ` Darrick J. Wong
  0 siblings, 1 reply; 3+ messages in thread
From: Murphy Zhou @ 2021-09-09 12:17 UTC (permalink / raw)
  To: linux-xfs, zlang

On Thu, Sep 09, 2021 at 11:04:25AM +0800, Murphy Zhou wrote:
> Hi,
> 
> After this batch of updates in Linus tree,
> 
> eceae1e Merge tag 'configfs-5.15' of git://git.infradead.org/users/hch/configfs
> 265113f Merge tag 'dlm-5.15' of git://git.kernel.org/pub/scm/linux/kernel/git/teigland/linux-dlm
> 111c1aa Merge tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4
> 815409a Merge tag 'ovl-update-5.15' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs
> 412106c Merge tag 'erofs-for-5.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs
> 89594c7 Merge tag 'fscache-next-20210829' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs
> 90c90cd Merge tag 'xfs-5.15-merge-6' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux
> 57c78a2 Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
> bcfeebb Merge branch 'exit-cleanups-for-v5.15' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace
> 4898370 Merge branch 'siginfo-si_trapno-for-v5.15' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace
> 65266a7 Merge remote-tracking branch 'tip/sched/arm64' into for-next/core
> 61dc131 Merge tag 'iomap-5.15-merge-2' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux.git
> 
> 
> xfstests on xfs start to panic some times when running xfs/006 or xfs/264.
> 
> It's reproducible but not always. Sometimes on some servers 100 loops of
> these tests pass. Sometimes it panics on the first run of xfs/006.
> 
> Due to it's not always reproducible, bisecting is not an option. And it's
> hard to say it can not be reproduced on any condition. So far it can be
> reproduced via these mkfs options:
> 
> crc=1 finobt=1,sparse=1,rmapbt=0 reflink=1,bigtime=0,inobtcount=0
> or
> crc=1 finobt=1,sparse=1,rmapbt=1 reflink=1,bigtime=0,inobtcount=0
> or
> crc=1 finobt=1,sparse=1,rmapbt=0 reflink=1,bigtime=1,inobtcount=1
> or
> crc=1 finobt=1,sparse=1,rmapbt=1 reflink=1,bigtime=1,inobtcount=1

Reproduced with:
crc=0 finobt=0,sparse=0,rmapbt=0 reflink=0,bigtime=0,inobtcount=0

#xfs info:
meta-data=/dev/sda4              isize=256    agcount=16, agsize=245696 blks
         =                       sectsz=512   attr=2, projid32bit=1
         =                       crc=0        finobt=0, sparse=0, rmapbt=0
         =                       reflink=0    bigtime=0 inobtcount=0
data     =                       bsize=4096   blocks=3931136, imaxpct=25
         =                       sunit=64     swidth=192 blks
naming   =version 2              bsize=4096   ascii-ci=0, ftype=1
log      =internal log           bsize=4096   blocks=2560, version=2
         =                       sectsz=512   sunit=64 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0

# call traces:
[  936.833349] run fstests xfs/264 at 2021-09-09 05:30:38 
[  939.619013] XFS (dm-0): Mounting V4 Filesystem 
[  939.754638] XFS (dm-0): Ending clean mount 
[  939.773517] xfs filesystem being mounted at /mnt/xfstests/scratch supports timestamps until 2038 (0x7fffffff) 
[  940.054513] XFS (dm-0): xlog_verify_grant_tail: space > BBTOB(tail_blocks) 
[  947.346567] Buffer I/O error on dev dm-0, logical block 31457152, async page read 
[  947.382053] Buffer I/O error on dev dm-0, logical block 31457153, async page read 
[  947.416477] Buffer I/O error on dev dm-0, logical block 31457154, async page read 
[  947.450734] Buffer I/O error on dev dm-0, logical block 31457155, async page read 
[  947.484872] Buffer I/O error on dev dm-0, logical block 31457156, async page read 
[  947.518948] Buffer I/O error on dev dm-0, logical block 31457157, async page read 
[  947.553457] Buffer I/O error on dev dm-0, logical block 31457158, async page read 
[  947.587913] Buffer I/O error on dev dm-0, logical block 31457159, async page read 
[  947.711538] XFS (dm-0): Unmounting Filesystem 
[  947.775594] XFS (dm-0): log I/O error -5 
[  947.793587] XFS (dm-0): Log I/O Error (0x2) detected at xlog_ioend_work+0x60/0xb0 [xfs] (fs/xfs/xfs_log.c:1377).  Shutting down filesystem. 
[  947.850642] XFS (dm-0): Please unmount the filesystem and rectify the problem(s) 
[  947.969751] XFS (dm-0): Mounting V4 Filesystem 
[  948.281026] XFS (dm-0): Starting recovery (logdev: internal) 
[  948.415502] XFS (dm-0): Ending recovery (logdev: internal) 
[  948.440958] xfs filesystem being mounted at /mnt/xfstests/scratch supports timestamps until 2038 (0x7fffffff) 
[  948.490144] XFS (dm-0): Unmounting Filesystem 
[  948.547763] XFS (dm-0): Mounting V4 Filesystem 
[  948.697482] XFS (dm-0): Ending clean mount 
[  948.716031] xfs filesystem being mounted at /mnt/xfstests/scratch supports timestamps until 2038 (0x7fffffff) 
[  948.956457] XFS (dm-0): xlog_verify_grant_tail: space > BBTOB(tail_blocks) 
[  959.464801] Buffer I/O error on dev dm-0, logical block 31457152, async page read 
[  959.499451] Buffer I/O error on dev dm-0, logical block 31457153, async page read 
[  959.533860] Buffer I/O error on dev dm-0, logical block 31457154, async page read 
[  959.568015] Buffer I/O error on dev dm-0, logical block 31457155, async page read 
[  959.601914] Buffer I/O error on dev dm-0, logical block 31457156, async page read 
[  959.636180] Buffer I/O error on dev dm-0, logical block 31457157, async page read 
[  959.670276] Buffer I/O error on dev dm-0, logical block 31457158, async page read 
[  959.704800] Buffer I/O error on dev dm-0, logical block 31457159, async page read 
[  959.828239] XFS (dm-0): Unmounting Filesystem 
[  959.884350] XFS (dm-0): log I/O error -5 
[  959.908497] BUG: unable to handle page fault for address: 0000000000784ca0 
[  959.939059] #PF: supervisor write access in kernel mode 
[  959.962998] #PF: error_code(0x0002) - not-present page 
[  959.987047] PGD 0 P4D 0  
[  959.998392] Oops: 0002 [#1] SMP PTI 
[  960.014227] CPU: 6 PID: 278 Comm: kworker/6:1H Kdump: loaded Not tainted 5.14.0+ #1 
[  960.049489] Hardware name: HP ProLiant DL388p Gen8, BIOS P70 09/18/2013 
[  960.079784] Workqueue: xfs-log/dm-0 xlog_ioend_work [xfs] 
[  960.104130] RIP: 0010:native_queued_spin_lock_slowpath.part.0+0x15f/0x190 
[  960.134590] Code: f3 90 48 8b 31 48 85 f6 74 f6 eb d9 c1 ee 12 83 e0 03 83 ee 01 48 c1 e0 05 48 63 f6 48 05 40 94 02 00 48 03 04 f5 80 2a 21 83 <48> 89 08 8b 41 08 85 c0 75 09 f3 90 8b 41 08 85 c0 74 f7 48 8b 31 
[  960.220279] RSP: 0018:ffffba7203a5fb90 EFLAGS: 00010206 
[  960.244282] RAX: 0000000000784ca0 RBX: 0000000b00003800 RCX: ffff9827ef629440 
[  960.278921] RDX: ffff982432186cc0 RSI: 0000000000003aa5 RDI: ffff982432186cc0 
[  960.311690] RBP: ffff982430072b58 R08: 00000000001c0000 R09: ffff98240052e3e8 
[  960.344228] R10: 0000000000000035 R11: 00000000006b2112 R12: ffff982432186c80 
[  960.377161] R13: ffff982432186cc0 R14: 0000000000000008 R15: ffff9820c424f000 
[  960.409632] FS:  0000000000000000(0000) GS:ffff9827ef600000(0000) knlGS:0000000000000000 
[  960.446075] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033 
[  960.472285] CR2: 0000000000784ca0 CR3: 00000004431cc002 CR4: 00000000001706e0 
[  960.505219] Call Trace: 
[  960.516170]  _raw_spin_lock+0x1a/0x20 
[  960.532960]  xfs_trans_ail_delete+0x29/0x100 [xfs] 
[  960.554808]  xfs_buf_item_done+0x24/0x30 [xfs] 
[  960.575268]  xfs_buf_ioend+0x72/0x1d0 [xfs] 
[  960.594423]  xfs_trans_committed_bulk+0x18e/0x300 [xfs] 
[  960.618208]  ? select_idle_sibling+0x3f4/0x4d0 
[  960.638689]  ? enqueue_task+0x4b/0x140 
[  960.655919]  ? check_preempt_curr+0x2f/0x70 
[  960.675101]  ? ttwu_do_wakeup+0x17/0x150 
[  960.693280]  ? try_to_wake_up+0x1cd/0x4d0 
[  960.711314]  ? __wake_up_common_lock+0x8a/0xc0 
[  960.731621]  ? __raw_callee_save___native_queued_spin_unlock+0x11/0x1e 
[  960.761348]  xlog_cil_committed+0x15d/0x190 [xfs] 
[  960.782956]  xlog_cil_process_committed+0x57/0x80 [xfs] 
[  960.806288]  xlog_state_shutdown_callbacks+0xd0/0xf0 [xfs] 
[  960.831350]  xlog_force_shutdown+0xe5/0x160 [xfs] 
[  960.852604]  xfs_do_force_shutdown+0x51/0x130 [xfs] 
[  960.874967]  xlog_ioend_work+0x60/0xb0 [xfs] 
[  960.894218]  process_one_work+0x1eb/0x390 
[  960.912442]  worker_thread+0x53/0x3d0 
[  960.928900]  ? process_one_work+0x390/0x390 
[  960.947735]  kthread+0x10f/0x130 
[  960.962145]  ? set_kthread_struct+0x40/0x40 
[  960.980946]  ret_from_fork+0x22/0x30 
[  960.996965] Modules linked in: dm_mod tls rfkill intel_rapl_msr intel_rapl_common iTCO_wdt sb_edac iTCO_vendor_support x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm irqbypass rapl mgag200 intel_cstate i2c_algo_bit drm_kms_helper intel_uncore pcspkr syscopyarea sysfillrect sysimgblt sunrpc ipmi_ssif fb_sys_fops lpc_ich cec hpilo ioatdma acpi_ipmi dca ipmi_si ipmi_devintf ipmi_msghandler acpi_power_meter drm fuse xfs libcrc32c ata_generic sd_mod t10_pi sg ata_piix crct10dif_pclmul crc32_pclmul crc32c_intel libata ghash_clmulni_intel tg3 serio_raw hpsa hpwdt scsi_transport_sas 
[  961.246844] CR2: 0000000000784ca0 

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: XFS new panic after new merges
  2021-09-09 12:17 ` Murphy Zhou
@ 2021-09-13 21:15   ` Darrick J. Wong
  0 siblings, 0 replies; 3+ messages in thread
From: Darrick J. Wong @ 2021-09-13 21:15 UTC (permalink / raw)
  To: Murphy Zhou; +Cc: linux-xfs, zlang

On Thu, Sep 09, 2021 at 08:17:04PM +0800, Murphy Zhou wrote:
> On Thu, Sep 09, 2021 at 11:04:25AM +0800, Murphy Zhou wrote:
> > Hi,
> > 
> > After this batch of updates in Linus tree,
> > 
> > eceae1e Merge tag 'configfs-5.15' of git://git.infradead.org/users/hch/configfs
> > 265113f Merge tag 'dlm-5.15' of git://git.kernel.org/pub/scm/linux/kernel/git/teigland/linux-dlm
> > 111c1aa Merge tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4
> > 815409a Merge tag 'ovl-update-5.15' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs
> > 412106c Merge tag 'erofs-for-5.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs
> > 89594c7 Merge tag 'fscache-next-20210829' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs
> > 90c90cd Merge tag 'xfs-5.15-merge-6' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux
> > 57c78a2 Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
> > bcfeebb Merge branch 'exit-cleanups-for-v5.15' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace
> > 4898370 Merge branch 'siginfo-si_trapno-for-v5.15' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace
> > 65266a7 Merge remote-tracking branch 'tip/sched/arm64' into for-next/core
> > 61dc131 Merge tag 'iomap-5.15-merge-2' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux.git
> > 
> > 
> > xfstests on xfs start to panic some times when running xfs/006 or xfs/264.
> > 
> > It's reproducible but not always. Sometimes on some servers 100 loops of
> > these tests pass. Sometimes it panics on the first run of xfs/006.
> > 
> > Due to it's not always reproducible, bisecting is not an option. And it's
> > hard to say it can not be reproduced on any condition. So far it can be
> > reproduced via these mkfs options:
> > 
> > crc=1 finobt=1,sparse=1,rmapbt=0 reflink=1,bigtime=0,inobtcount=0
> > or
> > crc=1 finobt=1,sparse=1,rmapbt=1 reflink=1,bigtime=0,inobtcount=0
> > or
> > crc=1 finobt=1,sparse=1,rmapbt=0 reflink=1,bigtime=1,inobtcount=1
> > or
> > crc=1 finobt=1,sparse=1,rmapbt=1 reflink=1,bigtime=1,inobtcount=1
> 
> Reproduced with:
> crc=0 finobt=0,sparse=0,rmapbt=0 reflink=0,bigtime=0,inobtcount=0
> 
> #xfs info:
> meta-data=/dev/sda4              isize=256    agcount=16, agsize=245696 blks
>          =                       sectsz=512   attr=2, projid32bit=1
>          =                       crc=0        finobt=0, sparse=0, rmapbt=0
>          =                       reflink=0    bigtime=0 inobtcount=0
> data     =                       bsize=4096   blocks=3931136, imaxpct=25
>          =                       sunit=64     swidth=192 blks
> naming   =version 2              bsize=4096   ascii-ci=0, ftype=1
> log      =internal log           bsize=4096   blocks=2560, version=2
>          =                       sectsz=512   sunit=64 blks, lazy-count=1
> realtime =none                   extsz=4096   blocks=0, rtextents=0
> 
> # call traces:
> [  936.833349] run fstests xfs/264 at 2021-09-09 05:30:38 
> [  939.619013] XFS (dm-0): Mounting V4 Filesystem 
> [  939.754638] XFS (dm-0): Ending clean mount 
> [  939.773517] xfs filesystem being mounted at /mnt/xfstests/scratch supports timestamps until 2038 (0x7fffffff) 
> [  940.054513] XFS (dm-0): xlog_verify_grant_tail: space > BBTOB(tail_blocks) 
> [  947.346567] Buffer I/O error on dev dm-0, logical block 31457152, async page read 
> [  947.382053] Buffer I/O error on dev dm-0, logical block 31457153, async page read 
> [  947.416477] Buffer I/O error on dev dm-0, logical block 31457154, async page read 
> [  947.450734] Buffer I/O error on dev dm-0, logical block 31457155, async page read 
> [  947.484872] Buffer I/O error on dev dm-0, logical block 31457156, async page read 
> [  947.518948] Buffer I/O error on dev dm-0, logical block 31457157, async page read 
> [  947.553457] Buffer I/O error on dev dm-0, logical block 31457158, async page read 
> [  947.587913] Buffer I/O error on dev dm-0, logical block 31457159, async page read 
> [  947.711538] XFS (dm-0): Unmounting Filesystem 
> [  947.775594] XFS (dm-0): log I/O error -5 
> [  947.793587] XFS (dm-0): Log I/O Error (0x2) detected at xlog_ioend_work+0x60/0xb0 [xfs] (fs/xfs/xfs_log.c:1377).  Shutting down filesystem. 
> [  947.850642] XFS (dm-0): Please unmount the filesystem and rectify the problem(s) 
> [  947.969751] XFS (dm-0): Mounting V4 Filesystem 
> [  948.281026] XFS (dm-0): Starting recovery (logdev: internal) 
> [  948.415502] XFS (dm-0): Ending recovery (logdev: internal) 
> [  948.440958] xfs filesystem being mounted at /mnt/xfstests/scratch supports timestamps until 2038 (0x7fffffff) 
> [  948.490144] XFS (dm-0): Unmounting Filesystem 
> [  948.547763] XFS (dm-0): Mounting V4 Filesystem 
> [  948.697482] XFS (dm-0): Ending clean mount 
> [  948.716031] xfs filesystem being mounted at /mnt/xfstests/scratch supports timestamps until 2038 (0x7fffffff) 
> [  948.956457] XFS (dm-0): xlog_verify_grant_tail: space > BBTOB(tail_blocks) 
> [  959.464801] Buffer I/O error on dev dm-0, logical block 31457152, async page read 
> [  959.499451] Buffer I/O error on dev dm-0, logical block 31457153, async page read 
> [  959.533860] Buffer I/O error on dev dm-0, logical block 31457154, async page read 
> [  959.568015] Buffer I/O error on dev dm-0, logical block 31457155, async page read 
> [  959.601914] Buffer I/O error on dev dm-0, logical block 31457156, async page read 
> [  959.636180] Buffer I/O error on dev dm-0, logical block 31457157, async page read 
> [  959.670276] Buffer I/O error on dev dm-0, logical block 31457158, async page read 
> [  959.704800] Buffer I/O error on dev dm-0, logical block 31457159, async page read 
> [  959.828239] XFS (dm-0): Unmounting Filesystem 
> [  959.884350] XFS (dm-0): log I/O error -5 
> [  959.908497] BUG: unable to handle page fault for address: 0000000000784ca0 

Interesting... if you enable KASAN, does it have anything helpful to
report?  This kind of looks like the xfs_log_item could have been
recycled before we finished shutting down the log...?

--D

> [  959.939059] #PF: supervisor write access in kernel mode 
> [  959.962998] #PF: error_code(0x0002) - not-present page 
> [  959.987047] PGD 0 P4D 0  
> [  959.998392] Oops: 0002 [#1] SMP PTI 
> [  960.014227] CPU: 6 PID: 278 Comm: kworker/6:1H Kdump: loaded Not tainted 5.14.0+ #1 
> [  960.049489] Hardware name: HP ProLiant DL388p Gen8, BIOS P70 09/18/2013 
> [  960.079784] Workqueue: xfs-log/dm-0 xlog_ioend_work [xfs] 
> [  960.104130] RIP: 0010:native_queued_spin_lock_slowpath.part.0+0x15f/0x190 
> [  960.134590] Code: f3 90 48 8b 31 48 85 f6 74 f6 eb d9 c1 ee 12 83 e0 03 83 ee 01 48 c1 e0 05 48 63 f6 48 05 40 94 02 00 48 03 04 f5 80 2a 21 83 <48> 89 08 8b 41 08 85 c0 75 09 f3 90 8b 41 08 85 c0 74 f7 48 8b 31 
> [  960.220279] RSP: 0018:ffffba7203a5fb90 EFLAGS: 00010206 
> [  960.244282] RAX: 0000000000784ca0 RBX: 0000000b00003800 RCX: ffff9827ef629440 
> [  960.278921] RDX: ffff982432186cc0 RSI: 0000000000003aa5 RDI: ffff982432186cc0 
> [  960.311690] RBP: ffff982430072b58 R08: 00000000001c0000 R09: ffff98240052e3e8 
> [  960.344228] R10: 0000000000000035 R11: 00000000006b2112 R12: ffff982432186c80 
> [  960.377161] R13: ffff982432186cc0 R14: 0000000000000008 R15: ffff9820c424f000 
> [  960.409632] FS:  0000000000000000(0000) GS:ffff9827ef600000(0000) knlGS:0000000000000000 
> [  960.446075] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033 
> [  960.472285] CR2: 0000000000784ca0 CR3: 00000004431cc002 CR4: 00000000001706e0 
> [  960.505219] Call Trace: 
> [  960.516170]  _raw_spin_lock+0x1a/0x20 
> [  960.532960]  xfs_trans_ail_delete+0x29/0x100 [xfs] 
> [  960.554808]  xfs_buf_item_done+0x24/0x30 [xfs] 
> [  960.575268]  xfs_buf_ioend+0x72/0x1d0 [xfs] 
> [  960.594423]  xfs_trans_committed_bulk+0x18e/0x300 [xfs] 
> [  960.618208]  ? select_idle_sibling+0x3f4/0x4d0 
> [  960.638689]  ? enqueue_task+0x4b/0x140 
> [  960.655919]  ? check_preempt_curr+0x2f/0x70 
> [  960.675101]  ? ttwu_do_wakeup+0x17/0x150 
> [  960.693280]  ? try_to_wake_up+0x1cd/0x4d0 
> [  960.711314]  ? __wake_up_common_lock+0x8a/0xc0 
> [  960.731621]  ? __raw_callee_save___native_queued_spin_unlock+0x11/0x1e 
> [  960.761348]  xlog_cil_committed+0x15d/0x190 [xfs] 
> [  960.782956]  xlog_cil_process_committed+0x57/0x80 [xfs] 
> [  960.806288]  xlog_state_shutdown_callbacks+0xd0/0xf0 [xfs] 
> [  960.831350]  xlog_force_shutdown+0xe5/0x160 [xfs] 
> [  960.852604]  xfs_do_force_shutdown+0x51/0x130 [xfs] 
> [  960.874967]  xlog_ioend_work+0x60/0xb0 [xfs] 
> [  960.894218]  process_one_work+0x1eb/0x390 
> [  960.912442]  worker_thread+0x53/0x3d0 
> [  960.928900]  ? process_one_work+0x390/0x390 
> [  960.947735]  kthread+0x10f/0x130 
> [  960.962145]  ? set_kthread_struct+0x40/0x40 
> [  960.980946]  ret_from_fork+0x22/0x30 
> [  960.996965] Modules linked in: dm_mod tls rfkill intel_rapl_msr intel_rapl_common iTCO_wdt sb_edac iTCO_vendor_support x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm irqbypass rapl mgag200 intel_cstate i2c_algo_bit drm_kms_helper intel_uncore pcspkr syscopyarea sysfillrect sysimgblt sunrpc ipmi_ssif fb_sys_fops lpc_ich cec hpilo ioatdma acpi_ipmi dca ipmi_si ipmi_devintf ipmi_msghandler acpi_power_meter drm fuse xfs libcrc32c ata_generic sd_mod t10_pi sg ata_piix crct10dif_pclmul crc32_pclmul crc32c_intel libata ghash_clmulni_intel tg3 serio_raw hpsa hpwdt scsi_transport_sas 
> [  961.246844] CR2: 0000000000784ca0 

^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2021-09-13 21:15 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-09-09  3:04 XFS new panic after new merges Murphy Zhou
2021-09-09 12:17 ` Murphy Zhou
2021-09-13 21:15   ` Darrick J. Wong

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.