XFS kernel errors bringing up OSD

* XFS kernel errors bringing up OSD
@ 2017-09-12 13:25 Wyllys Ingersoll
  2017-09-12 22:54 ` Brad Hubbard
  2017-09-13 15:01 ` Jeff Layton
  0 siblings, 2 replies; 7+ messages in thread
From: Wyllys Ingersoll @ 2017-09-12 13:25 UTC
  To: Ceph Development

Ceph 10.2.7
Kernel 4.12.10

We are seeing frequent kernel errors that cause the XFS-based OSD
processes to crash and restart.  Has anyone seen or reported something
like this before?  It may be due to bad or failing disks, but it's hard
to tell.

[Tue Sep 12 09:18:32 2017] BUG: unable to handle kernel NULL pointer
dereference at 0000000000000090
[Tue Sep 12 09:18:32 2017] IP: xfs_da3_node_read+0x2e/0xb0 [xfs]
[Tue Sep 12 09:18:32 2017] PGD 0
[Tue Sep 12 09:18:32 2017] P4D 0

[Tue Sep 12 09:18:32 2017] Oops: 0000 [#23] SMP
[Tue Sep 12 09:18:32 2017] Modules linked in: binfmt_misc xfs
libcrc32c dm_crypt intel_rapl x86_pkg_temp_thermal ipmi_ssif
intel_powerclamp coretemp kvm_intel kvm irqbypass crct10dif_pclmul
crc32_pclmul ghash_clmulni_intel pcbc aesni_intel aes_x86_64
input_leds crypto_simd glue_helper cryptd shpchp intel_cstate
intel_rapl_perf lpc_ich mei_me mei mac_hid ipmi_si ipmi_devintf
ipmi_msghandler acpi_power_meter acpi_pad 8021q garp mrp stp llc
bonding autofs4 btrfs xor raid6_pq ses enclosure mlx4_en hid_generic
ttm usbhid hid drm_kms_helper syscopyarea igb sysfillrect e1000e dca
sysimgblt fb_sys_fops mlx4_core mpt3sas ptp ahci devlink drm
raid_class pps_core libahci scsi_transport_sas i2c_algo_bit
[Tue Sep 12 09:18:32 2017] CPU: 8 PID: 40382 Comm: tp_fstore_op
Tainted: G      D         4.12.10-041210-generic #201708300614
[Tue Sep 12 09:18:32 2017] Hardware name: AIC SB303-LB/LIBRA, BIOS
LIBKV070 08/03/2016
[Tue Sep 12 09:18:32 2017] task: ffff8f03b4220000 task.stack: ffff9a6a75ff0000
[Tue Sep 12 09:18:32 2017] RIP: 0010:xfs_da3_node_read+0x2e/0xb0 [xfs]
[Tue Sep 12 09:18:32 2017] RSP: 0018:ffff9a6a75ff3d30 EFLAGS: 00010282
[Tue Sep 12 09:18:32 2017] RAX: 0000000000000000 RBX: ffff8f08b8ce9d98
RCX: 0000000000000001
[Tue Sep 12 09:18:32 2017] RDX: ffffffffc0a37700 RSI: 0000000000000000
RDI: ffff9a6a75ff3cd8
[Tue Sep 12 09:18:32 2017] RBP: ffff9a6a75ff3d48 R08: 00000000ffffffff
R09: 0000000000000001
[Tue Sep 12 09:18:32 2017] R10: 0000000000000001 R11: 0000000000000001
R12: ffff9a6a75ff3d78
[Tue Sep 12 09:18:32 2017] R13: 0000000000000005 R14: 00000000894e93b5
R15: ffff8f1536502010
[Tue Sep 12 09:18:32 2017] FS:  00007f82c9b70700(0000)
GS:ffff8f26ffc00000(0000) knlGS:0000000000000000
[Tue Sep 12 09:18:32 2017] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[Tue Sep 12 09:18:32 2017] CR2: 0000000000000090 CR3: 00000017cf710000
CR4: 00000000001406e0
[Tue Sep 12 09:18:32 2017] Call Trace:
[Tue Sep 12 09:18:32 2017]  xfs_attr3_node_inactive+0xd0/0x230 [xfs]
[Tue Sep 12 09:18:32 2017]  xfs_attr_inactive+0x267/0x280 [xfs]
[Tue Sep 12 09:18:32 2017]  xfs_inactive+0xe2/0x110 [xfs]
[Tue Sep 12 09:18:32 2017]  xfs_fs_destroy_inode+0x9f/0x200 [xfs]
[Tue Sep 12 09:18:32 2017]  destroy_inode+0x3b/0x60
[Tue Sep 12 09:18:32 2017]  evict+0x136/0x1a0
[Tue Sep 12 09:18:32 2017]  iput+0x14c/0x220
[Tue Sep 12 09:18:32 2017]  do_unlinkat+0x1a7/0x310
[Tue Sep 12 09:18:32 2017]  SyS_unlink+0x16/0x20
[Tue Sep 12 09:18:32 2017]  entry_SYSCALL_64_fastpath+0x1e/0xa9
[Tue Sep 12 09:18:32 2017] RIP: 0033:0x7f82d7753ea7
[Tue Sep 12 09:18:32 2017] RSP: 002b:00007f82c9b6d2e8 EFLAGS: 00000246
ORIG_RAX: 0000000000000057
[Tue Sep 12 09:18:32 2017] RAX: ffffffffffffffda RBX: 00005606b600e000
RCX: 00007f82d7753ea7
[Tue Sep 12 09:18:32 2017] RDX: 00007f82c9b6d2a0 RSI: 0000000000000000
RDI: 00005606bfd32a80
[Tue Sep 12 09:18:32 2017] RBP: 000056033335ab20 R08: 0000000000450000
R09: 0000000000000001
[Tue Sep 12 09:18:32 2017] R10: 0000000000000000 R11: 0000000000000246
R12: 00007f82da606c60
[Tue Sep 12 09:18:32 2017] R13: 00005606812ebd60 R14: 00000000040ffda5
R15: 00005606dfb64a60
[Tue Sep 12 09:18:32 2017] Code: 00 00 55 48 89 e5 41 54 53 4d 89 c4
48 89 fb 48 83 ec 08 68 00 77 a3 c0 e8 e0 fe ff ff 85 c0 5a 75 46 48
85 db 74 41 49 8b 34 24 <48> 8b 96 90 00 00 00 0f b7 52 08 66 c1 c2 08
66 81 fa be 3e 74
[Tue Sep 12 09:18:32 2017] RIP: xfs_da3_node_read+0x2e/0xb0 [xfs] RSP:
ffff9a6a75ff3d30
[Tue Sep 12 09:18:32 2017] CR2: 0000000000000090
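
The "[#23]" counter in the trace above suggests this is not the first oops
since boot. One way to check whether the repeated oopses all fault in the
same function is to tally them from saved dmesg output. The following is a
minimal Python sketch, not part of any Ceph or kernel tooling; the name
summarize_oopses and both regular expressions are illustrative assumptions
that fit the 4.12-style "IP:" and "Oops:" lines shown above and may need
adjusting for other kernel versions.

```python
import re
from collections import Counter

# Assumed format: "IP: symbol+0xoff/0xsize" on the line after "BUG:".
# The lookbehind skips the later "RIP:" lines so each oops counts once.
IP_RE = re.compile(r"(?<!R)IP: ([A-Za-z_][\w.]*)\+0x[0-9a-f]+/0x[0-9a-f]+")

# Assumed format: "Oops: 0000 [#23] SMP"; the bracketed number is the
# kernel's running count of oopses since boot.
OOPS_RE = re.compile(r"Oops: [0-9a-f]{4} \[#(\d+)\]")


def summarize_oopses(log_text):
    """Return (highest oops counter seen, Counter of faulting symbols)."""
    symbols = Counter(IP_RE.findall(log_text))
    counters = [int(n) for n in OOPS_RE.findall(log_text)]
    return (max(counters) if counters else 0), symbols


# Three lines copied from the trace above, used as sample input.
sample = """\
[Tue Sep 12 09:18:32 2017] IP: xfs_da3_node_read+0x2e/0xb0 [xfs]
[Tue Sep 12 09:18:32 2017] Oops: 0000 [#23] SMP
[Tue Sep 12 09:18:32 2017] RIP: 0010:xfs_da3_node_read+0x2e/0xb0 [xfs]
"""

highest, symbols = summarize_oopses(sample)
print(highest, dict(symbols))
```

Run against a full log, a single dominant symbol would point toward one
code path (here the attr-node teardown on unlink), while a spread of
unrelated symbols would be more suggestive of memory or hardware trouble.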
