* cpu_needs_another_gp: unable to handle kernel paging request
From: Alex Lyakas @ 2017-09-06 9:53 UTC
To: paulmck, josh; +Cc: linux-kernel
Hello,
Kernel 3.18.19 hit the following panic[1]. Can you please advise on how to
debug this further, or whether this is a known issue that you recognize?
Thanks,
Alex.
[1]
Sep 5 01:05:02.092499 vsa-0000000f-vc-0 kernel: [1294776.890064] BUG:
unable to handle kernel paging request at fffffffffffffeda
Sep 5 01:05:02.092517 vsa-0000000f-vc-0 kernel: [1294776.890892] IP:
[<ffffffff810d12e5>] cpu_needs_another_gp+0x25/0x80
Sep 5 01:05:02.092517 vsa-0000000f-vc-0 kernel: [1294776.891007] PGD
1c19067 PUD 1c1b067 PMD 0
Sep 5 01:05:02.092518 vsa-0000000f-vc-0 kernel: [1294776.891007] Oops: 0002
[#1] PREEMPT SMP
Sep 5 01:05:02.092520 vsa-0000000f-vc-0 kernel: [1294776.891007] Modules
linked in: xt_nat(E) veth(E) xt_addrtype(E) br_netfilter(E) xfrm_user(E)
xfrm4_tunnel(E) tunnel4(E) ipcomp(E) xfrm_ipcomp(E) esp4(E) ah4(E) 8021q(E)
garp(E) mrp(E) xt_multiport(E) sd_mod(E) bonding(E) ib_iser(OE)
iscsi_tcp(OE) libiscsi_tcp(OE) libiscsi(OE) scsi_transport_iscsi(OE)
dm_zcache(OE) xfs(OE) btrfs(OE) raid456(OE) async_raid6_recov(E)
async_memcpy(E) async_pq(E) async_xor(E) xor(E) async_tx(E) raid6_pq(E)
raid1(OE) md_mod(OE) rdma_ucm(OE) ib_uverbs(OE) mlx4_ib(OE) mlx4_en(OE)
ipt_MASQUERADE(E) nf_nat_masquerade_ipv4(E) iptable_nat(E) nf_nat_ipv4(E)
nf_nat(E) nf_conntrack_ipv4(E) nf_defrag_ipv4(E) xt_conntrack(E)
nf_conntrack(E) ipt_REJECT(E) nf_reject_ipv4(E) xt_CHECKSUM(E)
iptable_mangle(E) xt_tcpudp(E) bridge(E) stp(E) llc(E) vxlan(E)
ip6_udp_tunnel(E) udp_tunnel(E) ptp(E) pps_core(E) ip6table_filter(E)
ip6_tables(E) iptable_filter(E) ip_tables(E) x_tables(E) mlx4_core(OE)
deflate(E) ctr(E) twofish_generic(E) twofish_avx_x86_64(E)
twofish_x86_64_3way(E) twofish_x86_64(E) twofish_common(E)
camellia_generic(E) camellia_aesni_avx2(E) camellia_aesni_avx_x86_64(E)
camellia_x86_64(E) serpent_avx2(E) serpent_avx_x86_64(E)
serpent_sse2_x86_64(E) xts(E) serpent_generic(E) blowfish_generic(E)
blowfish_x86_64(E) blowfish_common(E) cast5_avx_x86_64(E) cast5_generic(E)
cast_common(E) des3_ede_x86_64(E) des_generic(E) cmac(E) xcbc(E) rmd160(E)
isert_scst(OE) crypto_null(E) rdma_cm(OE) af_key(E) iw_cm(OE) xfrm_algo(E)
ib_cm(OE) ib_sa(OE) ib_mad(OE) ib_core(OE) ib_addr(OE) compat(OE)
iscsi_scst(OE) scst_utgt(OE) scst_vdisk(OE) libcrc32c(E) scst(OE)
nls_iso8859_1(E) kvm_intel(E) kvm(E) crct10dif_pclmul(E) crc32_pclmul(E)
ghash_clmulni_intel(E) aesni_intel(E) nfsd(OE) aes_x86_64(E) lrw(E)
gf128mul(E) glue_helper(E) ablk_helper(E) cryptd(E) auth_rpcgss(E)
nfs_acl(E) mac_hid(E) nfs(E) lockd(E) grace(E) sunrpc(E) fscache(E)
dm_multipath(OE) scsi_dh(E) ttm(E) drm_kms_helper(E) serio_raw(E) drm(E)
syscopyarea(E) sysfillrect(E) sysimgblt(E) i2c_piix4(E) i6300esb(E) lp(E)
parport(E) dm_iostat(OE) ata_generic(E) pata_acpi(E) ata_piix(E) libata(E)
psmouse(E) scsi_mod(OE)
Sep 5 01:05:02.092522 vsa-0000000f-vc-0 kernel: [1294776.892666] CPU: 5
PID: 14385 Comm: aws Tainted: G W OE 3.18.19-zadara05 #1
Sep 5 01:05:02.092523 vsa-0000000f-vc-0 kernel: [1294776.892666] Hardware
name: Bochs Bochs, BIOS Bochs 01/01/2011
Sep 5 01:05:02.092524 vsa-0000000f-vc-0 kernel: [1294776.892666] task:
ffff880022da6540 ti: ffff88000a9a4000 task.ti: ffff88000a9a4000
Sep 5 01:05:02.092525 vsa-0000000f-vc-0 kernel: [1294776.892666] RIP:
0010:[<ffffffff810d12e5>] [<ffffffff810d12e5>]
cpu_needs_another_gp+0x25/0x80
Sep 5 01:05:02.092525 vsa-0000000f-vc-0 kernel: [1294776.892666] RSP:
0000:ffff8808bfca3e88 EFLAGS: 00010097
Sep 5 01:05:02.092526 vsa-0000000f-vc-0 kernel: [1294776.892666] RAX:
0000000000000000 RBX: ffffffff81c55c40 RCX: fffffffffffffeda
Sep 5 01:05:02.092526 vsa-0000000f-vc-0 kernel: [1294776.892666] RDX:
fffffffffffffeda RSI: ffff8808bfcad600 RDI: ffffffff81c55c40
Sep 5 01:05:02.092527 vsa-0000000f-vc-0 kernel: [1294776.892666] RBP:
ffff8808bfca3e88 R08: 00000000000021ac R09: 0000000000000100
Sep 5 01:05:02.092527 vsa-0000000f-vc-0 kernel: [1294776.892666] R10:
0000000000000000 R11: 0000000000000005 R12: 0000000000000246
Sep 5 01:05:02.092529 vsa-0000000f-vc-0 kernel: [1294776.892666] R13:
0000000000000009 R14: 0000000000000100 R15: ffff8808bfcad600
Sep 5 01:05:02.092531 vsa-0000000f-vc-0 kernel: [1294776.892666] FS:
00007f158f7fe700(0000) GS:ffff8808bfca0000(0000) knlGS:0000000000000000
Sep 5 01:05:02.092531 vsa-0000000f-vc-0 kernel: [1294776.892666] CS: 0010
DS: 0000 ES: 0000 CR0: 0000000080050033
Sep 5 01:05:02.092533 vsa-0000000f-vc-0 kernel: [1294776.892666] CR2:
fffffffffffffeda CR3: 0000000741e12000 CR4: 00000000003407e0
Sep 5 01:05:02.092554 vsa-0000000f-vc-0 kernel: [1294776.892666] DR0:
0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Sep 5 01:05:02.092566 vsa-0000000f-vc-0 kernel: [1294776.892666] DR3:
0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Sep 5 01:05:02.092568 vsa-0000000f-vc-0 kernel: [1294776.892666] Stack:
Sep 5 01:05:02.092569 vsa-0000000f-vc-0 kernel: [1294776.892666]
ffff8808bfca3ef8 ffffffff810d491c ffff88088e17d838 ffff88088e17d438
Sep 5 01:05:02.092571 vsa-0000000f-vc-0 kernel: [1294776.892666]
ffff880022da6540 ffff88000a9a7fd8 ffff8808bfca3eb8 ffff880799cad868
Sep 5 01:05:02.092572 vsa-0000000f-vc-0 kernel: [1294776.892666]
0000000000000004 0000000000000009 ffffffff81c0f0c8 0000000000000009
Sep 5 01:05:02.092572 vsa-0000000f-vc-0 kernel: [1294776.892666] Call
Trace:
Sep 5 01:05:02.092573 vsa-0000000f-vc-0 kernel: [1294776.892666] <IRQ>
Sep 5 01:05:02.092574 vsa-0000000f-vc-0 kernel: [1294776.892666]
[<ffffffff810d491c>] rcu_process_callbacks+0xcc/0x610
Sep 5 01:05:02.092576 vsa-0000000f-vc-0 kernel: [1294776.892666]
[<ffffffff81077025>] __do_softirq+0xf5/0x320
Sep 5 01:05:02.092578 vsa-0000000f-vc-0 kernel: [1294776.892666]
[<ffffffff81077575>] irq_exit+0x115/0x120
Sep 5 01:05:02.092579 vsa-0000000f-vc-0 kernel: [1294776.892666]
[<ffffffff8171a89a>] smp_apic_timer_interrupt+0x4a/0x60
Sep 5 01:05:02.092579 vsa-0000000f-vc-0 kernel: [1294776.892666]
[<ffffffff8171896d>] apic_timer_interrupt+0x6d/0x80
Sep 5 01:05:02.092580 vsa-0000000f-vc-0 kernel: [1294776.892666] <EOI>
Sep 5 01:05:02.092581 vsa-0000000f-vc-0 kernel: [1294776.892666]
[<ffffffff817179cd>] ? system_call_fastpath+0x16/0x1b
Sep 5 01:05:02.092582 vsa-0000000f-vc-0 kernel: [1294776.892666] Code: 84
00 00 00 00 00 0f 1f 44 00 00 55 48 8b 8f 50 11 00 00 31 c0 48 8b 97 48 11
00 00 48 89 e5 48 39 d1 74 02 5d c3 48 8b 47 10 83 <c0> 01 83 e0 01 48 83 c0
20 8b 44 87 20 85 c0 75 11 48 83 7e 48
Sep 5 01:05:02.092585 vsa-0000000f-vc-0 kernel: [1294776.892666] RIP
[<ffffffff810d12e5>] cpu_needs_another_gp+0x25/0x80
Sep 5 01:05:02.092586 vsa-0000000f-vc-0 kernel: [1294776.892666] RSP
<ffff8808bfca3e88>
Sep 5 01:05:02.092587 vsa-0000000f-vc-0 kernel: [1294776.892666] CR2:
fffffffffffffeda
Sep 5 01:05:02.092588 vsa-0000000f-vc-0 kernel: [1294776.892666] ---[ end
trace 9b3c5d4642bb89b5 ]---
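For readers decoding the trace above, here is a minimal sketch that interprets two of its fields: the page-fault error code from the "Oops: 0002" line and the faulting address from CR2. The input values are copied from the log; the helper names are hypothetical, and the bit meanings are the standard x86 page-fault error-code flags rather than anything stated in this thread.

```python
# Sketch only: helper names are made up for illustration.

def decode_page_fault_error(code: int) -> dict:
    """Split an x86 page-fault error code into its low three flag bits."""
    return {
        "protection_violation": bool(code & 0x1),  # False: page not present
        "write": bool(code & 0x2),                 # False: read access
        "user_mode": bool(code & 0x4),             # False: kernel mode
    }

def as_signed64(value: int) -> int:
    """Reinterpret an unsigned 64-bit value as two's-complement signed."""
    return value - (1 << 64) if value & (1 << 63) else value

flags = decode_page_fault_error(0x0002)        # from "Oops: 0002"
fault_addr = as_signed64(0xfffffffffffffeda)   # from "CR2:"
print(flags)
print(fault_addr)
```

Read this way, the fault is a kernel-mode write to a not-present page, and the faulting address is a small negative number when taken as signed, the same value that appears in RCX and RDX. That narrows down what to look at, but by itself it does not say what went wrong.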
* Re: cpu_needs_another_gp: unable to handle kernel paging request
From: Paul E. McKenney @ 2017-09-06 15:02 UTC
To: Alex Lyakas; +Cc: josh, linux-kernel
On Wed, Sep 06, 2017 at 12:53:42PM +0300, Alex Lyakas wrote:
> Hello,
>
> Kernel 3.18.19 hit the following panic[1]. Can you please advise on
> how to debug this further, or whether this is a known issue that you
> recognize?
>
> Thanks,
> Alex.
New one on me! If this is reproducible, and if you have some other
version where it is not happening, do a bisection. If you have a set
of patches that you carry on top of the stable kernel (for example, to
support some new hardware), try reproducing on hardware that is supported
natively by 3.18.19. Either way, CONFIG_DEBUG_OBJECTS_RCU_HEAD can be
helpful, as can any number of other debugging Kconfig options.
Thanx, Paul
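As a rough aid for acting on the suggestion above, the following sketch checks a kernel .config for debugging options before a test run. CONFIG_DEBUG_OBJECTS_RCU_HEAD is the option named in the message; the other entries in the list are illustrative picks and not a list given in this thread, and the function names are hypothetical.

```python
import re

# CONFIG_DEBUG_OBJECTS_RCU_HEAD comes from the message above.
# The remaining entries are assumptions chosen for illustration.
WANTED = [
    "CONFIG_DEBUG_OBJECTS",
    "CONFIG_DEBUG_OBJECTS_RCU_HEAD",
    "CONFIG_PROVE_RCU",
    "CONFIG_DEBUG_LIST",
]

def enabled_options(config_text: str) -> set:
    """Return the names of options set to y (built in) in a .config."""
    return set(re.findall(r"^(CONFIG_\w+)=y$", config_text, flags=re.MULTILINE))

def missing_options(config_text: str, wanted=WANTED) -> list:
    """List the wanted options that are not enabled, in the given order."""
    on = enabled_options(config_text)
    return [name for name in wanted if name not in on]

# Small inline sample standing in for a real .config file.
sample = """\
CONFIG_DEBUG_OBJECTS=y
# CONFIG_DEBUG_OBJECTS_RCU_HEAD is not set
CONFIG_PROVE_RCU=y
"""
print(missing_options(sample))
```

On a real build, the text would come from the .config used to build the test kernel; where that file lives depends on the setup, so the sample string here only stands in for it.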
* Re: cpu_needs_another_gp: unable to handle kernel paging request
From: Alex Lyakas @ 2017-09-07 7:47 UTC
To: paulmck; +Cc: josh, linux-kernel
Hello Paul,
Thank you for your response.
Can you give us a hint as to what this panic indicates? A random kernel memory
corruption? An improper use of an RCU primitive? A hardware issue?
This happened only once in one of the production systems, and we don't have
a reproduction scenario unfortunately.
Thanks,
Alex.
* Re: cpu_needs_another_gp: unable to handle kernel paging request
From: Paul E. McKenney @ 2017-09-07 16:32 UTC
To: Alex Lyakas; +Cc: josh, linux-kernel
On Thu, Sep 07, 2017 at 10:47:57AM +0300, Alex Lyakas wrote:
> Hello Paul,
>
> Thank you for your response.
>
> Can you give us a hint as to what this panic indicates? A random kernel
> memory corruption? An improper use of an RCU primitive? A hardware
> issue?
Could be any of those three. Running tests with debug Kconfig options
can help locate improper use of RCU primitives, in some cases flagging
the misuse with much higher probability than the misuse has of causing
a visible failure. So again, I strongly encourage you to run tests as
noted in my previous message.
> This happened only once in one of the production systems, and we
> don't have a reproduction scenario unfortunately.
That does make it harder to track down, and again pushes towards running
tests with debug Kconfig options enabled.
Thanx, Paul