From: Tariq Toukan <tariqt@mellanox.com>
To: Thomas Gleixner <tglx@linutronix.de>, linux-kernel@vger.kernel.org
Cc: Maor Gottlieb <maorg@mellanox.com>
Subject: WARNING and PANIC in irq_matrix_free
Date: Tue, 20 Feb 2018 14:07:32 +0200
Message-ID: <e2ec1a14-326a-57c2-56a6-7c2235e5b8b3@mellanox.com>
Hi Thomas,
We started seeing new issues in our net-device daily regression tests.
They are related to patch [1] introduced in kernel 4.15-rc1.
We frequently see a warning in dmesg [2]. The repro is not consistent; we
tried to narrow it down to a smaller run but couldn't.
In addition, sometimes (less frequently) the warning is followed by a
panic [3].
I can share all needed details to help analyze this bug.
If you suspect specific flows, we can narrow the run down to target them.
Regards,
Tariq
[1] 2f75d9e1c905 genirq: Implement bitmap matrix allocator
[2]
[ 8664.868564] WARNING: CPU: 5 PID: 0 at kernel/irq/matrix.c:370
irq_matrix_free+0x30/0xd0
[ 8664.891905] Modules linked in: bonding rdma_ucm ib_ucm rdma_cm iw_cm
ib_ipoib ib_cm ib_uverbs ib_umad mlx5_ib mlx5_core mlxfw mlx4_ib ib_core
mlx4_en mlx4_core devlink macvlan vxlan ip6_udp_tunnel udp_tunnel 8021q
garp mrp stp llc mst_pciconf(OE) nfsv3 nfs fscache netconsole dm_mirror
dm_region_hash dm_log dm_mod dax kvm_intel kvm irqbypass pcspkr
i2c_piix4 nfsd auth_rpcgss nfs_acl lockd grace sunrpc ip_tables
ata_generic cirrus drm_kms_helper syscopyarea sysfillrect pata_acpi
sysimgblt fb_sys_fops ttm drm e1000 serio_raw virtio_console i2c_core
floppy ata_piix [last unloaded: mst_pci]
[ 8664.905117] CPU: 5 PID: 0 Comm: swapper/5 Tainted: G OE
4.15.0-for-upstream-perf-2018-02-08_07-00-42-18 #1
[ 8664.907613] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996),
BIOS Ubuntu-1.8.2-1ubuntu2 04/01/2014
[ 8664.910144] RIP: 0010:irq_matrix_free+0x30/0xd0
[ 8664.912624] RSP: 0018:ffff88023fd43f70 EFLAGS: 00010002
[ 8664.915149] RAX: 0000000000026318 RBX: ffff880157a77ec0 RCX:
0000000000000000
[ 8664.917679] RDX: 0000000000000001 RSI: 0000000000000001 RDI:
ffff880237038400
[ 8664.920244] RBP: ffff880237038400 R08: 00000000e8ba3c69 R09:
0000000000000000
[ 8664.922813] R10: 00000000000003ff R11: 0000000000000ad9 R12:
ffff88023fc40000
[ 8664.925345] R13: 0000000000000000 R14: 0000000000000001 R15:
000000000000002b
[ 8664.927872] FS: 0000000000000000(0000) GS:ffff88023fd40000(0000)
knlGS:0000000000000000
[ 8664.930455] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 8664.932996] CR2: 0000000000f2c030 CR3: 000000000220a000 CR4:
00000000000006e0
[ 8664.935557] DR0: 0000000000000000 DR1: 0000000000000000 DR2:
0000000000000000
[ 8664.938051] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7:
0000000000000400
[ 8664.940541] Call Trace:
[ 8664.942980] <IRQ>
[ 8664.945399] free_moved_vector+0x4e/0x100
[ 8664.947787] smp_irq_move_cleanup_interrupt+0x89/0x9e
[ 8664.950134] irq_move_cleanup_interrupt+0x95/0xa0
[ 8664.952480] </IRQ>
[ 8664.954800] RIP: 0010:native_safe_halt+0x2/0x10
[ 8664.957052] RSP: 0018:ffffc90000ccfee0 EFLAGS: 00000246 ORIG_RAX:
ffffffffffffffdf
[ 8664.959186] RAX: ffffffff818ab6e0 RBX: ffff880236233f00 RCX:
0000000000000000
[ 8664.960499] RDX: 0000000000000000 RSI: 0000000000000000 RDI:
0000000000000000
[ 8664.961774] RBP: 0000000000000005 R08: 0000000000000000 R09:
0000000000000000
[ 8664.963048] R10: 00000000000003ff R11: 0000000000000ad9 R12:
ffff880236233f00
[ 8664.964345] R13: ffff880236233f00 R14: 0000000000000000 R15:
0000000000000000
[ 8664.965579] ? __cpuidle_text_start+0x8/0x8
[ 8664.966808] default_idle+0x18/0xf0
[ 8664.968040] do_idle+0x150/0x1d0
[ 8664.969249] cpu_startup_entry+0x19/0x20
[ 8664.970477] start_secondary+0x133/0x170
[ 8664.971700] secondary_startup_64+0xa5/0xb0
[ 8664.972909] Code: 41 56 41 89 f6 41 55 41 89 d5 89 f2 41 54 4c 8b 24
d5 60 24 18 82 55 48 89 fd 53 48 8b 47 28 44 39 6f 04 77 06 44 3b 6f 08
72 0b <0f> ff 5b 5d 41 5c 41 5d 41 5e c3 49 01 c4 41 80 7c 24 0c 00 74
[ 8664.975420] ---[ end trace 8be4ba51cd83f4bd ]---
[3]
[ 8943.038767] BUG: unable to handle kernel paging request at
000000037a6b561b
[ 8943.040114] IP: free_moved_vector+0x61/0x100
[ 8943.041531] PGD 0 P4D 0
[ 8943.042855] Oops: 0002 [#1] SMP PTI
[ 8943.044128] Modules linked in: bonding rdma_ucm ib_ucm rdma_cm iw_cm
ib_ipoib ib_cm ib_uverbs ib_umad mlx5_ib mlx5_core mlxfw mlx4_ib ib_core
mlx4_en mlx4_core devlink iptable_filter fuse btrfs xor zstd_decompress
zstd_compress xxhash raid6_pq vfat msdos fat binfmt_misc bridge macvlan
vxlan ip6_udp_tunnel udp_tunnel 8021q garp mrp stp llc mst_pciconf(OE)
nfsv3 nfs fscache netconsole dm_mirror dm_region_hash dm_log dm_mod dax
kvm_intel kvm irqbypass pcspkr i2c_piix4 nfsd auth_rpcgss nfs_acl lockd
grace sunrpc ip_tables ata_generic cirrus drm_kms_helper syscopyarea
sysfillrect pata_acpi sysimgblt fb_sys_fops ttm drm e1000 serio_raw
virtio_console i2c_core floppy ata_piix [last unloaded: mst_pci]
[ 8943.052038] CPU: 5 PID: 0 Comm: swapper/5 Tainted: G W OE
4.15.0-for-upstream-perf-2018-02-08_07-00-42-18 #1
[ 8943.053350] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996),
BIOS Ubuntu-1.8.2-1ubuntu2 04/01/2014
[ 8943.054654] RIP: 0010:free_moved_vector+0x61/0x100
[ 8943.055940] RSP: 0018:ffff88023fd43fa0 EFLAGS: 00010007
[ 8943.057233] RAX: 000000037a6b561b RBX: ffff880157a77ec0 RCX:
0000000000000001
[ 8943.058506] RDX: 00000000000155a8 RSI: 00000000000155a8 RDI:
ffff880237038400
[ 8943.059784] RBP: ffff880157a77ec0 R08: 00000000e8ba3c69 R09:
0000000000000000
[ 8943.061051] R10: 0000000000000000 R11: 0000000000000000 R12:
000000007f0c0001
[ 8943.062462] R13: 00000000000155a8 R14: 0000000000000001 R15:
0000000000cc620d
[ 8943.063726] FS: 0000000000000000(0000) GS:ffff88023fd40000(0000)
knlGS:0000000000000000
[ 8943.064993] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 8943.066253] CR2: 000000037a6b561b CR3: 000000010badc000 CR4:
00000000000006e0
[ 8943.067522] DR0: 0000000000000000 DR1: 0000000000000000 DR2:
0000000000000000
[ 8943.068771] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7:
0000000000000400
[ 8943.070029] Call Trace:
[ 8943.071273] <IRQ>
[ 8943.072503] smp_irq_move_cleanup_interrupt+0x89/0x9e
[ 8943.073794] irq_move_cleanup_interrupt+0x95/0xa0
[ 8943.075048] </IRQ>
[ 8943.076288] RIP: 0010:native_safe_halt+0x2/0x10
[ 8943.077530] RSP: 0018:ffffc90000ccfee0 EFLAGS: 00000246 ORIG_RAX:
ffffffffffffffdf
[ 8943.078795] RAX: ffffffff818ab6e0 RBX: ffff880236233f00 RCX:
0000000000000000
[ 8943.080077] RDX: 0000000000000000 RSI: 0000000000000000 RDI:
0000000000000000
[ 8943.081435] RBP: 0000000000000005 R08: 00000000e8ba3c69 R09:
0000000000000000
[ 8943.082683] R10: 0000000000000000 R11: 0000000000000000 R12:
ffff880236233f00
[ 8943.083932] R13: ffff880236233f00 R14: 0000000000000000 R15:
0000000000000000
[ 8943.085185] ? __cpuidle_text_start+0x8/0x8
[ 8943.086438] default_idle+0x18/0xf0
[ 8943.087694] do_idle+0x150/0x1d0
[ 8943.088921] cpu_startup_entry+0x19/0x20
[ 8943.090163] start_secondary+0x133/0x170
[ 8943.091402] secondary_startup_64+0xa5/0xb0
[ 8943.092659] Code: 44 00 00 48 8b 3d c8 f7 9f 01 44 89 f1 44 89 e2 44
89 ee e8 e2 05 0b 00 48 c7 c0 20 50 01 00 4a 8d 04 e0 4a 03 04 ed 60 24
18 82 <48> c7 00 00 00 00 00 48 8b 45 28 48 85 c0 74 20 48 8b 55 20 48
[ 8943.095371] RIP: free_moved_vector+0x61/0x100 RSP: ffff88023fd43fa0
[ 8943.096685] CR2: 000000037a6b561b
[ 8943.098120] ---[ end trace 8be4ba51cd83f4c0 ]---
[ 8943.099387] Kernel panic - not syncing: Fatal exception in interrupt
[ 8943.101170] Kernel Offset: disabled
[ 8943.102410] ---[ end Kernel panic - not syncing: Fatal exception in
interrupt