* [help] host kernel panic in kvm's wakeup_handler()
@ 2017-05-24  3:57 Longpeng (Mike)
  2017-05-24  4:34 ` Alex Williamson
  0 siblings, 1 reply; 6+ messages in thread
From: Longpeng (Mike) @ 2017-05-24  3:57 UTC
  To: kvm, Paolo Bonzini, Radim Krčmář, alex.williamson
  Cc: Huangweidong (C), Gonglei, wangxin (U)

Hi guys,

We repeatedly powered on and off 20 VMs (4 of them with vfio passthrough NICs)
concurrently, and eventually hit a host panic:

[152878.870508] general protection fault: 0000 [#1] SMP
[152878.878710] collected_len = 1048576, LOG_BUF_LEN_LOCAL = 1048576
[152878.886921] kbox current status: maintain, do not flush regions to devices.
[152878.893952] kbox: notify die begin
[152878.897453] kbox: no notify die func register. no need to notify
[152878.903533] do nothing after die!
[152878.906929] Modules linked in: ib_uverbs(OVE) vhost_scsi(OE)
target_core_pscsi target_core_file target_core_iblock target_core_mod
guest_kbox_ram(O) kbox_pci(OVE) igb(OVE) mlx4_ib(OVE) ib_sa(OVE) ib_mad(OVE)
ib_core(OVE) ib_addr(OVE) ib_netlink(OVE) mlx4_en(OVE) mlx4_core(OVE)
compat(OVE) vfio_pci vfio_iommu_type1 vfio(OVE) prio(O) nat(O) vport_vxlan(O)
openvswitch(O) nf_defrag_ipv6 gre libcrc32c ixgbe(O) ext3 mbcache jbd kbox(O)
pmcint(O) signo_catch(O) dm_mod vxlan ip6_udp_tunnel udp_tunnel sd_mod
crc_t10dif crct10dif_generic sg ipmi_devintf iTCO_wdt iTCO_vendor_support
kvm_intel(O) kvm(O) coretemp crct10dif_pclmul crct10dif_common crc32_pclmul
crc32c_intel ghash_clmulni_intel aesni_intel glue_helper lrw gf128mul
ablk_helper cryptd mpt2sas ahci i2c_algo_bit ptp libahci raid_class pps_core
i2c_i801 libata scsi_transport_sas dca lpc_ich i2c_core mfd_core shpchp ipmi_si
ipmi_msghandler nf_conntrack_ipv4 nf_defrag_ipv4 nf_conntrack vhost_net(O)
tun(O) vhost(O) macvtap macvlan irqbypass ip_tables [last unloaded: igb]
[152878.998665] CPU: 10 PID: 0 Comm: swapper/10 Tainted: G        W  OE
----V-------   3.10.0-327.49.58.45_12.x86_64 #1
[152879.009245] Hardware name: HUAWEI TECHNOLOGIES CO.,LTD. CH80GPUB8/CH80GPUB8,
BIOS GPUBV201 06/18/2015
[152879.018881] task: ffff881fd2ce7300 ti: ffff881fd2d10000 task.ti:
ffff881fd2d10000
[152879.026803] RIP: 0010:[<ffffffffa1767ec1>]  [<ffffffffa1767ec1>]
wakeup_handler+0x71/0xb0 [kvm_intel]
[152879.036460] RSP: 0018:ffff883fff003f70  EFLAGS: 00010083
[152879.042024] RAX: dead000000100100 RBX: dead0000001000b0 RCX: ffff883fff0176f0
[152879.049595] RDX: ffff883fff000000 RSI: 0000000000000082 RDI: ffff881c9c7f0000
[152879.057139] RBP: ffff883fff003f90 R08: ffff881e522dfd90 R09: 0000000000000018
[152879.061675] mlx4_en: eth1: Port:2: removing fa:29:3e:2e:68:80
[152879.070720] R10: 000000000000039f R11: ffff881cfbf278f6 R12: 00000000000176e0
[152879.078282] R13: 000000000000000a R14: 00000000000176f0 R15: ffffffff81a13538
[152879.085845] FS:  0000000000000000(0000) GS:ffff883fff000000(0000)
knlGS:0000000000000000
[152879.094361] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[152879.100378] CR2: 0000000000605168 CR3: 000000000195e000 CR4: 00000000003427e0
[152879.107921] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[152879.115478] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[152879.123019] Stack:
[152879.125313]  0000000000000000 0000000000000004 00008b2da04a3938 0000000000000004
[152879.133227]  ffff883fff003fa8 ffffffff81016a28 ffffe8ffff800500 ffff881fd2d13e78
[152879.141121]  ffffffff81655cdd ffff881fd2d13dc8 <EOI>  ffff881fd2d13e78
00000000000003e8
[152879.149702]  ffff881cfbf278f6 000000000000039f 0000000000000018 00000000000003e8
[152879.157647]  00008b2da04f9b8e 0000000000000018 0000000225c17d03 ffff881fd2d13fd8
[152879.165597]  00008b2da04f9b8e ffffffffffffff0e ffffffff814e2b72 0000000000000010
[152879.173560]  0000000000000206 ffff881fd2d13e50 0000000000000018 ffffe8ffff800500
[152879.181401]  0000000000000004 0000000000000004 ffffffff81a133c0 0000000000000000
[152879.189297]  ffff881fd2d13eb8 ffffffff814e2cb9 0000000a00000000 ffff881fd2d10000
[152879.197183]  ffffffff81a7de20 ffff881fd2d10000 ffff881fd2d10000 0000000000000000
[152879.205069]  ffff881fd2d13ec8 ffffffff8101e68e ffff881fd2d13f20 ffffffff810d7535
[152879.212968]  ffff881fd2d13fd8 ffff881fd2d10000 a960cc5a1933ed1c ef90c751bae26ef0
[152879.220892]  ffff881fd2d13f30 0000000000000000 0000000000000000 0000000000000000
[152879.228792]  0000000000000000 ffff881fd2d13f48 ffffffff81047c1a ef90c751bae26ef0
[152879.236675]  f26ae3384c8900f4 0000000000000000 0000000000000000 0000000000000000
[152879.244597]  0000000000000000 0000000000000000 0000000000000000 0000000000000000
[152879.252505]  0000000000000000 0000000000000000 0000000000000000 0000000000000000
[152879.260490]  0000000000000000 0000000000000000 0000000000000000 0000000000000000
[152879.268393]  0000000000000000 0000000000000000 0000000000000000 ffffffffffffffff
[152879.276269]  0000000000000000 0000000000000010 0000000000000202 ffff881fd2d13f58
[152879.284171]  0000000000000018
[152879.287489] Call Trace:
[152879.290205]  <IRQ>
[152879.292219]  [<ffffffff81016a28>] smp_kvm_posted_intr_wakeup_ipi+0x48/0x60
[152879.299762]  [<ffffffff81655cdd>] kvm_posted_intr_wakeup_ipi+0x6d/0x80
[152879.306567]  <EOI>
[152879.308598]  [<ffffffff814e2b72>] ? cpuidle_enter_state+0x52/0xc0
[152879.315359]  [<ffffffff814e2cb9>] cpuidle_idle_call+0xd9/0x210
[152879.321481]  [<ffffffff8101e68e>] arch_cpu_idle+0xe/0x30
[152879.327058]  [<ffffffff810d7535>] cpu_startup_entry+0x245/0x290
[152879.333224]  [<ffffffff81047c1a>] start_secondary+0x1ba/0x230
[152879.339222] Code: 4a 8d 0c 32 48 39 c8 48 8d 58 b0 75 1e eb 3b 0f 1f 00 4a
8b 14 ed a0 14 a7 81 48 8b 43 50 49 8d 0c 16 48 8d 58 b0 48 39 c8 74 1f <48> 8b
83 e0 30 00 00 a8 01 74 dc 48 89 df e8 1c 6d e5 fe eb d2
[152879.360254] RIP  [<ffffffffa1767ec1>] wakeup_handler+0x71/0xb0 [kvm_intel]
[152879.367436]  RSP <ffff883fff003f70>
[152879.371668] ---[ end trace 382c2b1701889417 ]---

There's no vmcore for some reason, but we disassembled wakeup_handler():
    ......
    1e92:       4a 8b 04 32             mov    (%rdx,%r14,1),%rax <-- *Here*
    1e96:       4a 8d 0c 32             lea    (%rdx,%r14,1),%rcx
    1e9a:       48 39 c8                cmp    %rcx,%rax
    1e9d:       48 8d 58 b0             lea    -0x50(%rax),%rbx
    1ea1:       75 1e                   jne    1ec1 <wakeup_handler+0x71>
    1ea3:       eb 3b                   jmp    1ee0 <wakeup_handler+0x90>
    1ea5:       0f 1f 00                nopl   (%rax)
    1ea8:       4a 8b 14 ed 00 00 00    mov    0x0(,%r13,8),%rdx
    1eaf:       00
    1eb0:       48 8b 43 50             mov    0x50(%rbx),%rax
    1eb4:       49 8d 0c 16             lea    (%r14,%rdx,1),%rcx
    1eb8:       48 8d 58 b0             lea    -0x50(%rax),%rbx
    1ebc:       48 39 c8                cmp    %rcx,%rax
    1ebf:       74 1f                   je     1ee0 <wakeup_handler+0x90>
    1ec1:       48 8b 83 e0 30 00 00    mov    0x30e0(%rbx),%rax <-- *Here*
    ......
It crashed at *1ec1*, and %rax got a bad value (0xdead000000100100) at *1e92*.
It seems the *blocked_vcpu_on_cpu* list is corrupted, but kvm only accesses this
list in pre_block/post_block/wakeup_handler, and these three functions look fine.

kvm version is 4.4-stable.

Do you have any ideas? Any suggestion would be greatly appreciated, thanks!

-- 
Regards,
Longpeng(Mike)


* Re: [help] host kernel panic in kvm's wakeup_handler()
  2017-05-24  3:57 [help] host kernel panic in kvm's wakeup_handler() Longpeng (Mike)
@ 2017-05-24  4:34 ` Alex Williamson
  2017-05-24  5:04   ` Longpeng (Mike)
  0 siblings, 1 reply; 6+ messages in thread
From: Alex Williamson @ 2017-05-24  4:34 UTC
  To: Longpeng (Mike)
  Cc: kvm, Paolo Bonzini, Radim Krčmář, Huangweidong (C),
	Gonglei, wangxin (U)

On Wed, 24 May 2017 11:57:34 +0800
"Longpeng (Mike)" <longpeng2@huawei.com> wrote:

> Hi guys,
> 
> We repeatedly powered on and off 20 VMs (4 of them with vfio passthrough NICs)
> concurrently, and eventually hit a host panic:
> 
> It crashed at *1ec1*, and %rax got a bad value (0xdead000000100100) at *1e92*.
> It seems the *blocked_vcpu_on_cpu* list is corrupted, but kvm only accesses this
> list in pre_block/post_block/wakeup_handler, and these three functions look fine.
> 
> kvm version is 4.4-stable.
> 
> Do you have any ideas? Any suggestion would be greatly appreciated, thanks!
> 

Is this only seen with posted interrupt support enabled?  Booting with
intremap=nopost on the kernel commandline would disable it.  Thanks,

Alex


* Re: [help] host kernel panic in kvm's wakeup_handler()
  2017-05-24  4:34 ` Alex Williamson
@ 2017-05-24  5:04   ` Longpeng (Mike)
  2017-05-26 10:40     ` Paolo Bonzini
  0 siblings, 1 reply; 6+ messages in thread
From: Longpeng (Mike) @ 2017-05-24  5:04 UTC
  To: Alex Williamson
  Cc: kvm, Paolo Bonzini, Radim Krčmář, Huangweidong (C),
	Gonglei, wangxin (U)



On 2017/5/24 12:34, Alex Williamson wrote:

> On Wed, 24 May 2017 11:57:34 +0800
> "Longpeng (Mike)" <longpeng2@huawei.com> wrote:
> 
>> Hi guys,
>>
>> We repeatedly powered on and off 20 VMs (4 of them with vfio passthrough NICs)
>> concurrently, and eventually hit a host panic:
>>
>> It crashed at *1ec1*, and %rax got a bad value (0xdead000000100100) at *1e92*.
>> It seems the *blocked_vcpu_on_cpu* list is corrupted, but kvm only accesses this
>> list in pre_block/post_block/wakeup_handler, and these three functions look fine.
>>
>> kvm version is 4.4-stable.
>>
>> Do you have any ideas? Any suggestion would be greatly appreciated, thanks!
>>
> 
> Is this only seen with posted interrupt support enabled?  Booting with
> intremap=nopost on the kernel commandline would disable it.  Thanks,
> 
> Alex
> 


Hi Alex,

We tested with PI support enabled, but we are not yet sure whether it only
occurs with PI enabled.

*lscpu:*
Architecture:          x86_64
CPU op-mode(s):        32-bit, 64-bit
Byte Order:            Little Endian
CPU(s):                40
On-line CPU(s) list:   0-39
Thread(s) per core:    2
Core(s) per socket:    10
Socket(s):             2
NUMA node(s):          2
Vendor ID:             GenuineIntel
CPU family:            6
Model:                 79
Model name:            Intel(R) Xeon(R) CPU E5-2618L v4 @ 2.20GHz
Stepping:              1
CPU MHz:               1452.085
BogoMIPS:              4405.88
Virtualization:        VT-x
L1d cache:             32K
L1i cache:             32K
L2 cache:              256K
L3 cache:              25600K
NUMA node0 CPU(s):     0-9,20-29
NUMA node1 CPU(s):     10-19,30-39

We will try to reproduce the problem again. Thanks :)



-- 
Regards,
Longpeng(Mike)


* Re: [help] host kernel panic in kvm's wakeup_handler()
  2017-05-24  5:04   ` Longpeng (Mike)
@ 2017-05-26 10:40     ` Paolo Bonzini
  2017-05-26 10:53       ` Longpeng (Mike)
  2017-06-05 16:21       ` Longpeng (Mike)
  0 siblings, 2 replies; 6+ messages in thread
From: Paolo Bonzini @ 2017-05-26 10:40 UTC
  To: Longpeng (Mike), Alex Williamson
  Cc: kvm, Radim Krčmář, Huangweidong (C), Gonglei, wangxin (U)



On 24/05/2017 07:04, Longpeng (Mike) wrote:
>>> It crashed at *1ec1*, and %rax got a bad value (0xdead000000100100) at *1e92*.
>>> It seems the *blocked_vcpu_on_cpu* list is corrupted, but kvm only accesses this
>>> list in pre_block/post_block/wakeup_handler, and these three functions look fine.
>>>
>>> kvm version is 4.4-stable.
>>>
>>> Do you have any ideas? Any suggestion would be greatly appreciated, thanks!
>>>
>> Is this only seen with posted interrupt support enabled?  Booting with
>> intremap=nopost on the kernel commandline would disable it.  Thanks,
> 
> We tested with PI support enabled, but we are not yet sure whether it only
> occurs with PI enabled.

This code should not run at all with PI disabled, since the handler is
only reachable through an IRTE.

As you said, the list manipulation in those functions is fairly simple.
If you have a reproducer, you can try running it with CONFIG_DEBUG_LIST
and see what you get.

Thanks,

Paolo


* Re: [help] host kernel panic in kvm's wakeup_handler()
  2017-05-26 10:40     ` Paolo Bonzini
@ 2017-05-26 10:53       ` Longpeng (Mike)
  2017-06-05 16:21       ` Longpeng (Mike)
  1 sibling, 0 replies; 6+ messages in thread
From: Longpeng (Mike) @ 2017-05-26 10:53 UTC
  To: Paolo Bonzini
  Cc: Alex Williamson, kvm, Radim Krčmář,
	Huangweidong (C), Gonglei, wangxin (U)



On 2017/5/26 18:40, Paolo Bonzini wrote:

> 
> 
> On 24/05/2017 07:04, Longpeng (Mike) wrote:
>>>> It crashed at *1ec1*, and %rax got a bad value (0xdead000000100100) at *1e92*.
>>>> It seems the *blocked_vcpu_on_cpu* list is corrupted, but kvm only accesses this
>>>> list in pre_block/post_block/wakeup_handler, and these three functions look fine.
>>>>
>>>> kvm version is 4.4-stable.
>>>>
>>>> Do you have any ideas? Any suggestion would be greatly appreciated, thanks!
>>>>
>>> Is this only seen with posted interrupt support enabled?  Booting with
>>> intremap=nopost on the kernel commandline would disable it.  Thanks,
>>
>> We tested with PI support enabled, but we are not yet sure whether it only
>> occurs with PI enabled.
> 
> This code should not run at all with PI disabled, since the handler is
> only reachable through an IRTE.
> 
> As you said, the list manipulation in those functions is fairly simple.
> If you have a reproducer, you can try running it with CONFIG_DEBUG_LIST
> and see what you get.
> 


OK.

We have been running the test for a long time, but it hasn't panicked yet.

Thanks :)

> Thanks,
> 
> Paolo
> 


-- 
Regards,
Longpeng(Mike)


* Re: [help] host kernel panic in kvm's wakeup_handler()
  2017-05-26 10:40     ` Paolo Bonzini
  2017-05-26 10:53       ` Longpeng (Mike)
@ 2017-06-05 16:21       ` Longpeng (Mike)
  1 sibling, 0 replies; 6+ messages in thread
From: Longpeng (Mike) @ 2017-06-05 16:21 UTC
  To: Paolo Bonzini
  Cc: Alex Williamson, kvm, Radim Krčmář,
	Huangweidong (C), Gonglei, wangxin (U)

Hi Paolo,

We have a reproducer now. With list debugging enabled, it reports that the
*blocked_vcpu_on_cpu* list is corrupted and that an entry is added twice.

Do you have any suggestion?

[231298.241923] WARNING: at lib/list_debug.c:36 __list_add+0x8a/0xc0()
[231298.241925] list_add double add: new=ffff881b8bc48050,
prev=ffff881b8bc48050, next=ffff881fffa576f0.
[231298.241926] Modules linked in: guest_kbox_ram(O) igb(OVE) mlx4_ib(OVE)
ib_sa(OVE) ib_mad(OVE) mlx4_en(OVE) mlx4_core(OVE) ib_uverbs(OVE) vhost_scsi(OE)
target_core_pscsi target_core_file target_core_iblock target_core_mod dm_mod
kbox_pci(OVE) ib_core(OVE) ib_addr(OVE) ib_netlink(OVE) compat(OVE) ixgbe(O)
ext3 mbcache jbd signo_catch(O) bum(O) ip_set nfnetlink prio(O) nat(O)
vport_vxlan(O) openvswitch(O) nf_defrag_ipv6 gre libcrc32c kbox(O) pmcint(O)
vxlan ip6_udp_tunnel udp_tunnel sd_mod crc_t10dif crct10dif_generic sg
ipmi_devintf kvm_intel(O) kvm(O) coretemp crct10dif_pclmul crct10dif_common ahci
libahci mpt2sas i2c_i801 i2c_algo_bit libata dca i2c_core raid_class ptp
scsi_transport_sas pps_core ipmi_si ipmi_msghandler nf_conntrack_ipv4
nf_defrag_ipv4 nf_conntrack vhost_net(O) tun(O) vhost(O) macvtap
[231298.241986]  macvlan vfio_pci irqbypass vfio_iommu_type1 vfio ip_tables
[last unloaded: guest_kbox_ram]
[231298.241994] CPU: 1 PID: 12431 Comm: CPU 0/KVM Tainted: G        W  OE
----V-------   3.10.0-327.49.58.52_13.x86_64 #1
[231298.241996] Hardware name: HUAWEI TECHNOLOGIES CO.,LTD. CH80GPUB8/CH80GPUB8,
BIOS GPUBV201 06/18/2015
[231298.241997]  ffff881fa372fc60 00000000054b553c ffff881fa372fc18 ffffffff81644aaf
[231298.242002]  ffff881fa372fc50 ffffffff8107b1c0 ffff881b8bc48050 ffff881fffa576f0
[231298.242006]  ffff881b8bc48050 000000000000a022 000000000000001b ffff881fa372fcb8
[231298.242011]  ffffffff8107b25c ffffffff818a9ce8 ffff881b00000030 ffff881fa372fcc8
[231298.242015]  ffff881fa372fc88 00000000054b553c 0000000000000001 ffffffff8107b205
[231298.242020]  ffffffff81a38960 ffff881b8bc48050 ffff881b8bc48050 ffff881fffa576f0
[231298.242025]  ffff881fa372fce0 ffffffff8131a41a ffff881b8bc48000 00000000000176e0
[231298.242029]  0000000000000292 ffff881fa372fdd0 ffffffffa10de6d0 ffff881b8bc48050
[231298.242036]  ffff881fa372ffd8 ffff881bb4c70000 ffff88176dfd8048 0000000000000001
[231298.242043]  ffff881fa372fe18 ffffffff81656a31 ffffffffa10d9360 ffffffffa10fc140
[231298.242048]  ffff881c23580100 0000000000000000 0000000000000000 ffff881b8bc48000
[231298.242052]  ffff881fa372fd88 ffffffffa10dad65 0000000000000000 ffffffffa10d9360
[231298.242057]  00000000054b553c ffff881c23580200 0000000000000000 ffff881c23580000
[231298.242061]  ffff881b8bc48000 ffff881fa372fdb8 ffff881b8bc48000 ffff881fa372ffd8
[231298.242066]  ffff881bb4c70000 ffff88176dfd8048 0000000000000001 ffff881fa372fe18
[231298.242070]  ffffffffa05ed1e8 ffffffee7ffbfaff 00000000054b553c ffff881b8bc48000
[231298.242075]  ffff883fb857b600 0000000000000000 ffff881ae737dc00 ffff881bb4c70000
[231298.242079]  ffff881fa372feb0 ffffffffa05d4b31 0000000000000000 0000000000008000
[231298.242084]  ffff881fa372fe70 ffffffff8112f643 000000000000ffff ffff881ae737dc38
[231298.242088]  ffffffffa05d4880 0000000000000000 0000000000000000 000000000000ae80
[231298.242093]  ffff881ae737dc00 00000000054b553c ffff881ae737dc00 ffff883fd2a0a500
[231298.242097]  0000000000000000 0000000000000000 0000000000000001 ffff881fa372ff28
[231298.242102]  ffffffff811fd9d5 000000000000ffff ffff881ae737dc38 0000000000000000
[231298.242106]  0000000000000000 000000000000ae80 0000000000000018 ffff881ae737dc00
[231298.242111] Call Trace:
[231298.242115]  [<ffffffff81644aaf>] dump_stack+0x19/0x1b
[231298.242118]  [<ffffffff8107b1c0>] warn_slowpath_common+0x70/0xb0
[231298.242122]  [<ffffffff8107b25c>] warn_slowpath_fmt+0x5c/0x80
[231298.242126]  [<ffffffff8107b205>] ? warn_slowpath_fmt+0x5/0x80
[231298.242130]  [<ffffffff8131a41a>] __list_add+0x8a/0xc0
[231298.242136]  [<ffffffffa10de6d0>] vmx_pre_block+0xe0/0x220 [kvm_intel]
[231298.242140]  [<ffffffff81656a31>] ? ftrace_call+0x5/0x2f
[231298.242145]  [<ffffffffa10d9360>] ? vmx_invpcid_supported+0x20/0x20 [kvm_intel]
[231298.242151]  [<ffffffffa10dad65>] ? vmx_sync_pir_to_irr+0x5/0x30 [kvm_intel]
[231298.242156]  [<ffffffffa10d9360>] ? vmx_invpcid_supported+0x20/0x20 [kvm_intel]
[231298.242167]  [<ffffffffa05ed1e8>] kvm_arch_vcpu_ioctl_run+0x178/0x440 [kvm]
[231298.242176]  [<ffffffffa05d4b31>] kvm_vcpu_ioctl+0x2b1/0x640 [kvm]
[231298.242180]  [<ffffffff8112f643>] ? ftrace_ops_list_func+0x83/0x110
[231298.242189]  [<ffffffffa05d4880>] ? vcpu_put+0x30/0x30 [kvm]
[231298.242193]  [<ffffffff811fd9d5>] do_vfs_ioctl+0x2e5/0x4c0
[231298.242197]  [<ffffffff811fdc51>] SyS_ioctl+0xa1/0xc0
[231298.242201]  [<ffffffff81654e09>] system_call_fastpath+0x16/0x1b


[231298.245626] WARNING: at lib/list_debug.c:33 __list_add+0xac/0xc0()
[231298.245628] list_add corruption. prev->next should be next
(ffff881fffa576f0), but was dead000000100100. (prev=ffff881b8bc48050).
[231298.245629] Modules linked in: guest_kbox_ram(O) igb(OVE) mlx4_ib(OVE)
ib_sa(OVE) ib_mad(OVE) mlx4_en(OVE) mlx4_core(OVE) ib_uverbs(OVE) vhost_scsi(OE)
target_core_pscsi target_core_file target_core_iblock target_core_mod dm_mod
kbox_pci(OVE) ib_core(OVE) ib_addr(OVE) ib_netlink(OVE) compat(OVE) ixgbe(O)
ext3 mbcache jbd signo_catch(O) bum(O) ip_set nfnetlink prio(O) nat(O)
vport_vxlan(O) openvswitch(O) nf_defrag_ipv6 gre libcrc32c kbox(O) pmcint(O)
vxlan ip6_udp_tunnel udp_tunnel sd_mod crc_t10dif crct10dif_generic sg
ipmi_devintf kvm_intel(O) kvm(O) coretemp crct10dif_pclmul crct10dif_common ahci
libahci mpt2sas i2c_i801 i2c_algo_bit libata dca i2c_core raid_class ptp
scsi_transport_sas pps_core ipmi_si ipmi_msghandler nf_conntrack_ipv4
nf_defrag_ipv4 nf_conntrack vhost_net(O) tun(O) vhost(O) macvtap
[231298.245711]  macvlan vfio_pci irqbypass vfio_iommu_type1 vfio ip_tables
[last unloaded: guest_kbox_ram]
[231298.245725] CPU: 1 PID: 12431 Comm: CPU 0/KVM Tainted: G        W  OE
----V-------   3.10.0-327.49.58.52_13.x86_64 #1
[231298.245729] Hardware name: HUAWEI TECHNOLOGIES CO.,LTD. CH80GPUB8/CH80GPUB8,
BIOS GPUBV201 06/18/2015
[231298.245732]  ffff881fa372fc60 00000000054b553c ffff881fa372fc18 ffffffff81644aaf
[231298.245740]  ffff881fa372fc50 ffffffff8107b1c0 ffff881b8bc48050 ffff881fffa576f0
[231298.245748]  ffff881b8bc48050 000000000000a022 000000000000001b ffff881fa372fcb8
[231298.245756]  ffffffff8107b25c ffffffff818a9c98 ffff881f00000030 ffff881fa372fcc8
[231298.245765]  ffff881fa372fc88 00000000054b553c 0000000000000001 ffffffff8107b205
[231298.245773]  ffffffff81a38960 ffff881fffa576f0 dead000000100100 ffff881b8bc48050
[231298.245781]  ffff881fa372fce0 ffffffff8131a43c ffff881b8bc48000 00000000000176e0
[231298.245791]  0000000000000292 ffff881fa372fdd0 ffffffffa10de6d0 ffff881b8bc48050
[231298.245799]  ffff881fa372ffd8 ffff881bb4c70000 ffff88176dfd8048 0000000000000001
[231298.245808]  ffff881fa372fe18 ffffffff81656a31 ffffffffa10d9360 ffffffffa10fc140
[231298.245816]  ffff881c23580100 0000000000000000 0000000000000000 ffff881b8bc48000
[231298.245826]  ffff881fa372fd88 ffffffffa10dad65 0000000000000000 ffffffffa10d9360
[231298.245834]  00000000054b553c ffff881c23580200 0000000000000000 ffff881c23580000
[231298.245842]  ffff881b8bc48000 ffff881fa372fdb8 ffff881b8bc48000 ffff881fa372ffd8
[231298.245847]  ffff881bb4c70000 ffff88176dfd8048 0000000000000001 ffff881fa372fe18
[231298.245851]  ffffffffa05ed1e8 ffffffee7ffbfaff 00000000054b553c ffff881b8bc48000
[231298.245856]  ffff883fb857b600 0000000000000000 ffff881ae737dc00 ffff881bb4c70000
[231298.245861]  ffff881fa372feb0 ffffffffa05d4b31 0000000000000000 0000000000008000
[231298.245866]  ffff881fa372fe70 ffffffff8112f643 000000000000ffff ffff881ae737dc38
[231298.245870]  ffffffffa05d4880 0000000000000000 0000000000000000 000000000000ae80
[231298.245875]  ffff881ae737dc00 00000000054b553c ffff881ae737dc00 ffff883fd2a0a500
[231298.245879]  0000000000000000 0000000000000000 0000000000000001 ffff881fa372ff28
[231298.245883]  ffffffff811fd9d5 000000000000ffff ffff881ae737dc38 0000000000000000
[231298.245888]  0000000000000000 000000000000ae80 0000000000000018 ffff881ae737dc00
[231298.245893] Call Trace:
[231298.245898]  [<ffffffff81644aaf>] dump_stack+0x19/0x1b
[231298.245902]  [<ffffffff8107b1c0>] warn_slowpath_common+0x70/0xb0
[231298.245906]  [<ffffffff8107b25c>] warn_slowpath_fmt+0x5c/0x80
[231298.245910]  [<ffffffff8107b205>] ? warn_slowpath_fmt+0x5/0x80
[231298.245913]  [<ffffffff8131a43c>] __list_add+0xac/0xc0
[231298.245920]  [<ffffffffa10de6d0>] vmx_pre_block+0xe0/0x220 [kvm_intel]
[231298.245924]  [<ffffffff81656a31>] ? ftrace_call+0x5/0x2f
[231298.245930]  [<ffffffffa10d9360>] ? vmx_invpcid_supported+0x20/0x20 [kvm_intel]
[231298.245936]  [<ffffffffa10dad65>] ? vmx_sync_pir_to_irr+0x5/0x30 [kvm_intel]
[231298.245941]  [<ffffffffa10d9360>] ? vmx_invpcid_supported+0x20/0x20 [kvm_intel]
[231298.245953]  [<ffffffffa05ed1e8>] kvm_arch_vcpu_ioctl_run+0x178/0x440 [kvm]
[231298.245962]  [<ffffffffa05d4b31>] kvm_vcpu_ioctl+0x2b1/0x640 [kvm]
[231298.245967]  [<ffffffff8112f643>] ? ftrace_ops_list_func+0x83/0x110
[231298.245976]  [<ffffffffa05d4880>] ? vcpu_put+0x30/0x30 [kvm]
[231298.245980]  [<ffffffff811fd9d5>] do_vfs_ioctl+0x2e5/0x4c0
[231298.245985]  [<ffffffff811fdc51>] SyS_ioctl+0xa1/0xc0
[231298.245989]  [<ffffffff81654e09>] system_call_fastpath+0x16/0x1b
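
A note on the second warning above: dead000000100100 is the value of
LIST_POISON1 on this 3.10-based x86_64 kernel, which list_del() stores in
entry->next after unlinking. The userspace sketch below is not the kernel
source. It only imitates the debug checks in __list_add() and the poisoning in
list_del() to show how a tail entry that was deleted while the list head still
pointed at it produces this exact message. The names list_add_tail_checked,
list_del_poison, stale_tail_demo and the CHK_* flags are made up for
illustration, and the double add used to create the stale tail is just one
hypothetical way to reach that state, not a diagnosis of this crash.

```c
#include <stdint.h>
#include <stdio.h>

struct list_head { struct list_head *next, *prev; };

/* Poison values as defined for 3.10 on x86_64 (assumes 64-bit pointers). */
#define LIST_POISON1 ((struct list_head *)(uintptr_t)0xdead000000100100ULL)
#define LIST_POISON2 ((struct list_head *)(uintptr_t)0xdead000000200200ULL)

/* Hypothetical flags reporting which sanity check failed. */
enum { CHK_NEXT_PREV = 1, CHK_PREV_NEXT = 2, CHK_DOUBLE_ADD = 4 };

/* Imitates the debug __list_add(): warn, then link anyway. */
static int checked_add(struct list_head *entry,
                       struct list_head *prev, struct list_head *next)
{
    int failed = 0;

    if (next->prev != prev)
        failed |= CHK_NEXT_PREV;
    if (prev->next != next) {
        failed |= CHK_PREV_NEXT;
        fprintf(stderr, "list_add corruption. prev->next should be next "
                "(%p), but was %p. (prev=%p).\n",
                (void *)next, (void *)prev->next, (void *)prev);
    }
    if (entry == prev || entry == next)
        failed |= CHK_DOUBLE_ADD;

    next->prev = entry;
    entry->next = next;
    entry->prev = prev;
    prev->next = entry;
    return failed;
}

/* list_add_tail(): insert between the current tail and the head. */
static int list_add_tail_checked(struct list_head *entry,
                                 struct list_head *head)
{
    return checked_add(entry, head->prev, head);
}

/* Imitates list_del(): unlink, then poison the entry's own pointers. */
static void list_del_poison(struct list_head *entry)
{
    struct list_head *prev = entry->prev, *next = entry->next;

    next->prev = prev;
    prev->next = next;
    entry->next = LIST_POISON1;
    entry->prev = LIST_POISON2;
}

/* Returns the check mask seen when adding after a stale, poisoned tail. */
static int stale_tail_demo(void)
{
    struct list_head head = { &head, &head };
    struct list_head a, b;

    list_add_tail_checked(&a, &head); /* normal add */
    list_add_tail_checked(&a, &head); /* double add: a now links to itself */
    list_del_poison(&a);              /* head.prev still points at a */
    return list_add_tail_checked(&b, &head);
}
```

Calling stale_tail_demo() from a main() prints the "prev->next should be next"
line to stderr with the poisoned pointer in the "but was" position, because
head.prev still refers to the deleted entry.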


On 2017/5/26 18:40, Paolo Bonzini wrote:

> 
> 
> On 24/05/2017 07:04, Longpeng (Mike) wrote:
>>>> It crashed at *1ec1*, and %rax got a wrong value (0xdead000000100100) at
>>>> *1e92*. It seems the *blocked_vcpu_on_cpu* list is corrupted, but kvm only
>>>> accesses this list in pre_block/post_block/wakeup_handler, and these three
>>>> functions look fine.
>>>>
>>>> kvm version is 4.4-stable.
>>>>
>>>> Do you have any ideas? Any suggestion would be greatly appreciated, thanks!
>>>>
>>> Is this only seen with posted interrupt support enabled?  Booting with
>>> intremap=nopost on the kernel commandline would disable it.  Thanks,
>>
>> We tested with PI support enabled, but we are not yet sure whether it occurs
>> only with PI enabled.
> 
> This code should not run at all with PI disabled, since the handler is
> only reachable through an IRTE.
> 
> As you said, the list manipulation in those functions is fairly simple.
> If you have a reproducer, you can try running it with CONFIG_DEBUG_LIST
> and see what you get.
> 
> Thanks,
> 
> Paolo
> 
> .
> 


-- 
Regards,
Longpeng(Mike)

end of thread, other threads:[~2017-06-05 16:24 UTC | newest]

Thread overview: 6+ messages
2017-05-24  3:57 [help] host kernel panic in kvm's wakeup_handler() Longpeng (Mike)
2017-05-24  4:34 ` Alex Williamson
2017-05-24  5:04   ` Longpeng (Mike)
2017-05-26 10:40     ` Paolo Bonzini
2017-05-26 10:53       ` Longpeng (Mike)
2017-06-05 16:21       ` Longpeng (Mike)
