* kvm bug in __rmap_clear_dirty during live migration
@ 2017-02-09 14:37 ` Herongguang (Stephen)
  0 siblings, 0 replies; 39+ messages in thread
From: Herongguang (Stephen) @ 2017-02-09 14:37 UTC (permalink / raw)
  To: kvm, qemu-devel, chris.friesen; +Cc: fangying1, wangxin (U)

Hi.
I hit a problem while repeatedly live-migrating a VM between two compute nodes.
The symptom was that the KVM module crashed and the host then rebooted.
However, I cannot reliably trigger this BUG.

The backtrace is the same as http://www.spinics.net/lists/kvm/msg138475.html.

The crash is triggered when __rmap_clear_dirty is invoked and an invalid page (0x000000007f183000) is accessed.
The value in %rdi is 0x000000007f183000, which is obviously not a valid kernel pointer on x86_64.
The assembly of __rmap_clear_dirty is:
     0xffffffffa04d9ac0 <__rmap_clear_dirty>:        callq 0xffffffff816543d0 <ftrace_caller>
     0xffffffffa04d9ac5 <__rmap_clear_dirty+5>:      push %rbp
     0xffffffffa04d9ac6 <__rmap_clear_dirty+6>:      mov %rsp,%rbp
     0xffffffffa04d9ac9 <__rmap_clear_dirty+9>:      push %rbx
     0xffffffffa04d9aca <__rmap_clear_dirty+10>:     sub $0x18,%rsp
     0xffffffffa04d9ace <__rmap_clear_dirty+14>:     mov (%rsi),%rdi
     0xffffffffa04d9ad1 <__rmap_clear_dirty+17>:     mov %gs:0x28,%rax
     0xffffffffa04d9ada <__rmap_clear_dirty+26>:     mov %rax,-0x10(%rbp)
     0xffffffffa04d9ade <__rmap_clear_dirty+30>:     xor %eax,%eax
     0xffffffffa04d9ae0 <__rmap_clear_dirty+32>:     test %rdi,%rdi
     0xffffffffa04d9ae3 <__rmap_clear_dirty+35>:     je 0xffffffffa04d9b78
     0xffffffffa04d9ae9 <__rmap_clear_dirty+41>:     test $0x1,%dil
     0xffffffffa04d9aed <__rmap_clear_dirty+45>:     je 0xffffffffa04d9b98
     0xffffffffa04d9af3 <__rmap_clear_dirty+51>:     and $0xfffffffffffffffe,%rdi
     0xffffffffa04d9af7 <__rmap_clear_dirty+55>:     movl $0x0,-0x18(%rbp)
     0xffffffffa04d9afe <__rmap_clear_dirty+62>:     mov %rdi,-0x20(%rbp)
     0xffffffffa04d9b02 <__rmap_clear_dirty+66>:     mov (%rdi),%rdi
     0xffffffffa04d9b05 <__rmap_clear_dirty+69>:     test %rdi,%rdi
                                    ...
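
To make the decoding above concrete, here is a simplified sketch of the rmap walk that faults, loosely modeled on the upstream KVM MMU code of this era (not the exact kmod 4.4.36 source; shadow_dirty_mask and the three-entry descriptor are upstream details assumed here). The low bit of rmap_head->val selects between a single spte pointer and a pte_list_desc chain, which matches the test/and sequence above:

    #include <linux/types.h>

    /* Illustrative sketch only -- NOT the exact kmod 4.4.36 source. */
    struct kvm_rmap_head {
        unsigned long val;                  /* loaded by "mov (%rsi),%rdi"   */
    };

    struct pte_list_desc {
        u64 *sptes[3];                      /* pointers to shadow PTEs       */
        struct pte_list_desc *more;         /* next descriptor in the chain  */
    };

    extern u64 shadow_dirty_mask;           /* dirty bit(s) in a shadow PTE  */

    static bool rmap_clear_dirty_sketch(struct kvm_rmap_head *rmap_head)
    {
        struct pte_list_desc *desc;
        bool flush = false;
        int i;

        if (!rmap_head->val)                /* "test %rdi,%rdi; je ..."      */
            return false;

        if (!(rmap_head->val & 1)) {        /* "test $0x1,%dil; je ..."      */
            u64 *sptep = (u64 *)rmap_head->val;   /* single-spte case        */
            *sptep &= ~shadow_dirty_mask;
            return true;
        }

        /* "and $0xfffffffffffffffe,%rdi": strip the tag bit */
        desc = (struct pte_list_desc *)(rmap_head->val & ~1ul);
        while (desc) {
            for (i = 0; i < 3 && desc->sptes[i]; i++) {
                /*
                 * "mov (%rdi),%rdi" fetches desc->sptes[i]; the faulting
                 * "mov (%rdi),%rdx" at +0x4a then reads *sptep.  In the oops
                 * the fetched pointer is 0x000000007f183000, i.e. the rmap
                 * chain holds a value that is not a kernel pointer.
                 */
                *desc->sptes[i] &= ~shadow_dirty_mask;
                flush = true;
            }
            desc = desc->more;
        }
        return flush;
    }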

The details of the environment are:
Host info: x86_64 CentOS 7 kernel (3.10.0-327.36.58.10_2.x86_64, slightly modified). The CPU is Broadwell Intel(R) Xeon(R) CPU E5-2618L v4 @ 2.20GHz.
Kmod info: version is 4.4.36
What I know is that the PML (Page Modification Logging) module parameter is enabled by default:
     # cat /sys/module/kvm_intel/parameters/pml
     # Y
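
If PML is a suspect, one simple experiment (assuming only the standard kvm_intel module parameter shown above; all VMs must be shut down before the module can be unloaded) is to reload kvm_intel with PML disabled and retry the migration loop:
     # modprobe -r kvm_intel
     # modprobe kvm_intel pml=0
     # cat /sys/module/kvm_intel/parameters/pml
     # N
The same can be made persistent by adding kvm-intel.pml=0 to the kernel command line.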

Below is the crash message:
[1548777.924180] kvm: zapping shadow pages for mmio generation wraparound
[1548777.947844] HTB: quantum of class 10001 is big. Consider r2q change.
[1548778.185389] kvm: zapping shadow pages for mmio generation wraparound
[1548778.994801] BUG: unable to handle kernel paging request at 000000007f183000
[1548779.002135] IP: [<ffffffffa04d9b0a>] __rmap_clear_dirty+0x4a/0xf0 [kvm]
[1548779.009151] PGD 1f8452067 PUD 0
[1548779.012774] Thread overran stack, or stack corrupted
[1548779.018076] Oops: 0000 [#1] SMP
[1548779.027050] collected_len = 1048570, LOG_BUF_LEN_LOCAL = 1048576
[1548779.042039] kbox: no notify die func register. no need to notify
[1548779.048392] do nothing after die!
[1548779.052071] Modules linked in: kboxdriver(O) kbox(O) sch_htb ip_set_hash_net ip6table_filter ip6_tables iptable_filter igb_uio(OE) uio bridge dm_service_time dm_multipath iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi 8021q garp stp mrp llc vfat fat isofs ext4 jbd2 xfs sha512_generic dev_connlimit(O) bum(O) ip_set nfnetlink prio(O) nat(O) vport_vxlan(O) openvswitch(O) nf_defrag_ipv6 gre ib_uverbs(OVE) hotpatch(OE) sigma_serial(O) pmcint(O) guest_kbox_ram(O) signo_catch(O) mlx4_ib(OVE) mlx4_en(OVE) vxlan ip6_udp_tunnel udp_tunnel ib_sa(OVE) ib_mad(OVE) ptp ib_core(OVE) ib_addr(OVE) pps_core ib_netlink(OVE) intel_powerclamp coretemp intel_rapl crc32_pclmul crc32c_intel ipmi_devintf(OVE) ghash_clmulni_intel mlx4_core(OVE) aesni_intel kvm_intel(O) igb(OVE) lrw gf128mul i2c_algo_bit glue_helper
[1548779.125577]  iTCO_wdt ablk_helper kvm(O) iTCO_vendor_support sg cryptd compat(OVE) dca sb_edac i2c_i801 kbox_pci(OVE) i2c_core edac_core pcspkr lpc_ich shpchp mfd_core ipmi_si(OVE) ipmi_msghandler(OVE) nf_conntrack_ipv4 nf_defrag_ipv4 vhost_net(O) tun(O) vhost(O) macvtap macvlan vfio_pci irqbypass vfio_iommu_type1 vfio xt_sctp nf_conntrack_proto_sctp nf_nat_proto_sctp nf_nat nf_conntrack sctp libcrc32c ip_tables ext3 mbcache jbd sd_mod crc_t10dif crct10dif_generic ahci crct10dif_pclmul crct10dif_common libahci mpt2sas(OVE) libata raid_class scsi_transport_sas dm_mod [last unloaded: kbox]
[1548779.178704] CPU: 35 PID: 33769 Comm: migration Tainted: G           OE  ----V-------   3.10.0-327.36.58.10_2.x86_64 #1
[1548779.189855] Hardware name: HUAWEI TECHNOLOGIES CO.,LTD. CH80GPUB8/CH80GPUB8, BIOS GPUBV201 06/18/2015
[1548779.199585] task: ffff880123da0000 ti: ffff8801a8068000 task.ti: ffff8801a8068000
[1548779.207596] RIP: 0010:[<ffffffffa04d9b0a>] [<ffffffffa04d9b0a>] __rmap_clear_dirty+0x4a/0xf0 [kvm]
[1548779.217209] RSP: 0018:ffff8801a806bba0  EFLAGS: 00010206
[1548779.222879] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
[1548779.230543] RDX: ffffc90059d64000 RSI: ffffc90059f2efd0 RDI: 000000007f183000
[1548779.238211] RBP: ffff8801a806bbc0 R08: ffff8801a806bd60 R09: 0000000000000100
[1548779.245878] R10: 0000000000003649 R11: 0000000000000006 R12: ffff88027a7f0000
[1548779.253524] R13: ffffffffa04d9ac0 R14: ffff8801a806bfd8 R15: 0000000000000000
[1548779.261125] FS:  00007f3dd6aff700(0000) GS:ffff883ffe3c0000(0000) knlGS:0000000000000000
[1548779.269733] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[1548779.275795] CR2: 000000007f183000 CR3: 00000001f50ab000 CR4: 00000000003427e0
[1548779.283444] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[1548779.291040] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[1548779.298689] Stack:
[1548779.301045]  0000000000000000 0000000000000297 0000000009b989ac 0000000000000000
[1548779.309033]  ffff8801a806bc48 ffffffffa04d9206 00ff8801a806bd10 ffff8801a806bd60
[1548779.317023]  0000000000000100 00000000000bffff 0000000100000001 00000000000396fa
[1548779.325012] Call Trace:
[1548779.327856]  [<ffffffffa04d9206>] slot_handle_level_range+0x76/0x100 [kvm]
[1548779.335090]  [<ffffffffa04dac34>] kvm_mmu_slot_leaf_clear_dirty+0x54/0x80 [kvm]
[1548779.342922]  [<ffffffffa03f8967>] vmx_slot_enable_log_dirty+0x17/0x30 [kvm_intel]
[1548779.350943]  [<ffffffffa04d5d2f>] kvm_arch_commit_memory_region+0x5f/0xe0 [kvm]
[1548779.358776]  [<ffffffffa04be7c5>] __kvm_set_memory_region+0x665/0xa90 [kvm]
[1548779.366099]  [<ffffffffa04be160>] ? kvm_kvzalloc+0x30/0x30 [kvm]
[1548779.372457]  [<ffffffffa04bec1b>] kvm_set_memory_region+0x2b/0x40 [kvm]
[1548779.379423]  [<ffffffffa04bf084>] kvm_vm_ioctl+0x454/0x700 [kvm]
[1548779.385777]  [<ffffffff8112f083>] ? ftrace_ops_list_func+0x83/0x110
[1548779.392406]  [<ffffffffa04bec30>] ? kvm_set_memory_region+0x40/0x40 [kvm]
[1548779.399538]  [<ffffffff811fceb5>] do_vfs_ioctl+0x2e5/0x4c0
[1548779.405340]  [<ffffffff811fd131>] SyS_ioctl+0xa1/0xc0
[1548779.410736]  [<ffffffff81652809>] system_call_fastpath+0x16/0x1b
[1548779.417086] Code: c0 48 85 ff 0f 84 8f 00 00 00 40 f6 c7 01 0f 84 a5 00 00 00 48 83 e7 fe c7 45 e8 00 00 00 00 48 89 7d e0 48 8b 3f 48 85 ff 74 6e <48> 8b 17 f6 c2 01 74 5d 48 8b 05 c7 40 04 00 48 89 d1 31 db 48
[1548779.438015] RIP  [<ffffffffa04d9b0a>] __rmap_clear_dirty+0x4a/0xf0 [kvm]
[1548779.445109]  RSP <ffff8801a806bba0>
[1548779.448915] CR2: 000000007f183000
[1548779.453070] ---[ end trace d99f28b71290c64c ]---
[1548779.523937] Kernel panic - not syncing: Fatal exception
[1548779.523945] perf interrupt took too long (124114 > 17857), lowering kernel.perf_event_max_sample_rate to 7000
[1548779.524826] perf interrupt took too long (123163 > 31250), lowering kernel.perf_event_max_sample_rate to 4000
[1548779.525717] perf interrupt took too long (122216 > 62500), lowering kernel.perf_event_max_sample_rate to 2000
[1548779.603824] die even has been record!
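
The call trace above is the dirty-logging enable path: kvm_vm_ioctl -> kvm_set_memory_region -> kvm_arch_commit_memory_region -> vmx_slot_enable_log_dirty -> kvm_mmu_slot_leaf_clear_dirty, which walks every 4K rmap head of the memslot being committed and hands each one to __rmap_clear_dirty. A rough sketch of that walk (upstream-style; locking, TLB flushing, write protection and large-page levels are omitted, and the arch.rmap layout is an assumption based on upstream KVM rather than this exact kmod):

    /*
     * Rough sketch of kvm_mmu_slot_leaf_clear_dirty / slot_handle_level_range,
     * upstream-style; locking, TLB flushing and large-page levels omitted.
     */
    static void slot_leaf_clear_dirty_sketch(struct kvm *kvm,
                                             struct kvm_memory_slot *memslot)
    {
        unsigned long i;

        for (i = 0; i < memslot->npages; i++) {
            /* 4K-level rmap head for guest frame memslot->base_gfn + i */
            struct kvm_rmap_head *rmap_head = &memslot->arch.rmap[0][i];

            __rmap_clear_dirty(kvm, rmap_head);     /* the faulting function */
        }
    }

So the bad pointer handed to __rmap_clear_dirty comes either from a corrupted entry in the memslot's rmap array or from a corrupted pte_list_desc that such an entry points to.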

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: kvm bug in __rmap_clear_dirty during live migration
  2017-02-09 14:37 ` [Qemu-devel] " Herongguang (Stephen)
@ 2017-02-10  8:50   ` Herongguang (Stephen)
  -1 siblings, 0 replies; 39+ messages in thread
From: Herongguang (Stephen) @ 2017-02-10  8:50 UTC (permalink / raw)
  To: chris.friesen, pbonzini, rkrcmar; +Cc: fangying1, wangxin (U), qemu-devel, kvm

Hi, Chris Friesen, did you solve the problem?

On 2017/2/9 22:37, Herongguang (Stephen) wrote:
> Hi.
> I had a problem when I just repeatedly live migrate a vm between two compute nodes.
> The phenomenon was that the KVM module was crashed and then the host rebooted.
> However I cannot reliably trigger this BUG.
>
> The backtrace is the same as http://www.spinics.net/lists/kvm/msg138475.html.
>
> The crash is triggered when function __rmap_clear_dirty is invoked and an invalid page(0x000000007f183000) is visited.
> The value %rdi is 0x000000007f183000, which is obviously not a valid kernel pointer for x86_64.
> The assembly of __rmap_clear_dirty is:
>     0xffffffffa04d9ac0 <__rmap_clear_dirty>:        callq 0xffffffff816543d0 <ftrace_caller>
>     0xffffffffa04d9ac5 <__rmap_clear_dirty+5>:      push %rbp
>     0xffffffffa04d9ac6 <__rmap_clear_dirty+6>:      mov %rsp,%rbp
>     0xffffffffa04d9ac9 <__rmap_clear_dirty+9>:      push %rbx
>     0xffffffffa04d9aca <__rmap_clear_dirty+10>:     sub $0x18,%rsp
>     0xffffffffa04d9ace <__rmap_clear_dirty+14>:     mov (%rsi),%rdi
>     0xffffffffa04d9ad1 <__rmap_clear_dirty+17>:     mov %gs:0x28,%rax
>     0xffffffffa04d9ada <__rmap_clear_dirty+26>:     mov %rax,-0x10(%rbp)
>     0xffffffffa04d9ade <__rmap_clear_dirty+30>:     xor %eax,%eax
>     0xffffffffa04d9ae0 <__rmap_clear_dirty+32>:     test %rdi,%rdi
>     0xffffffffa04d9ae3 <__rmap_clear_dirty+35>:     je 0xffffffffa04d9b78
>     0xffffffffa04d9ae9 <__rmap_clear_dirty+41>:     test $0x1,%dil
>     0xffffffffa04d9aed <__rmap_clear_dirty+45>:     je 0xffffffffa04d9b98
>     0xffffffffa04d9af3 <__rmap_clear_dirty+51>:     and $0xfffffffffffffffe,%rdi
>     0xffffffffa04d9af7 <__rmap_clear_dirty+55>:     movl $0x0,-0x18(%rbp)
>     0xffffffffa04d9afe <__rmap_clear_dirty+62>:     mov %rdi,-0x20(%rbp)
>     0xffffffffa04d9b02 <__rmap_clear_dirty+66>:     mov (%rdi),%rdi
>     0xffffffffa04d9b05 <__rmap_clear_dirty+69>:     test %rdi,%rdi
>                                    ...
>
> The details of the environment is:
> Host info: x86_64 CentOS 7 kernel (3.10.0-327.36.58.10_2.x86_64, slightly modified). The CPU is Broadwell Intel(R) Xeon(R) CPU E5-2618L v4 @ 2.20GHz.
> Kmod info: version is 4.4.36
> What I know is that the parameter PML(Page Modification Logging) is enabled by default.
>     # cat /sys/module/kvm_intel/parameters/pml
>     # Y
>
> Below is the crash message:
> [1548777.924180] kvm: zapping shadow pages for mmio generation wraparound
> [1548777.947844] HTB: quantum of class 10001 is big. Consider r2q change.
> [1548778.185389] kvm: zapping shadow pages for mmio generation wraparound
> [1548778.994801] BUG: unable to handle kernel paging request at 000000007f183000
> [1548779.002135] IP: [<ffffffffa04d9b0a>] __rmap_clear_dirty+0x4a/0xf0 [kvm]
> [1548779.009151] PGD 1f8452067 PUD 0
> [1548779.012774] Thread overran stack, or stack corrupted
> [1548779.018076] Oops: 0000 [#1] SMP
> [1548779.027050] collected_len = 1048570, LOG_BUF_LEN_LOCAL = 1048576
> [1548779.042039] kbox: no notify die func register. no need to notify
> [1548779.048392] do nothing after die!
> [1548779.052071] Modules linked in: kboxdriver(O) kbox(O) sch_htb ip_set_hash_net ip6table_filter ip6_tables iptable_filter igb_uio(OE) uio bridge dm_service_time dm_multipath iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi 8021q garp stp mrp llc vfat fat isofs ext4 jbd2 xfs sha512_generic dev_connlimit(O) bum(O) ip_set nfnetlink prio(O) nat(O) vport_vxlan(O) openvswitch(O) nf_defrag_ipv6 gre ib_uverbs(OVE) hotpatch(OE) sigma_serial(O) pmcint(O) guest_kbox_ram(O) signo_catch(O) mlx4_ib(OVE) mlx4_en(OVE) vxlan ip6_udp_tunnel udp_tunnel ib_sa(OVE) ib_mad(OVE) ptp ib_core(OVE) ib_addr(OVE) pps_core ib_netlink(OVE) intel_powerclamp coretemp intel_rapl crc32_pclmul crc32c_intel ipmi_devintf(OVE) ghash_clmulni_intel mlx4_core(OVE) aesni_intel kvm_intel(O) igb(OVE) lrw gf128mul i2c_algo_bit glue_helper
> [1548779.125577]  iTCO_wdt ablk_helper kvm(O) iTCO_vendor_support sg cryptd compat(OVE) dca sb_edac i2c_i801 kbox_pci(OVE) i2c_core edac_core pcspkr lpc_ich shpchp mfd_core ipmi_si(OVE) ipmi_msghandler(OVE) nf_conntrack_ipv4 nf_defrag_ipv4 vhost_net(O) tun(O) vhost(O) macvtap macvlan vfio_pci irqbypass vfio_iommu_type1 vfio xt_sctp nf_conntrack_proto_sctp nf_nat_proto_sctp nf_nat nf_conntrack sctp libcrc32c ip_tables ext3 mbcache jbd sd_mod crc_t10dif crct10dif_generic ahci crct10dif_pclmul crct10dif_common libahci mpt2sas(OVE) libata raid_class scsi_transport_sas dm_mod [last unloaded: kbox]
> [1548779.178704] CPU: 35 PID: 33769 Comm: migration Tainted: G           OE  ----V-------   3.10.0-327.36.58.10_2.x86_64 #1
> [1548779.189855] Hardware name: HUAWEI TECHNOLOGIES CO.,LTD. CH80GPUB8/CH80GPUB8, BIOS GPUBV201 06/18/2015
> [1548779.199585] task: ffff880123da0000 ti: ffff8801a8068000 task.ti: ffff8801a8068000
> [1548779.207596] RIP: 0010:[<ffffffffa04d9b0a>] [<ffffffffa04d9b0a>] __rmap_clear_dirty+0x4a/0xf0 [kvm]
> [1548779.217209] RSP: 0018:ffff8801a806bba0  EFLAGS: 00010206
> [1548779.222879] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
> [1548779.230543] RDX: ffffc90059d64000 RSI: ffffc90059f2efd0 RDI: 000000007f183000
> [1548779.238211] RBP: ffff8801a806bbc0 R08: ffff8801a806bd60 R09: 0000000000000100
> [1548779.245878] R10: 0000000000003649 R11: 0000000000000006 R12: ffff88027a7f0000
> [1548779.253524] R13: ffffffffa04d9ac0 R14: ffff8801a806bfd8 R15: 0000000000000000
> [1548779.261125] FS:  00007f3dd6aff700(0000) GS:ffff883ffe3c0000(0000) knlGS:0000000000000000
> [1548779.269733] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [1548779.275795] CR2: 000000007f183000 CR3: 00000001f50ab000 CR4: 00000000003427e0
> [1548779.283444] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [1548779.291040] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> [1548779.298689] Stack:
> [1548779.301045]  0000000000000000 0000000000000297 0000000009b989ac 0000000000000000
> [1548779.309033]  ffff8801a806bc48 ffffffffa04d9206 00ff8801a806bd10 ffff8801a806bd60
> [1548779.317023]  0000000000000100 00000000000bffff 0000000100000001 00000000000396fa
> [1548779.325012] Call Trace:
> [1548779.327856]  [<ffffffffa04d9206>] slot_handle_level_range+0x76/0x100 [kvm]
> [1548779.335090]  [<ffffffffa04dac34>] kvm_mmu_slot_leaf_clear_dirty+0x54/0x80 [kvm]
> [1548779.342922]  [<ffffffffa03f8967>] vmx_slot_enable_log_dirty+0x17/0x30 [kvm_intel]
> [1548779.350943]  [<ffffffffa04d5d2f>] kvm_arch_commit_memory_region+0x5f/0xe0 [kvm]
> [1548779.358776]  [<ffffffffa04be7c5>] __kvm_set_memory_region+0x665/0xa90 [kvm]
> [1548779.366099]  [<ffffffffa04be160>] ? kvm_kvzalloc+0x30/0x30 [kvm]
> [1548779.372457]  [<ffffffffa04bec1b>] kvm_set_memory_region+0x2b/0x40 [kvm]
> [1548779.379423]  [<ffffffffa04bf084>] kvm_vm_ioctl+0x454/0x700 [kvm]
> [1548779.385777]  [<ffffffff8112f083>] ? ftrace_ops_list_func+0x83/0x110
> [1548779.392406]  [<ffffffffa04bec30>] ? kvm_set_memory_region+0x40/0x40 [kvm]
> [1548779.399538]  [<ffffffff811fceb5>] do_vfs_ioctl+0x2e5/0x4c0
> [1548779.405340]  [<ffffffff811fd131>] SyS_ioctl+0xa1/0xc0
> [1548779.410736]  [<ffffffff81652809>] system_call_fastpath+0x16/0x1b
> [1548779.417086] Code: c0 48 85 ff 0f 84 8f 00 00 00 40 f6 c7 01 0f 84 a5 00 00 00 48 83 e7 fe c7 45 e8 00 00 00 00 48 89 7d e0 48 8b 3f 48 85 ff 74 6e <48> 8b 17 f6 c2 01 74 5d 48 8b 05 c7 40 04 00 48 89 d1 31 db 48
> [1548779.438015] RIP  [<ffffffffa04d9b0a>] __rmap_clear_dirty+0x4a/0xf0 [kvm]
> [1548779.445109]  RSP <ffff8801a806bba0>
> [1548779.448915] CR2: 000000007f183000
> [1548779.453070] ---[ end trace d99f28b71290c64c ]---
> [1548779.523937] Kernel panic - not syncing: Fatal exception
> [1548779.523945] perf interrupt took too long (124114 > 17857), lowering kernel.perf_event_max_sample_rate to 7000
> [1548779.524826] perf interrupt took too long (123163 > 31250), lowering kernel.perf_event_max_sample_rate to 4000
> [1548779.525717] perf interrupt took too long (122216 > 62500), lowering kernel.perf_event_max_sample_rate to 2000
> [1548779.603824] die even has been record!

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: kvm bug in __rmap_clear_dirty during live migration
  2017-02-09 14:37 ` [Qemu-devel] " Herongguang (Stephen)
@ 2017-02-10  9:11   ` Herongguang (Stephen)
  -1 siblings, 0 replies; 39+ messages in thread
From: Herongguang (Stephen) @ 2017-02-10  9:11 UTC (permalink / raw)
  To: chris.friesen, pbonzini, rkrcmar; +Cc: fangying1, wangxin (U), qemu-devel, kvm

Hi, Chris Friesen, did you solve the problem?

On 2017/2/9 22:37, Herongguang (Stephen) wrote:
> Hi.
> I had a problem when I just repeatedly live migrate a vm between two compute nodes.
> The phenomenon was that the KVM module was crashed and then the host rebooted.
> However I cannot reliably trigger this BUG.
>
> The backtrace is the same as http://www.spinics.net/lists/kvm/msg138475.html.
>
> The crash is triggered when function __rmap_clear_dirty is invoked and an invalid page(0x000000007f183000) is visited.
> The value %rdi is 0x000000007f183000, which is obviously not a valid kernel pointer for x86_64.
> The assembly of __rmap_clear_dirty is:
>      0xffffffffa04d9ac0 <__rmap_clear_dirty>:        callq 0xffffffff816543d0 <ftrace_caller>
>      0xffffffffa04d9ac5 <__rmap_clear_dirty+5>:      push %rbp
>      0xffffffffa04d9ac6 <__rmap_clear_dirty+6>:      mov %rsp,%rbp
>      0xffffffffa04d9ac9 <__rmap_clear_dirty+9>:      push %rbx
>      0xffffffffa04d9aca <__rmap_clear_dirty+10>:     sub $0x18,%rsp
>      0xffffffffa04d9ace <__rmap_clear_dirty+14>:     mov (%rsi),%rdi
>      0xffffffffa04d9ad1 <__rmap_clear_dirty+17>:     mov %gs:0x28,%rax
>      0xffffffffa04d9ada <__rmap_clear_dirty+26>:     mov %rax,-0x10(%rbp)
>      0xffffffffa04d9ade <__rmap_clear_dirty+30>:     xor %eax,%eax
>      0xffffffffa04d9ae0 <__rmap_clear_dirty+32>:     test %rdi,%rdi
>      0xffffffffa04d9ae3 <__rmap_clear_dirty+35>:     je 0xffffffffa04d9b78
>      0xffffffffa04d9ae9 <__rmap_clear_dirty+41>:     test $0x1,%dil
>      0xffffffffa04d9aed <__rmap_clear_dirty+45>:     je 0xffffffffa04d9b98
>      0xffffffffa04d9af3 <__rmap_clear_dirty+51>:     and $0xfffffffffffffffe,%rdi
>      0xffffffffa04d9af7 <__rmap_clear_dirty+55>:     movl $0x0,-0x18(%rbp)
>      0xffffffffa04d9afe <__rmap_clear_dirty+62>:     mov %rdi,-0x20(%rbp)
>      0xffffffffa04d9b02 <__rmap_clear_dirty+66>:     mov (%rdi),%rdi
>      0xffffffffa04d9b05 <__rmap_clear_dirty+69>:     test %rdi,%rdi
>                                     ...
>
> The details of the environment is:
> Host info: x86_64 CentOS 7 kernel (3.10.0-327.36.58.10_2.x86_64, slightly modified). The CPU is Broadwell Intel(R) Xeon(R) CPU E5-2618L v4 @ 2.20GHz.
> Kmod info: version is 4.4.36
> What I know is that the parameter PML(Page Modification Logging) is enabled by default.
>      # cat /sys/module/kvm_intel/parameters/pml
>      # Y
>
> Below is the crash message:
> [1548777.924180] kvm: zapping shadow pages for mmio generation wraparound
> [1548777.947844] HTB: quantum of class 10001 is big. Consider r2q change.
> [1548778.185389] kvm: zapping shadow pages for mmio generation wraparound
> [1548778.994801] BUG: unable to handle kernel paging request at 000000007f183000
> [1548779.002135] IP: [<ffffffffa04d9b0a>] __rmap_clear_dirty+0x4a/0xf0 [kvm]
> [1548779.009151] PGD 1f8452067 PUD 0
> [1548779.012774] Thread overran stack, or stack corrupted
> [1548779.018076] Oops: 0000 [#1] SMP
> [1548779.027050] collected_len = 1048570, LOG_BUF_LEN_LOCAL = 1048576
> [1548779.042039] kbox: no notify die func register. no need to notify
> [1548779.048392] do nothing after die!
> [1548779.052071] Modules linked in: kboxdriver(O) kbox(O) sch_htb ip_set_hash_net ip6table_filter ip6_tables iptable_filter igb_uio(OE) uio bridge dm_service_time dm_multipath iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi 8021q garp stp mrp llc vfat fat isofs ext4 jbd2 xfs sha512_generic dev_connlimit(O) bum(O) ip_set nfnetlink prio(O) nat(O) vport_vxlan(O) openvswitch(O) nf_defrag_ipv6 gre ib_uverbs(OVE) hotpatch(OE) sigma_serial(O) pmcint(O) guest_kbox_ram(O) signo_catch(O) mlx4_ib(OVE) mlx4_en(OVE) vxlan ip6_udp_tunnel udp_tunnel ib_sa(OVE) ib_mad(OVE) ptp ib_core(OVE) ib_addr(OVE) pps_core ib_netlink(OVE) intel_powerclamp coretemp intel_rapl crc32_pclmul crc32c_intel ipmi_devintf(OVE) ghash_clmulni_intel mlx4_core(OVE) aesni_intel kvm_intel(O) igb(OVE) lrw gf128mul i2c_algo_bit glue_helper
> [1548779.125577]  iTCO_wdt ablk_helper kvm(O) iTCO_vendor_support sg cryptd compat(OVE) dca sb_edac i2c_i801 kbox_pci(OVE) i2c_core edac_core pcspkr lpc_ich shpchp mfd_core ipmi_si(OVE) ipmi_msghandler(OVE) nf_conntrack_ipv4 nf_defrag_ipv4 vhost_net(O) tun(O) vhost(O) macvtap macvlan vfio_pci irqbypass vfio_iommu_type1 vfio xt_sctp nf_conntrack_proto_sctp nf_nat_proto_sctp nf_nat nf_conntrack sctp libcrc32c ip_tables ext3 mbcache jbd sd_mod crc_t10dif crct10dif_generic ahci crct10dif_pclmul crct10dif_common libahci mpt2sas(OVE) libata raid_class scsi_transport_sas dm_mod [last unloaded: kbox]
> [1548779.178704] CPU: 35 PID: 33769 Comm: migration Tainted: G           OE  ----V-------   3.10.0-327.36.58.10_2.x86_64 #1
> [1548779.189855] Hardware name: HUAWEI TECHNOLOGIES CO.,LTD. CH80GPUB8/CH80GPUB8, BIOS GPUBV201 06/18/2015
> [1548779.199585] task: ffff880123da0000 ti: ffff8801a8068000 task.ti: ffff8801a8068000
> [1548779.207596] RIP: 0010:[<ffffffffa04d9b0a>] [<ffffffffa04d9b0a>] __rmap_clear_dirty+0x4a/0xf0 [kvm]
> [1548779.217209] RSP: 0018:ffff8801a806bba0  EFLAGS: 00010206
> [1548779.222879] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
> [1548779.230543] RDX: ffffc90059d64000 RSI: ffffc90059f2efd0 RDI: 000000007f183000
> [1548779.238211] RBP: ffff8801a806bbc0 R08: ffff8801a806bd60 R09: 0000000000000100
> [1548779.245878] R10: 0000000000003649 R11: 0000000000000006 R12: ffff88027a7f0000
> [1548779.253524] R13: ffffffffa04d9ac0 R14: ffff8801a806bfd8 R15: 0000000000000000
> [1548779.261125] FS:  00007f3dd6aff700(0000) GS:ffff883ffe3c0000(0000) knlGS:0000000000000000
> [1548779.269733] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [1548779.275795] CR2: 000000007f183000 CR3: 00000001f50ab000 CR4: 00000000003427e0
> [1548779.283444] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [1548779.291040] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> [1548779.298689] Stack:
> [1548779.301045]  0000000000000000 0000000000000297 0000000009b989ac 0000000000000000
> [1548779.309033]  ffff8801a806bc48 ffffffffa04d9206 00ff8801a806bd10 ffff8801a806bd60
> [1548779.317023]  0000000000000100 00000000000bffff 0000000100000001 00000000000396fa
> [1548779.325012] Call Trace:
> [1548779.327856]  [<ffffffffa04d9206>] slot_handle_level_range+0x76/0x100 [kvm]
> [1548779.335090]  [<ffffffffa04dac34>] kvm_mmu_slot_leaf_clear_dirty+0x54/0x80 [kvm]
> [1548779.342922]  [<ffffffffa03f8967>] vmx_slot_enable_log_dirty+0x17/0x30 [kvm_intel]
> [1548779.350943]  [<ffffffffa04d5d2f>] kvm_arch_commit_memory_region+0x5f/0xe0 [kvm]
> [1548779.358776]  [<ffffffffa04be7c5>] __kvm_set_memory_region+0x665/0xa90 [kvm]
> [1548779.366099]  [<ffffffffa04be160>] ? kvm_kvzalloc+0x30/0x30 [kvm]
> [1548779.372457]  [<ffffffffa04bec1b>] kvm_set_memory_region+0x2b/0x40 [kvm]
> [1548779.379423]  [<ffffffffa04bf084>] kvm_vm_ioctl+0x454/0x700 [kvm]
> [1548779.385777]  [<ffffffff8112f083>] ? ftrace_ops_list_func+0x83/0x110
> [1548779.392406]  [<ffffffffa04bec30>] ? kvm_set_memory_region+0x40/0x40 [kvm]
> [1548779.399538]  [<ffffffff811fceb5>] do_vfs_ioctl+0x2e5/0x4c0
> [1548779.405340]  [<ffffffff811fd131>] SyS_ioctl+0xa1/0xc0
> [1548779.410736]  [<ffffffff81652809>] system_call_fastpath+0x16/0x1b
> [1548779.417086] Code: c0 48 85 ff 0f 84 8f 00 00 00 40 f6 c7 01 0f 84 a5 00 00 00 48 83 e7 fe c7 45 e8 00 00 00 00 48 89 7d e0 48 8b 3f 48 85 ff 74 6e <48> 8b 17 f6 c2 01 74 5d 48 8b 05 c7 40 04 00 48 89 d1 31 db 48
> [1548779.438015] RIP  [<ffffffffa04d9b0a>] __rmap_clear_dirty+0x4a/0xf0 [kvm]
> [1548779.445109]  RSP <ffff8801a806bba0>
> [1548779.448915] CR2: 000000007f183000
> [1548779.453070] ---[ end trace d99f28b71290c64c ]---
> [1548779.523937] Kernel panic - not syncing: Fatal exception
> [1548779.523945] perf interrupt took too long (124114 > 17857), lowering kernel.perf_event_max_sample_rate to 7000
> [1548779.524826] perf interrupt took too long (123163 > 31250), lowering kernel.perf_event_max_sample_rate to 4000
> [1548779.525717] perf interrupt took too long (122216 > 62500), lowering kernel.perf_event_max_sample_rate to 2000
> [1548779.603824] die even has been record!

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: kvm bug in __rmap_clear_dirty during live migration
  2017-02-10  9:11   ` [Qemu-devel] " Herongguang (Stephen)
@ 2017-02-10 15:28     ` Chris Friesen
  -1 siblings, 0 replies; 39+ messages in thread
From: Chris Friesen @ 2017-02-10 15:28 UTC (permalink / raw)
  To: Herongguang (Stephen), pbonzini, rkrcmar
  Cc: fangying1, wangxin (U), qemu-devel, kvm


Well, not so much solved as worked around it.

It seems that the problem only showed up on Broadwell, which made us wonder
about something hardware-specific.

Setting "kvm-intel.eptad=0" in the kernel boot args seems to mask the problem 
for us.
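
For anyone trying the same workaround, the effect can be sanity-checked after boot from the module parameters (the eptad name is taken from the boot argument above; since PML depends on the EPT A/D bits, disabling eptad should leave PML disabled as well):
     # cat /sys/module/kvm_intel/parameters/eptad
     # N
     # cat /sys/module/kvm_intel/parameters/pml
     # N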

Chris

On 02/10/2017 03:11 AM, Herongguang (Stephen) wrote:
> Hi, Chris Friesen, did you solve the problem?
>
> On 2017/2/9 22:37, Herongguang (Stephen) wrote:
>> Hi.
>> I had a problem when I just repeatedly live migrate a vm between two compute
>> nodes.
>> The phenomenon was that the KVM module was crashed and then the host rebooted.
>> However I cannot reliably trigger this BUG.
>>
>> The backtrace is the same as http://www.spinics.net/lists/kvm/msg138475.html.
>>
>> The crash is triggered when function __rmap_clear_dirty is invoked and an
>> invalid page(0x000000007f183000) is visited.
>> The value %rdi is 0x000000007f183000, which is obviously not a valid kernel
>> pointer for x86_64.
>> The assembly of __rmap_clear_dirty is:
>>      0xffffffffa04d9ac0 <__rmap_clear_dirty>:        callq 0xffffffff816543d0
>> <ftrace_caller>
>>      0xffffffffa04d9ac5 <__rmap_clear_dirty+5>:      push %rbp
>>      0xffffffffa04d9ac6 <__rmap_clear_dirty+6>:      mov %rsp,%rbp
>>      0xffffffffa04d9ac9 <__rmap_clear_dirty+9>:      push %rbx
>>      0xffffffffa04d9aca <__rmap_clear_dirty+10>:     sub $0x18,%rsp
>>      0xffffffffa04d9ace <__rmap_clear_dirty+14>:     mov (%rsi),%rdi
>>      0xffffffffa04d9ad1 <__rmap_clear_dirty+17>:     mov %gs:0x28,%rax
>>      0xffffffffa04d9ada <__rmap_clear_dirty+26>:     mov %rax,-0x10(%rbp)
>>      0xffffffffa04d9ade <__rmap_clear_dirty+30>:     xor %eax,%eax
>>      0xffffffffa04d9ae0 <__rmap_clear_dirty+32>:     test %rdi,%rdi
>>      0xffffffffa04d9ae3 <__rmap_clear_dirty+35>:     je 0xffffffffa04d9b78
>>      0xffffffffa04d9ae9 <__rmap_clear_dirty+41>:     test $0x1,%dil
>>      0xffffffffa04d9aed <__rmap_clear_dirty+45>:     je 0xffffffffa04d9b98
>>      0xffffffffa04d9af3 <__rmap_clear_dirty+51>:     and $0xfffffffffffffffe,%rdi
>>      0xffffffffa04d9af7 <__rmap_clear_dirty+55>:     movl $0x0,-0x18(%rbp)
>>      0xffffffffa04d9afe <__rmap_clear_dirty+62>:     mov %rdi,-0x20(%rbp)
>>      0xffffffffa04d9b02 <__rmap_clear_dirty+66>:     mov (%rdi),%rdi
>>      0xffffffffa04d9b05 <__rmap_clear_dirty+69>:     test %rdi,%rdi
>>                                     ...
>>
>> The details of the environment is:
>> Host info: x86_64 CentOS 7 kernel (3.10.0-327.36.58.10_2.x86_64, slightly
>> modified). The CPU is Broadwell Intel(R) Xeon(R) CPU E5-2618L v4 @ 2.20GHz.
>> Kmod info: version is 4.4.36
>> What I know is that the parameter PML(Page Modification Logging) is enabled by
>> default.
>>      # cat /sys/module/kvm_intel/parameters/pml
>>      # Y
>>
>> Below is the crash message:
>> [1548777.924180] kvm: zapping shadow pages for mmio generation wraparound
>> [1548777.947844] HTB: quantum of class 10001 is big. Consider r2q change.
>> [1548778.185389] kvm: zapping shadow pages for mmio generation wraparound
>> [1548778.994801] BUG: unable to handle kernel paging request at 000000007f183000
>> [1548779.002135] IP: [<ffffffffa04d9b0a>] __rmap_clear_dirty+0x4a/0xf0 [kvm]
>> [1548779.009151] PGD 1f8452067 PUD 0
>> [1548779.012774] Thread overran stack, or stack corrupted
>> [1548779.018076] Oops: 0000 [#1] SMP
>> [1548779.027050] collected_len = 1048570, LOG_BUF_LEN_LOCAL = 1048576
>> [1548779.042039] kbox: no notify die func register. no need to notify
>> [1548779.048392] do nothing after die!
>> [1548779.052071] Modules linked in: kboxdriver(O) kbox(O) sch_htb
>> ip_set_hash_net ip6table_filter ip6_tables iptable_filter igb_uio(OE) uio
>> bridge dm_service_time dm_multipath iscsi_tcp libiscsi_tcp libiscsi
>> scsi_transport_iscsi 8021q garp stp mrp llc vfat fat isofs ext4 jbd2 xfs
>> sha512_generic dev_connlimit(O) bum(O) ip_set nfnetlink prio(O) nat(O)
>> vport_vxlan(O) openvswitch(O) nf_defrag_ipv6 gre ib_uverbs(OVE) hotpatch(OE)
>> sigma_serial(O) pmcint(O) guest_kbox_ram(O) signo_catch(O) mlx4_ib(OVE)
>> mlx4_en(OVE) vxlan ip6_udp_tunnel udp_tunnel ib_sa(OVE) ib_mad(OVE) ptp
>> ib_core(OVE) ib_addr(OVE) pps_core ib_netlink(OVE) intel_powerclamp coretemp
>> intel_rapl crc32_pclmul crc32c_intel ipmi_devintf(OVE) ghash_clmulni_intel
>> mlx4_core(OVE) aesni_intel kvm_intel(O) igb(OVE) lrw gf128mul i2c_algo_bit
>> glue_helper
>> [1548779.125577]  iTCO_wdt ablk_helper kvm(O) iTCO_vendor_support sg cryptd
>> compat(OVE) dca sb_edac i2c_i801 kbox_pci(OVE) i2c_core edac_core pcspkr
>> lpc_ich shpchp mfd_core ipmi_si(OVE) ipmi_msghandler(OVE) nf_conntrack_ipv4
>> nf_defrag_ipv4 vhost_net(O) tun(O) vhost(O) macvtap macvlan vfio_pci irqbypass
>> vfio_iommu_type1 vfio xt_sctp nf_conntrack_proto_sctp nf_nat_proto_sctp nf_nat
>> nf_conntrack sctp libcrc32c ip_tables ext3 mbcache jbd sd_mod crc_t10dif
>> crct10dif_generic ahci crct10dif_pclmul crct10dif_common libahci mpt2sas(OVE)
>> libata raid_class scsi_transport_sas dm_mod [last unloaded: kbox]
>> [1548779.178704] CPU: 35 PID: 33769 Comm: migration Tainted: G           OE
>> ----V-------   3.10.0-327.36.58.10_2.x86_64 #1
>> [1548779.189855] Hardware name: HUAWEI TECHNOLOGIES CO.,LTD.
>> CH80GPUB8/CH80GPUB8, BIOS GPUBV201 06/18/2015
>> [1548779.199585] task: ffff880123da0000 ti: ffff8801a8068000 task.ti:
>> ffff8801a8068000
>> [1548779.207596] RIP: 0010:[<ffffffffa04d9b0a>] [<ffffffffa04d9b0a>]
>> __rmap_clear_dirty+0x4a/0xf0 [kvm]
>> [1548779.217209] RSP: 0018:ffff8801a806bba0  EFLAGS: 00010206
>> [1548779.222879] RAX: 0000000000000000 RBX: 0000000000000000 RCX:
>> 0000000000000000
>> [1548779.230543] RDX: ffffc90059d64000 RSI: ffffc90059f2efd0 RDI:
>> 000000007f183000
>> [1548779.238211] RBP: ffff8801a806bbc0 R08: ffff8801a806bd60 R09:
>> 0000000000000100
>> [1548779.245878] R10: 0000000000003649 R11: 0000000000000006 R12:
>> ffff88027a7f0000
>> [1548779.253524] R13: ffffffffa04d9ac0 R14: ffff8801a806bfd8 R15:
>> 0000000000000000
>> [1548779.261125] FS:  00007f3dd6aff700(0000) GS:ffff883ffe3c0000(0000)
>> knlGS:0000000000000000
>> [1548779.269733] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>> [1548779.275795] CR2: 000000007f183000 CR3: 00000001f50ab000 CR4:
>> 00000000003427e0
>> [1548779.283444] DR0: 0000000000000000 DR1: 0000000000000000 DR2:
>> 0000000000000000
>> [1548779.291040] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7:
>> 0000000000000400
>> [1548779.298689] Stack:
>> [1548779.301045]  0000000000000000 0000000000000297 0000000009b989ac
>> 0000000000000000
>> [1548779.309033]  ffff8801a806bc48 ffffffffa04d9206 00ff8801a806bd10
>> ffff8801a806bd60
>> [1548779.317023]  0000000000000100 00000000000bffff 0000000100000001
>> 00000000000396fa
>> [1548779.325012] Call Trace:
>> [1548779.327856]  [<ffffffffa04d9206>] slot_handle_level_range+0x76/0x100 [kvm]
>> [1548779.335090]  [<ffffffffa04dac34>] kvm_mmu_slot_leaf_clear_dirty+0x54/0x80
>> [kvm]
>> [1548779.342922]  [<ffffffffa03f8967>] vmx_slot_enable_log_dirty+0x17/0x30
>> [kvm_intel]
>> [1548779.350943]  [<ffffffffa04d5d2f>] kvm_arch_commit_memory_region+0x5f/0xe0
>> [kvm]
>> [1548779.358776]  [<ffffffffa04be7c5>] __kvm_set_memory_region+0x665/0xa90 [kvm]
>> [1548779.366099]  [<ffffffffa04be160>] ? kvm_kvzalloc+0x30/0x30 [kvm]
>> [1548779.372457]  [<ffffffffa04bec1b>] kvm_set_memory_region+0x2b/0x40 [kvm]
>> [1548779.379423]  [<ffffffffa04bf084>] kvm_vm_ioctl+0x454/0x700 [kvm]
>> [1548779.385777]  [<ffffffff8112f083>] ? ftrace_ops_list_func+0x83/0x110
>> [1548779.392406]  [<ffffffffa04bec30>] ? kvm_set_memory_region+0x40/0x40 [kvm]
>> [1548779.399538]  [<ffffffff811fceb5>] do_vfs_ioctl+0x2e5/0x4c0
>> [1548779.405340]  [<ffffffff811fd131>] SyS_ioctl+0xa1/0xc0
>> [1548779.410736]  [<ffffffff81652809>] system_call_fastpath+0x16/0x1b
>> [1548779.417086] Code: c0 48 85 ff 0f 84 8f 00 00 00 40 f6 c7 01 0f 84 a5 00
>> 00 00 48 83 e7 fe c7 45 e8 00 00 00 00 48 89 7d e0 48 8b 3f 48 85 ff 74 6e
>> <48> 8b 17 f6 c2 01 74 5d 48 8b 05 c7 40 04 00 48 89 d1 31 db 48
>> [1548779.438015] RIP  [<ffffffffa04d9b0a>] __rmap_clear_dirty+0x4a/0xf0 [kvm]
>> [1548779.445109]  RSP <ffff8801a806bba0>
>> [1548779.448915] CR2: 000000007f183000
>> [1548779.453070] ---[ end trace d99f28b71290c64c ]---
>> [1548779.523937] Kernel panic - not syncing: Fatal exception
>> [1548779.523945] perf interrupt took too long (124114 > 17857), lowering
>> kernel.perf_event_max_sample_rate to 7000
>> [1548779.524826] perf interrupt took too long (123163 > 31250), lowering
>> kernel.perf_event_max_sample_rate to 4000
>> [1548779.525717] perf interrupt took too long (122216 > 62500), lowering
>> kernel.perf_event_max_sample_rate to 2000
>> [1548779.603824] die even has been record!
>

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [Qemu-devel] kvm bug in __rmap_clear_dirty during live migration
@ 2017-02-10 15:28     ` Chris Friesen
  0 siblings, 0 replies; 39+ messages in thread
From: Chris Friesen @ 2017-02-10 15:28 UTC (permalink / raw)
  To: Herongguang (Stephen), pbonzini, rkrcmar
  Cc: kvm, qemu-devel, fangying1, wangxin (U)


Well, not so much solved as worked around it.

It seems that the problem only showed up on Broadwell, which made us wonder 
about something hardware specific.

Setting "kvm-intel.eptad=0" in the kernel boot args seems to mask the problem 
for us.

Chris

On 02/10/2017 03:11 AM, Herongguang (Stephen) wrote:
> Hi, Chris Friesen, did you solve the problem?
>
> On 2017/2/9 22:37, Herongguang (Stephen) wrote:
>> Hi.
>> I had a problem when I just repeatedly live migrate a vm between two compute
>> nodes.
>> The phenomenon was that the KVM module was crashed and then the host rebooted.
>> However I cannot reliably trigger this BUG.
>>
>> The backtrace is the same as http://www.spinics.net/lists/kvm/msg138475.html.
>>
>> The crash is triggered when function __rmap_clear_dirty is invoked and an
>> invalid page(0x000000007f183000) is visited.
>> The value %rdi is 0x000000007f183000, which is obviously not a valid kernel
>> pointer for x86_64.
>> The assembly of __rmap_clear_dirty is:
>>      0xffffffffa04d9ac0 <__rmap_clear_dirty>:        callq 0xffffffff816543d0
>> <ftrace_caller>
>>      0xffffffffa04d9ac5 <__rmap_clear_dirty+5>:      push %rbp
>>      0xffffffffa04d9ac6 <__rmap_clear_dirty+6>:      mov %rsp,%rbp
>>      0xffffffffa04d9ac9 <__rmap_clear_dirty+9>:      push %rbx
>>      0xffffffffa04d9aca <__rmap_clear_dirty+10>:     sub $0x18,%rsp
>>      0xffffffffa04d9ace <__rmap_clear_dirty+14>:     mov (%rsi),%rdi
>>      0xffffffffa04d9ad1 <__rmap_clear_dirty+17>:     mov %gs:0x28,%rax
>>      0xffffffffa04d9ada <__rmap_clear_dirty+26>:     mov %rax,-0x10(%rbp)
>>      0xffffffffa04d9ade <__rmap_clear_dirty+30>:     xor %eax,%eax
>>      0xffffffffa04d9ae0 <__rmap_clear_dirty+32>:     test %rdi,%rdi
>>      0xffffffffa04d9ae3 <__rmap_clear_dirty+35>:     je 0xffffffffa04d9b78
>>      0xffffffffa04d9ae9 <__rmap_clear_dirty+41>:     test $0x1,%dil
>>      0xffffffffa04d9aed <__rmap_clear_dirty+45>:     je 0xffffffffa04d9b98
>>      0xffffffffa04d9af3 <__rmap_clear_dirty+51>:     and $0xfffffffffffffffe,%rdi
>>      0xffffffffa04d9af7 <__rmap_clear_dirty+55>:     movl $0x0,-0x18(%rbp)
>>      0xffffffffa04d9afe <__rmap_clear_dirty+62>:     mov %rdi,-0x20(%rbp)
>>      0xffffffffa04d9b02 <__rmap_clear_dirty+66>:     mov (%rdi),%rdi
>>      0xffffffffa04d9b05 <__rmap_clear_dirty+69>:     test %rdi,%rdi
>>                                     ...
>>
>> The details of the environment is:
>> Host info: x86_64 CentOS 7 kernel (3.10.0-327.36.58.10_2.x86_64, slightly
>> modified). The CPU is Broadwell Intel(R) Xeon(R) CPU E5-2618L v4 @ 2.20GHz.
>> Kmod info: version is 4.4.36
>> What I know is that the parameter PML(Page Modification Logging) is enabled by
>> default.
>>      # cat /sys/module/kvm_intel/parameters/pml
>>      # Y
>>
>> Below is the crash message:
>> [1548777.924180] kvm: zapping shadow pages for mmio generation wraparound
>> [1548777.947844] HTB: quantum of class 10001 is big. Consider r2q change.
>> [1548778.185389] kvm: zapping shadow pages for mmio generation wraparound
>> [1548778.994801] BUG: unable to handle kernel paging request at 000000007f183000
>> [1548779.002135] IP: [<ffffffffa04d9b0a>] __rmap_clear_dirty+0x4a/0xf0 [kvm]
>> [1548779.009151] PGD 1f8452067 PUD 0
>> [1548779.012774] Thread overran stack, or stack corrupted
>> [1548779.018076] Oops: 0000 [#1] SMP
>> [1548779.027050] collected_len = 1048570, LOG_BUF_LEN_LOCAL = 1048576
>> [1548779.042039] kbox: no notify die func register. no need to notify
>> [1548779.048392] do nothing after die!
>> [1548779.052071] Modules linked in: kboxdriver(O) kbox(O) sch_htb
>> ip_set_hash_net ip6table_filter ip6_tables iptable_filter igb_uio(OE) uio
>> bridge dm_service_time dm_multipath iscsi_tcp libiscsi_tcp libiscsi
>> scsi_transport_iscsi 8021q garp stp mrp llc vfat fat isofs ext4 jbd2 xfs
>> sha512_generic dev_connlimit(O) bum(O) ip_set nfnetlink prio(O) nat(O)
>> vport_vxlan(O) openvswitch(O) nf_defrag_ipv6 gre ib_uverbs(OVE) hotpatch(OE)
>> sigma_serial(O) pmcint(O) guest_kbox_ram(O) signo_catch(O) mlx4_ib(OVE)
>> mlx4_en(OVE) vxlan ip6_udp_tunnel udp_tunnel ib_sa(OVE) ib_mad(OVE) ptp
>> ib_core(OVE) ib_addr(OVE) pps_core ib_netlink(OVE) intel_powerclamp coretemp
>> intel_rapl crc32_pclmul crc32c_intel ipmi_devintf(OVE) ghash_clmulni_intel
>> mlx4_core(OVE) aesni_intel kvm_intel(O) igb(OVE) lrw gf128mul i2c_algo_bit
>> glue_helper
>> [1548779.125577]  iTCO_wdt ablk_helper kvm(O) iTCO_vendor_support sg cryptd
>> compat(OVE) dca sb_edac i2c_i801 kbox_pci(OVE) i2c_core edac_core pcspkr
>> lpc_ich shpchp mfd_core ipmi_si(OVE) ipmi_msghandler(OVE) nf_conntrack_ipv4
>> nf_defrag_ipv4 vhost_net(O) tun(O) vhost(O) macvtap macvlan vfio_pci irqbypass
>> vfio_iommu_type1 vfio xt_sctp nf_conntrack_proto_sctp nf_nat_proto_sctp nf_nat
>> nf_conntrack sctp libcrc32c ip_tables ext3 mbcache jbd sd_mod crc_t10dif
>> crct10dif_generic ahci crct10dif_pclmul crct10dif_common libahci mpt2sas(OVE)
>> libata raid_class scsi_transport_sas dm_mod [last unloaded: kbox]
>> [1548779.178704] CPU: 35 PID: 33769 Comm: migration Tainted: G           OE
>> ----V-------   3.10.0-327.36.58.10_2.x86_64 #1
>> [1548779.189855] Hardware name: HUAWEI TECHNOLOGIES CO.,LTD.
>> CH80GPUB8/CH80GPUB8, BIOS GPUBV201 06/18/2015
>> [1548779.199585] task: ffff880123da0000 ti: ffff8801a8068000 task.ti:
>> ffff8801a8068000
>> [1548779.207596] RIP: 0010:[<ffffffffa04d9b0a>] [<ffffffffa04d9b0a>]
>> __rmap_clear_dirty+0x4a/0xf0 [kvm]
>> [1548779.217209] RSP: 0018:ffff8801a806bba0  EFLAGS: 00010206
>> [1548779.222879] RAX: 0000000000000000 RBX: 0000000000000000 RCX:
>> 0000000000000000
>> [1548779.230543] RDX: ffffc90059d64000 RSI: ffffc90059f2efd0 RDI:
>> 000000007f183000
>> [1548779.238211] RBP: ffff8801a806bbc0 R08: ffff8801a806bd60 R09:
>> 0000000000000100
>> [1548779.245878] R10: 0000000000003649 R11: 0000000000000006 R12:
>> ffff88027a7f0000
>> [1548779.253524] R13: ffffffffa04d9ac0 R14: ffff8801a806bfd8 R15:
>> 0000000000000000
>> [1548779.261125] FS:  00007f3dd6aff700(0000) GS:ffff883ffe3c0000(0000)
>> knlGS:0000000000000000
>> [1548779.269733] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>> [1548779.275795] CR2: 000000007f183000 CR3: 00000001f50ab000 CR4:
>> 00000000003427e0
>> [1548779.283444] DR0: 0000000000000000 DR1: 0000000000000000 DR2:
>> 0000000000000000
>> [1548779.291040] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7:
>> 0000000000000400
>> [1548779.298689] Stack:
>> [1548779.301045]  0000000000000000 0000000000000297 0000000009b989ac
>> 0000000000000000
>> [1548779.309033]  ffff8801a806bc48 ffffffffa04d9206 00ff8801a806bd10
>> ffff8801a806bd60
>> [1548779.317023]  0000000000000100 00000000000bffff 0000000100000001
>> 00000000000396fa
>> [1548779.325012] Call Trace:
>> [1548779.327856]  [<ffffffffa04d9206>] slot_handle_level_range+0x76/0x100 [kvm]
>> [1548779.335090]  [<ffffffffa04dac34>] kvm_mmu_slot_leaf_clear_dirty+0x54/0x80
>> [kvm]
>> [1548779.342922]  [<ffffffffa03f8967>] vmx_slot_enable_log_dirty+0x17/0x30
>> [kvm_intel]
>> [1548779.350943]  [<ffffffffa04d5d2f>] kvm_arch_commit_memory_region+0x5f/0xe0
>> [kvm]
>> [1548779.358776]  [<ffffffffa04be7c5>] __kvm_set_memory_region+0x665/0xa90 [kvm]
>> [1548779.366099]  [<ffffffffa04be160>] ? kvm_kvzalloc+0x30/0x30 [kvm]
>> [1548779.372457]  [<ffffffffa04bec1b>] kvm_set_memory_region+0x2b/0x40 [kvm]
>> [1548779.379423]  [<ffffffffa04bf084>] kvm_vm_ioctl+0x454/0x700 [kvm]
>> [1548779.385777]  [<ffffffff8112f083>] ? ftrace_ops_list_func+0x83/0x110
>> [1548779.392406]  [<ffffffffa04bec30>] ? kvm_set_memory_region+0x40/0x40 [kvm]
>> [1548779.399538]  [<ffffffff811fceb5>] do_vfs_ioctl+0x2e5/0x4c0
>> [1548779.405340]  [<ffffffff811fd131>] SyS_ioctl+0xa1/0xc0
>> [1548779.410736]  [<ffffffff81652809>] system_call_fastpath+0x16/0x1b
>> [1548779.417086] Code: c0 48 85 ff 0f 84 8f 00 00 00 40 f6 c7 01 0f 84 a5 00
>> 00 00 48 83 e7 fe c7 45 e8 00 00 00 00 48 89 7d e0 48 8b 3f 48 85 ff 74 6e
>> <48> 8b 17 f6 c2 01 74 5d 48 8b 05 c7 40 04 00 48 89 d1 31 db 48
>> [1548779.438015] RIP  [<ffffffffa04d9b0a>] __rmap_clear_dirty+0x4a/0xf0 [kvm]
>> [1548779.445109]  RSP <ffff8801a806bba0>
>> [1548779.448915] CR2: 000000007f183000
>> [1548779.453070] ---[ end trace d99f28b71290c64c ]---
>> [1548779.523937] Kernel panic - not syncing: Fatal exception
>> [1548779.523945] perf interrupt took too long (124114 > 17857), lowering
>> kernel.perf_event_max_sample_rate to 7000
>> [1548779.524826] perf interrupt took too long (123163 > 31250), lowering
>> kernel.perf_event_max_sample_rate to 4000
>> [1548779.525717] perf interrupt took too long (122216 > 62500), lowering
>> kernel.perf_event_max_sample_rate to 2000
>> [1548779.603824] die even has been record!
>

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: kvm bug in __rmap_clear_dirty during live migration
  2017-02-10 15:28     ` [Qemu-devel] " Chris Friesen
@ 2017-02-10 16:14       ` Paolo Bonzini
  -1 siblings, 0 replies; 39+ messages in thread
From: Paolo Bonzini @ 2017-02-10 16:14 UTC (permalink / raw)
  To: Chris Friesen
  Cc: Herongguang (Stephen), rkrcmar, kvm, qemu-devel, fangying1, wangxin (U)

> Well, not so much solved as worked around it.
> 
> It seems that the problem only showed up on Broadwell, which made us wonder
> about something hardware specific.

Yes, the bug seems to be related to PML.  I am looking at the code,
but I haven't found anything yet.

Paolo

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [Qemu-devel] kvm bug in __rmap_clear_dirty during live migration
  2017-02-10 15:28     ` [Qemu-devel] " Chris Friesen
@ 2017-02-13 13:19       ` hangaohuai
  -1 siblings, 0 replies; 39+ messages in thread
From: hangaohuai @ 2017-02-13 13:19 UTC (permalink / raw)
  To: Chris Friesen
  Cc: herongguang.he, wangxin (U),
	fangying1, qemu-devel, kvm, rkrcmar, pbonzini

Hi,  Chris Friesen.
We notice that you can reliably trigger the BUG during the live-migration stress test, but we cannot.
Could you describe your test steps so that we can re-trigger the BUG and gather more information about it?

On 2017/2/10 23:28, Chris Friesen wrote:
>
> Well, not so much solved as worked around it.
>
> It seems that the problem only showed up on Broadwell, which made us wonder about something hardware specific.
>
> Setting "kvm-intel.eptad=0" in the kernel boot args seems to mask the problem for us.
>
> Chris
>
> On 02/10/2017 03:11 AM, Herongguang (Stephen) wrote:
>> Hi, Chris Friesen, did you solve the problem?
>>
>> On 2017/2/9 22:37, Herongguang (Stephen) wrote:
>>> Hi.
>>> I had a problem when I just repeatedly live migrate a vm between two compute
>>> nodes.
>>> The phenomenon was that the KVM module was crashed and then the host rebooted.
>>> However I cannot reliably trigger this BUG.
>>>
>>> The backtrace is the same as http://www.spinics.net/lists/kvm/msg138475.html.
>>>
>>> The crash is triggered when function __rmap_clear_dirty is invoked and an
>>> invalid page(0x000000007f183000) is visited.
>>> The value %rdi is 0x000000007f183000, which is obviously not a valid kernel
>>> pointer for x86_64.
>>> The assembly of __rmap_clear_dirty is:
>>>      0xffffffffa04d9ac0 <__rmap_clear_dirty>:        callq 0xffffffff816543d0
>>> <ftrace_caller>
>>>      0xffffffffa04d9ac5 <__rmap_clear_dirty+5>:      push %rbp
>>>      0xffffffffa04d9ac6 <__rmap_clear_dirty+6>:      mov %rsp,%rbp
>>>      0xffffffffa04d9ac9 <__rmap_clear_dirty+9>:      push %rbx
>>>      0xffffffffa04d9aca <__rmap_clear_dirty+10>:     sub $0x18,%rsp
>>>      0xffffffffa04d9ace <__rmap_clear_dirty+14>:     mov (%rsi),%rdi
>>>      0xffffffffa04d9ad1 <__rmap_clear_dirty+17>:     mov %gs:0x28,%rax
>>>      0xffffffffa04d9ada <__rmap_clear_dirty+26>:     mov %rax,-0x10(%rbp)
>>>      0xffffffffa04d9ade <__rmap_clear_dirty+30>:     xor %eax,%eax
>>>      0xffffffffa04d9ae0 <__rmap_clear_dirty+32>:     test %rdi,%rdi
>>>      0xffffffffa04d9ae3 <__rmap_clear_dirty+35>:     je 0xffffffffa04d9b78
>>>      0xffffffffa04d9ae9 <__rmap_clear_dirty+41>:     test $0x1,%dil
>>>      0xffffffffa04d9aed <__rmap_clear_dirty+45>:     je 0xffffffffa04d9b98
>>>      0xffffffffa04d9af3 <__rmap_clear_dirty+51>:     and $0xfffffffffffffffe,%rdi
>>>      0xffffffffa04d9af7 <__rmap_clear_dirty+55>:     movl $0x0,-0x18(%rbp)
>>>      0xffffffffa04d9afe <__rmap_clear_dirty+62>:     mov %rdi,-0x20(%rbp)
>>>      0xffffffffa04d9b02 <__rmap_clear_dirty+66>:     mov (%rdi),%rdi
>>>      0xffffffffa04d9b05 <__rmap_clear_dirty+69>:     test %rdi,%rdi
>>>                                     ...
>>>
>>> The details of the environment is:
>>> Host info: x86_64 CentOS 7 kernel (3.10.0-327.36.58.10_2.x86_64, slightly
>>> modified). The CPU is Broadwell Intel(R) Xeon(R) CPU E5-2618L v4 @ 2.20GHz.
>>> Kmod info: version is 4.4.36
>>> What I know is that the parameter PML(Page Modification Logging) is enabled by
>>> default.
>>>      # cat /sys/module/kvm_intel/parameters/pml
>>>      # Y
>>>
>>> Below is the crash message:
>>> [1548777.924180] kvm: zapping shadow pages for mmio generation wraparound
>>> [1548777.947844] HTB: quantum of class 10001 is big. Consider r2q change.
>>> [1548778.185389] kvm: zapping shadow pages for mmio generation wraparound
>>> [1548778.994801] BUG: unable to handle kernel paging request at 000000007f183000
>>> [1548779.002135] IP: [<ffffffffa04d9b0a>] __rmap_clear_dirty+0x4a/0xf0 [kvm]
>>> [1548779.009151] PGD 1f8452067 PUD 0
>>> [1548779.012774] Thread overran stack, or stack corrupted
>>> [1548779.018076] Oops: 0000 [#1] SMP
>>> [1548779.027050] collected_len = 1048570, LOG_BUF_LEN_LOCAL = 1048576
>>> [1548779.042039] kbox: no notify die func register. no need to notify
>>> [1548779.048392] do nothing after die!
>>> [1548779.052071] Modules linked in: kboxdriver(O) kbox(O) sch_htb
>>> ip_set_hash_net ip6table_filter ip6_tables iptable_filter igb_uio(OE) uio
>>> bridge dm_service_time dm_multipath iscsi_tcp libiscsi_tcp libiscsi
>>> scsi_transport_iscsi 8021q garp stp mrp llc vfat fat isofs ext4 jbd2 xfs
>>> sha512_generic dev_connlimit(O) bum(O) ip_set nfnetlink prio(O) nat(O)
>>> vport_vxlan(O) openvswitch(O) nf_defrag_ipv6 gre ib_uverbs(OVE) hotpatch(OE)
>>> sigma_serial(O) pmcint(O) guest_kbox_ram(O) signo_catch(O) mlx4_ib(OVE)
>>> mlx4_en(OVE) vxlan ip6_udp_tunnel udp_tunnel ib_sa(OVE) ib_mad(OVE) ptp
>>> ib_core(OVE) ib_addr(OVE) pps_core ib_netlink(OVE) intel_powerclamp coretemp
>>> intel_rapl crc32_pclmul crc32c_intel ipmi_devintf(OVE) ghash_clmulni_intel
>>> mlx4_core(OVE) aesni_intel kvm_intel(O) igb(OVE) lrw gf128mul i2c_algo_bit
>>> glue_helper
>>> [1548779.125577]  iTCO_wdt ablk_helper kvm(O) iTCO_vendor_support sg cryptd
>>> compat(OVE) dca sb_edac i2c_i801 kbox_pci(OVE) i2c_core edac_core pcspkr
>>> lpc_ich shpchp mfd_core ipmi_si(OVE) ipmi_msghandler(OVE) nf_conntrack_ipv4
>>> nf_defrag_ipv4 vhost_net(O) tun(O) vhost(O) macvtap macvlan vfio_pci irqbypass
>>> vfio_iommu_type1 vfio xt_sctp nf_conntrack_proto_sctp nf_nat_proto_sctp nf_nat
>>> nf_conntrack sctp libcrc32c ip_tables ext3 mbcache jbd sd_mod crc_t10dif
>>> crct10dif_generic ahci crct10dif_pclmul crct10dif_common libahci mpt2sas(OVE)
>>> libata raid_class scsi_transport_sas dm_mod [last unloaded: kbox]
>>> [1548779.178704] CPU: 35 PID: 33769 Comm: migration Tainted: G           OE
>>> ----V-------   3.10.0-327.36.58.10_2.x86_64 #1
>>> [1548779.189855] Hardware name: HUAWEI TECHNOLOGIES CO.,LTD.
>>> CH80GPUB8/CH80GPUB8, BIOS GPUBV201 06/18/2015
>>> [1548779.199585] task: ffff880123da0000 ti: ffff8801a8068000 task.ti:
>>> ffff8801a8068000
>>> [1548779.207596] RIP: 0010:[<ffffffffa04d9b0a>] [<ffffffffa04d9b0a>]
>>> __rmap_clear_dirty+0x4a/0xf0 [kvm]
>>> [1548779.217209] RSP: 0018:ffff8801a806bba0  EFLAGS: 00010206
>>> [1548779.222879] RAX: 0000000000000000 RBX: 0000000000000000 RCX:
>>> 0000000000000000
>>> [1548779.230543] RDX: ffffc90059d64000 RSI: ffffc90059f2efd0 RDI:
>>> 000000007f183000
>>> [1548779.238211] RBP: ffff8801a806bbc0 R08: ffff8801a806bd60 R09:
>>> 0000000000000100
>>> [1548779.245878] R10: 0000000000003649 R11: 0000000000000006 R12:
>>> ffff88027a7f0000
>>> [1548779.253524] R13: ffffffffa04d9ac0 R14: ffff8801a806bfd8 R15:
>>> 0000000000000000
>>> [1548779.261125] FS:  00007f3dd6aff700(0000) GS:ffff883ffe3c0000(0000)
>>> knlGS:0000000000000000
>>> [1548779.269733] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>>> [1548779.275795] CR2: 000000007f183000 CR3: 00000001f50ab000 CR4:
>>> 00000000003427e0
>>> [1548779.283444] DR0: 0000000000000000 DR1: 0000000000000000 DR2:
>>> 0000000000000000
>>> [1548779.291040] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7:
>>> 0000000000000400
>>> [1548779.298689] Stack:
>>> [1548779.301045]  0000000000000000 0000000000000297 0000000009b989ac
>>> 0000000000000000
>>> [1548779.309033]  ffff8801a806bc48 ffffffffa04d9206 00ff8801a806bd10
>>> ffff8801a806bd60
>>> [1548779.317023]  0000000000000100 00000000000bffff 0000000100000001
>>> 00000000000396fa
>>> [1548779.325012] Call Trace:
>>> [1548779.327856]  [<ffffffffa04d9206>] slot_handle_level_range+0x76/0x100 [kvm]
>>> [1548779.335090]  [<ffffffffa04dac34>] kvm_mmu_slot_leaf_clear_dirty+0x54/0x80
>>> [kvm]
>>> [1548779.342922]  [<ffffffffa03f8967>] vmx_slot_enable_log_dirty+0x17/0x30
>>> [kvm_intel]
>>> [1548779.350943]  [<ffffffffa04d5d2f>] kvm_arch_commit_memory_region+0x5f/0xe0
>>> [kvm]
>>> [1548779.358776]  [<ffffffffa04be7c5>] __kvm_set_memory_region+0x665/0xa90 [kvm]
>>> [1548779.366099]  [<ffffffffa04be160>] ? kvm_kvzalloc+0x30/0x30 [kvm]
>>> [1548779.372457]  [<ffffffffa04bec1b>] kvm_set_memory_region+0x2b/0x40 [kvm]
>>> [1548779.379423]  [<ffffffffa04bf084>] kvm_vm_ioctl+0x454/0x700 [kvm]
>>> [1548779.385777]  [<ffffffff8112f083>] ? ftrace_ops_list_func+0x83/0x110
>>> [1548779.392406]  [<ffffffffa04bec30>] ? kvm_set_memory_region+0x40/0x40 [kvm]
>>> [1548779.399538]  [<ffffffff811fceb5>] do_vfs_ioctl+0x2e5/0x4c0
>>> [1548779.405340]  [<ffffffff811fd131>] SyS_ioctl+0xa1/0xc0
>>> [1548779.410736]  [<ffffffff81652809>] system_call_fastpath+0x16/0x1b
>>> [1548779.417086] Code: c0 48 85 ff 0f 84 8f 00 00 00 40 f6 c7 01 0f 84 a5 00
>>> 00 00 48 83 e7 fe c7 45 e8 00 00 00 00 48 89 7d e0 48 8b 3f 48 85 ff 74 6e
>>> <48> 8b 17 f6 c2 01 74 5d 48 8b 05 c7 40 04 00 48 89 d1 31 db 48
>>> [1548779.438015] RIP  [<ffffffffa04d9b0a>] __rmap_clear_dirty+0x4a/0xf0 [kvm]
>>> [1548779.445109]  RSP <ffff8801a806bba0>
>>> [1548779.448915] CR2: 000000007f183000
>>> [1548779.453070] ---[ end trace d99f28b71290c64c ]---
>>> [1548779.523937] Kernel panic - not syncing: Fatal exception
>>> [1548779.523945] perf interrupt took too long (124114 > 17857), lowering
>>> kernel.perf_event_max_sample_rate to 7000
>>> [1548779.524826] perf interrupt took too long (123163 > 31250), lowering
>>> kernel.perf_event_max_sample_rate to 4000
>>> [1548779.525717] perf interrupt took too long (122216 > 62500), lowering
>>> kernel.perf_event_max_sample_rate to 2000
>>> [1548779.603824] die even has been record!
>>
>
>
>
> .
>

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: kvm bug in __rmap_clear_dirty during live migration
  2017-02-10 15:28     ` [Qemu-devel] " Chris Friesen
@ 2017-02-20  2:17       ` hangaohuai
  -1 siblings, 0 replies; 39+ messages in thread
From: hangaohuai @ 2017-02-20  2:17 UTC (permalink / raw)
  To: kai.huang, guangrong.xiao, xudong.hao
  Cc: kvm, rkrcmar, wangxin (U),
	qemu-devel, pbonzini, fangying1, Chris Friesen,
	Herongguang (Stephen)

Hi, Kai Huang and Xiao Guangrong.

For the problem mentioned above, there may be a bug related to PML, probably specific to Broadwell CPUs.

I've been reading the PML code for days, but I haven't found any clues. Do you have any idea about this BUG?

Hope you can help!
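
As background for why this only bites with PML, here is a rough, self-contained C sketch of the decision made when dirty logging is enabled on a memslot. The names and types are simplified stand-ins modelled on the call trace quoted below, not the real kvm code:

#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-ins for kvm structures (assumptions for illustration). */
struct kvm_rmap_head { unsigned long val; };
struct kvm_memory_slot {
	struct kvm_rmap_head *rmap;   /* one head per 4K guest page */
	size_t npages;
};

typedef bool (*rmap_handler)(struct kvm_rmap_head *head);

/* Corresponds to slot_handle_level_range() in the call trace: apply a
 * handler to the rmap head of every page in the slot. */
static void slot_handle_leaf(struct kvm_memory_slot *slot, rmap_handler fn)
{
	for (size_t i = 0; i < slot->npages; i++)
		fn(&slot->rmap[i]);
}

/* Stubs: the real handlers clear the EPT dirty bit or the write bit by
 * walking the rmap chain behind each head. */
static bool rmap_clear_dirty(struct kvm_rmap_head *head)   { return head->val != 0; }
static bool rmap_write_protect(struct kvm_rmap_head *head) { return head->val != 0; }

/* What kvm_arch_commit_memory_region() effectively decides when dirty
 * logging is turned on for a slot at the start of live migration. */
static void enable_dirty_logging(struct kvm_memory_slot *slot, bool have_pml)
{
	if (have_pml)
		/* PML (Broadwell+): vmx_slot_enable_log_dirty ->
		 * kvm_mmu_slot_leaf_clear_dirty -> __rmap_clear_dirty,
		 * the frame that faults in the oops. */
		slot_handle_leaf(slot, rmap_clear_dirty);
	else
		/* Without PML the slot is write-protected instead, so
		 * __rmap_clear_dirty is never reached on this path. */
		slot_handle_leaf(slot, rmap_write_protect);
}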

On 2017/2/10 23:28, Chris Friesen wrote:
>
> Well, not so much solved as worked around it.
>
> It seems that the problem only showed up on Broadwell, which made us wonder about something hardware specific.
>
> Setting "kvm-intel.eptad=0" in the kernel boot args seems to mask the problem for us.
>
> Chris
>
> On 02/10/2017 03:11 AM, Herongguang (Stephen) wrote:
>> Hi, Chris Friesen, did you solve the problem?
>>
>> On 2017/2/9 22:37, Herongguang (Stephen) wrote:
>>> Hi.
>>> I had a problem when I just repeatedly live migrate a vm between two compute
>>> nodes.
>>> The phenomenon was that the KVM module was crashed and then the host rebooted.
>>> However I cannot reliably trigger this BUG.
>>>
>>> The backtrace is the same as http://www.spinics.net/lists/kvm/msg138475.html.
>>>
>>> The crash is triggered when function __rmap_clear_dirty is invoked and an
>>> invalid page(0x000000007f183000) is visited.
>>> The value %rdi is 0x000000007f183000, which is obviously not a valid kernel
>>> pointer for x86_64.
>>> The assembly of __rmap_clear_dirty is:
>>>      0xffffffffa04d9ac0 <__rmap_clear_dirty>:        callq 0xffffffff816543d0
>>> <ftrace_caller>
>>>      0xffffffffa04d9ac5 <__rmap_clear_dirty+5>:      push %rbp
>>>      0xffffffffa04d9ac6 <__rmap_clear_dirty+6>:      mov %rsp,%rbp
>>>      0xffffffffa04d9ac9 <__rmap_clear_dirty+9>:      push %rbx
>>>      0xffffffffa04d9aca <__rmap_clear_dirty+10>:     sub $0x18,%rsp
>>>      0xffffffffa04d9ace <__rmap_clear_dirty+14>:     mov (%rsi),%rdi
>>>      0xffffffffa04d9ad1 <__rmap_clear_dirty+17>:     mov %gs:0x28,%rax
>>>      0xffffffffa04d9ada <__rmap_clear_dirty+26>:     mov %rax,-0x10(%rbp)
>>>      0xffffffffa04d9ade <__rmap_clear_dirty+30>:     xor %eax,%eax
>>>      0xffffffffa04d9ae0 <__rmap_clear_dirty+32>:     test %rdi,%rdi
>>>      0xffffffffa04d9ae3 <__rmap_clear_dirty+35>:     je 0xffffffffa04d9b78
>>>      0xffffffffa04d9ae9 <__rmap_clear_dirty+41>:     test $0x1,%dil
>>>      0xffffffffa04d9aed <__rmap_clear_dirty+45>:     je 0xffffffffa04d9b98
>>>      0xffffffffa04d9af3 <__rmap_clear_dirty+51>:     and $0xfffffffffffffffe,%rdi
>>>      0xffffffffa04d9af7 <__rmap_clear_dirty+55>:     movl $0x0,-0x18(%rbp)
>>>      0xffffffffa04d9afe <__rmap_clear_dirty+62>:     mov %rdi,-0x20(%rbp)
>>>      0xffffffffa04d9b02 <__rmap_clear_dirty+66>:     mov (%rdi),%rdi
>>>      0xffffffffa04d9b05 <__rmap_clear_dirty+69>:     test %rdi,%rdi
>>>                                     ...
>>>
>>> The details of the environment is:
>>> Host info: x86_64 CentOS 7 kernel (3.10.0-327.36.58.10_2.x86_64, slightly
>>> modified). The CPU is Broadwell Intel(R) Xeon(R) CPU E5-2618L v4 @ 2.20GHz.
>>> Kmod info: version is 4.4.36
>>> What I know is that the parameter PML(Page Modification Logging) is enabled by
>>> default.
>>>      # cat /sys/module/kvm_intel/parameters/pml
>>>      # Y
>>>
>>> Below is the crash message:
>>> [1548777.924180] kvm: zapping shadow pages for mmio generation wraparound
>>> [1548777.947844] HTB: quantum of class 10001 is big. Consider r2q change.
>>> [1548778.185389] kvm: zapping shadow pages for mmio generation wraparound
>>> [1548778.994801] BUG: unable to handle kernel paging request at 000000007f183000
>>> [1548779.002135] IP: [<ffffffffa04d9b0a>] __rmap_clear_dirty+0x4a/0xf0 [kvm]
>>> [1548779.009151] PGD 1f8452067 PUD 0
>>> [1548779.012774] Thread overran stack, or stack corrupted
>>> [1548779.018076] Oops: 0000 [#1] SMP
>>> [1548779.027050] collected_len = 1048570, LOG_BUF_LEN_LOCAL = 1048576
>>> [1548779.042039] kbox: no notify die func register. no need to notify
>>> [1548779.048392] do nothing after die!
>>> [1548779.052071] Modules linked in: kboxdriver(O) kbox(O) sch_htb
>>> ip_set_hash_net ip6table_filter ip6_tables iptable_filter igb_uio(OE) uio
>>> bridge dm_service_time dm_multipath iscsi_tcp libiscsi_tcp libiscsi
>>> scsi_transport_iscsi 8021q garp stp mrp llc vfat fat isofs ext4 jbd2 xfs
>>> sha512_generic dev_connlimit(O) bum(O) ip_set nfnetlink prio(O) nat(O)
>>> vport_vxlan(O) openvswitch(O) nf_defrag_ipv6 gre ib_uverbs(OVE) hotpatch(OE)
>>> sigma_serial(O) pmcint(O) guest_kbox_ram(O) signo_catch(O) mlx4_ib(OVE)
>>> mlx4_en(OVE) vxlan ip6_udp_tunnel udp_tunnel ib_sa(OVE) ib_mad(OVE) ptp
>>> ib_core(OVE) ib_addr(OVE) pps_core ib_netlink(OVE) intel_powerclamp coretemp
>>> intel_rapl crc32_pclmul crc32c_intel ipmi_devintf(OVE) ghash_clmulni_intel
>>> mlx4_core(OVE) aesni_intel kvm_intel(O) igb(OVE) lrw gf128mul i2c_algo_bit
>>> glue_helper
>>> [1548779.125577]  iTCO_wdt ablk_helper kvm(O) iTCO_vendor_support sg cryptd
>>> compat(OVE) dca sb_edac i2c_i801 kbox_pci(OVE) i2c_core edac_core pcspkr
>>> lpc_ich shpchp mfd_core ipmi_si(OVE) ipmi_msghandler(OVE) nf_conntrack_ipv4
>>> nf_defrag_ipv4 vhost_net(O) tun(O) vhost(O) macvtap macvlan vfio_pci irqbypass
>>> vfio_iommu_type1 vfio xt_sctp nf_conntrack_proto_sctp nf_nat_proto_sctp nf_nat
>>> nf_conntrack sctp libcrc32c ip_tables ext3 mbcache jbd sd_mod crc_t10dif
>>> crct10dif_generic ahci crct10dif_pclmul crct10dif_common libahci mpt2sas(OVE)
>>> libata raid_class scsi_transport_sas dm_mod [last unloaded: kbox]
>>> [1548779.178704] CPU: 35 PID: 33769 Comm: migration Tainted: G           OE
>>> ----V-------   3.10.0-327.36.58.10_2.x86_64 #1
>>> [1548779.189855] Hardware name: HUAWEI TECHNOLOGIES CO.,LTD.
>>> CH80GPUB8/CH80GPUB8, BIOS GPUBV201 06/18/2015
>>> [1548779.199585] task: ffff880123da0000 ti: ffff8801a8068000 task.ti:
>>> ffff8801a8068000
>>> [1548779.207596] RIP: 0010:[<ffffffffa04d9b0a>] [<ffffffffa04d9b0a>]
>>> __rmap_clear_dirty+0x4a/0xf0 [kvm]
>>> [1548779.217209] RSP: 0018:ffff8801a806bba0  EFLAGS: 00010206
>>> [1548779.222879] RAX: 0000000000000000 RBX: 0000000000000000 RCX:
>>> 0000000000000000
>>> [1548779.230543] RDX: ffffc90059d64000 RSI: ffffc90059f2efd0 RDI:
>>> 000000007f183000
>>> [1548779.238211] RBP: ffff8801a806bbc0 R08: ffff8801a806bd60 R09:
>>> 0000000000000100
>>> [1548779.245878] R10: 0000000000003649 R11: 0000000000000006 R12:
>>> ffff88027a7f0000
>>> [1548779.253524] R13: ffffffffa04d9ac0 R14: ffff8801a806bfd8 R15:
>>> 0000000000000000
>>> [1548779.261125] FS:  00007f3dd6aff700(0000) GS:ffff883ffe3c0000(0000)
>>> knlGS:0000000000000000
>>> [1548779.269733] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>>> [1548779.275795] CR2: 000000007f183000 CR3: 00000001f50ab000 CR4:
>>> 00000000003427e0
>>> [1548779.283444] DR0: 0000000000000000 DR1: 0000000000000000 DR2:
>>> 0000000000000000
>>> [1548779.291040] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7:
>>> 0000000000000400
>>> [1548779.298689] Stack:
>>> [1548779.301045]  0000000000000000 0000000000000297 0000000009b989ac
>>> 0000000000000000
>>> [1548779.309033]  ffff8801a806bc48 ffffffffa04d9206 00ff8801a806bd10
>>> ffff8801a806bd60
>>> [1548779.317023]  0000000000000100 00000000000bffff 0000000100000001
>>> 00000000000396fa
>>> [1548779.325012] Call Trace:
>>> [1548779.327856]  [<ffffffffa04d9206>] slot_handle_level_range+0x76/0x100 [kvm]
>>> [1548779.335090]  [<ffffffffa04dac34>] kvm_mmu_slot_leaf_clear_dirty+0x54/0x80
>>> [kvm]
>>> [1548779.342922]  [<ffffffffa03f8967>] vmx_slot_enable_log_dirty+0x17/0x30
>>> [kvm_intel]
>>> [1548779.350943]  [<ffffffffa04d5d2f>] kvm_arch_commit_memory_region+0x5f/0xe0
>>> [kvm]
>>> [1548779.358776]  [<ffffffffa04be7c5>] __kvm_set_memory_region+0x665/0xa90 [kvm]
>>> [1548779.366099]  [<ffffffffa04be160>] ? kvm_kvzalloc+0x30/0x30 [kvm]
>>> [1548779.372457]  [<ffffffffa04bec1b>] kvm_set_memory_region+0x2b/0x40 [kvm]
>>> [1548779.379423]  [<ffffffffa04bf084>] kvm_vm_ioctl+0x454/0x700 [kvm]
>>> [1548779.385777]  [<ffffffff8112f083>] ? ftrace_ops_list_func+0x83/0x110
>>> [1548779.392406]  [<ffffffffa04bec30>] ? kvm_set_memory_region+0x40/0x40 [kvm]
>>> [1548779.399538]  [<ffffffff811fceb5>] do_vfs_ioctl+0x2e5/0x4c0
>>> [1548779.405340]  [<ffffffff811fd131>] SyS_ioctl+0xa1/0xc0
>>> [1548779.410736]  [<ffffffff81652809>] system_call_fastpath+0x16/0x1b
>>> [1548779.417086] Code: c0 48 85 ff 0f 84 8f 00 00 00 40 f6 c7 01 0f 84 a5 00
>>> 00 00 48 83 e7 fe c7 45 e8 00 00 00 00 48 89 7d e0 48 8b 3f 48 85 ff 74 6e
>>> <48> 8b 17 f6 c2 01 74 5d 48 8b 05 c7 40 04 00 48 89 d1 31 db 48
>>> [1548779.438015] RIP  [<ffffffffa04d9b0a>] __rmap_clear_dirty+0x4a/0xf0 [kvm]
>>> [1548779.445109]  RSP <ffff8801a806bba0>
>>> [1548779.448915] CR2: 000000007f183000
>>> [1548779.453070] ---[ end trace d99f28b71290c64c ]---
>>> [1548779.523937] Kernel panic - not syncing: Fatal exception
>>> [1548779.523945] perf interrupt took too long (124114 > 17857), lowering
>>> kernel.perf_event_max_sample_rate to 7000
>>> [1548779.524826] perf interrupt took too long (123163 > 31250), lowering
>>> kernel.perf_event_max_sample_rate to 4000
>>> [1548779.525717] perf interrupt took too long (122216 > 62500), lowering
>>> kernel.perf_event_max_sample_rate to 2000
>>> [1548779.603824] die even has been record!
>>
>
>
>
> .
>

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [Qemu-devel] kvm bug in __rmap_clear_dirty during live migration
  2017-02-20  2:17       ` [Qemu-devel] " hangaohuai
  (?)
@ 2017-02-20  4:38       ` Han, Huaitong
  2017-02-22  3:08         ` Chris Friesen
  -1 siblings, 1 reply; 39+ messages in thread
From: Han, Huaitong @ 2017-02-20  4:38 UTC (permalink / raw)
  To: hangaohuai
  Cc: kvm, fangying1, herongguang.he, Friesen, Chris B (Wind River),
	xudong.hao, qemu-devel, pbonzini, wangxinxin.wang, kai.huang,
	rkrcmar, guangrong.xiao

Hi, Gaohuai

I tried to debug the problem, and I found that the indirect cause may be that
the rmap value is not cleared when the KVM mmu page is freed. I have read
through the code but have not found the root cause yet. Can you reliably
reproduce the issue? Many guesses need to be verified.
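
To make that hypothesis concrete, here is a simplified, self-contained sketch of the rmap walk that __rmap_clear_dirty performs. The encoding, type names and the dirty-bit constant are approximations of the 4.4-era mmu.c, not the real sources; if a freed shadow page leaves a stale value behind in this structure, one of the pointer loads below dereferences garbage such as the 0x7f183000 seen in CR2/RDI:

#include <stdbool.h>
#include <stdint.h>
#include <stddef.h>

/* Assumed rmap encoding, one head per guest page frame:
 *   val == 0                -> no sptes map this gfn
 *   low bit clear, nonzero  -> val itself is a single spte pointer (u64 *)
 *   low bit set             -> (val & ~1UL) points to a pte_list_desc chain
 */
struct pte_list_desc {
	uint64_t *sptes[3];
	struct pte_list_desc *more;
};

struct kvm_rmap_head {
	unsigned long val;
};

#define EPT_DIRTY_BIT (1ULL << 9)	/* assumed D-bit position in an EPT leaf */

static bool rmap_clear_dirty_sketch(struct kvm_rmap_head *rmap_head)
{
	bool flush = false;

	if (!rmap_head->val)
		return false;

	if (!(rmap_head->val & 1)) {
		/* Single mapping: the value is the spte pointer itself. */
		uint64_t *sptep = (uint64_t *)rmap_head->val;
		flush = *sptep & EPT_DIRTY_BIT;
		*sptep &= ~EPT_DIRTY_BIT;
		return flush;
	}

	/* Multiple mappings: strip the tag bit ("and $0xfffffffffffffffe,%rdi"
	 * in the disassembly above) and walk the descriptor chain.  Every desc
	 * and every desc->sptes[i] is trusted to be a valid kernel pointer; if
	 * a freed shadow page left a stale entry here, the loads below touch
	 * an address like 0x7f183000 and trigger the paging-request BUG. */
	for (struct pte_list_desc *desc =
			(struct pte_list_desc *)(rmap_head->val & ~1UL);
	     desc != NULL; desc = desc->more) {
		for (int i = 0; i < 3 && desc->sptes[i]; i++) {
			flush |= *desc->sptes[i] & EPT_DIRTY_BIT;
			*desc->sptes[i] &= ~EPT_DIRTY_BIT;
		}
	}

	return flush;
}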


On Mon, 2017-02-20 at 10:17 +0800, hangaohuai wrote:
> Hi, Kai Huang and Xiao Guangrong.
> 
> For the problem mentioned above, there may be a bug related to PML and probably on Broadwell CPUs.
> 
> I've been reading the code for PML for days, but I haven't found any clews. Do you have any idea about this BUG ?
> 
> Hope you can help!
> 
> On 2017/2/10 23:28, Chris Friesen wrote:
> >
> > Well, not so much solved as worked around it.
> >
> > It seems that the problem only showed up on Broadwell, which made us wonder about something hardware specific.
> >
> > Setting "kvm-intel.eptad=0" in the kernel boot args seems to mask the problem for us.
> >
> > Chris
> >
> > On 02/10/2017 03:11 AM, Herongguang (Stephen) wrote:
> >> Hi, Chris Friesen, did you solve the problem?
> >>
> >> On 2017/2/9 22:37, Herongguang (Stephen) wrote:
> >>> Hi.
> >>> I had a problem when I just repeatedly live migrate a vm between two compute
> >>> nodes.
> >>> The phenomenon was that the KVM module was crashed and then the host rebooted.
> >>> However I cannot reliably trigger this BUG.
> >>>
> >>> The backtrace is the same as http://www.spinics.net/lists/kvm/msg138475.html.
> >>>
> >>> The crash is triggered when function __rmap_clear_dirty is invoked and an
> >>> invalid page(0x000000007f183000) is visited.
> >>> The value %rdi is 0x000000007f183000, which is obviously not a valid kernel
> >>> pointer for x86_64.
> >>> The assembly of __rmap_clear_dirty is:
> >>>      0xffffffffa04d9ac0 <__rmap_clear_dirty>:        callq 0xffffffff816543d0
> >>> <ftrace_caller>
> >>>      0xffffffffa04d9ac5 <__rmap_clear_dirty+5>:      push %rbp
> >>>      0xffffffffa04d9ac6 <__rmap_clear_dirty+6>:      mov %rsp,%rbp
> >>>      0xffffffffa04d9ac9 <__rmap_clear_dirty+9>:      push %rbx
> >>>      0xffffffffa04d9aca <__rmap_clear_dirty+10>:     sub $0x18,%rsp
> >>>      0xffffffffa04d9ace <__rmap_clear_dirty+14>:     mov (%rsi),%rdi
> >>>      0xffffffffa04d9ad1 <__rmap_clear_dirty+17>:     mov %gs:0x28,%rax
> >>>      0xffffffffa04d9ada <__rmap_clear_dirty+26>:     mov %rax,-0x10(%rbp)
> >>>      0xffffffffa04d9ade <__rmap_clear_dirty+30>:     xor %eax,%eax
> >>>      0xffffffffa04d9ae0 <__rmap_clear_dirty+32>:     test %rdi,%rdi
> >>>      0xffffffffa04d9ae3 <__rmap_clear_dirty+35>:     je 0xffffffffa04d9b78
> >>>      0xffffffffa04d9ae9 <__rmap_clear_dirty+41>:     test $0x1,%dil
> >>>      0xffffffffa04d9aed <__rmap_clear_dirty+45>:     je 0xffffffffa04d9b98
> >>>      0xffffffffa04d9af3 <__rmap_clear_dirty+51>:     and $0xfffffffffffffffe,%rdi
> >>>      0xffffffffa04d9af7 <__rmap_clear_dirty+55>:     movl $0x0,-0x18(%rbp)
> >>>      0xffffffffa04d9afe <__rmap_clear_dirty+62>:     mov %rdi,-0x20(%rbp)
> >>>      0xffffffffa04d9b02 <__rmap_clear_dirty+66>:     mov (%rdi),%rdi
> >>>      0xffffffffa04d9b05 <__rmap_clear_dirty+69>:     test %rdi,%rdi
> >>>                                     ...
> >>>
> >>> The details of the environment is:
> >>> Host info: x86_64 CentOS 7 kernel (3.10.0-327.36.58.10_2.x86_64, slightly
> >>> modified). The CPU is Broadwell Intel(R) Xeon(R) CPU E5-2618L v4 @ 2.20GHz.
> >>> Kmod info: version is 4.4.36
> >>> What I know is that the parameter PML(Page Modification Logging) is enabled by
> >>> default.
> >>>      # cat /sys/module/kvm_intel/parameters/pml
> >>>      # Y
> >>>
> >>> Below is the crash message:
> >>> [1548777.924180] kvm: zapping shadow pages for mmio generation wraparound
> >>> [1548777.947844] HTB: quantum of class 10001 is big. Consider r2q change.
> >>> [1548778.185389] kvm: zapping shadow pages for mmio generation wraparound
> >>> [1548778.994801] BUG: unable to handle kernel paging request at 000000007f183000
> >>> [1548779.002135] IP: [<ffffffffa04d9b0a>] __rmap_clear_dirty+0x4a/0xf0 [kvm]
> >>> [1548779.009151] PGD 1f8452067 PUD 0
> >>> [1548779.012774] Thread overran stack, or stack corrupted
> >>> [1548779.018076] Oops: 0000 [#1] SMP
> >>> [1548779.027050] collected_len = 1048570, LOG_BUF_LEN_LOCAL = 1048576
> >>> [1548779.042039] kbox: no notify die func register. no need to notify
> >>> [1548779.048392] do nothing after die!
> >>> [1548779.052071] Modules linked in: kboxdriver(O) kbox(O) sch_htb
> >>> ip_set_hash_net ip6table_filter ip6_tables iptable_filter igb_uio(OE) uio
> >>> bridge dm_service_time dm_multipath iscsi_tcp libiscsi_tcp libiscsi
> >>> scsi_transport_iscsi 8021q garp stp mrp llc vfat fat isofs ext4 jbd2 xfs
> >>> sha512_generic dev_connlimit(O) bum(O) ip_set nfnetlink prio(O) nat(O)
> >>> vport_vxlan(O) openvswitch(O) nf_defrag_ipv6 gre ib_uverbs(OVE) hotpatch(OE)
> >>> sigma_serial(O) pmcint(O) guest_kbox_ram(O) signo_catch(O) mlx4_ib(OVE)
> >>> mlx4_en(OVE) vxlan ip6_udp_tunnel udp_tunnel ib_sa(OVE) ib_mad(OVE) ptp
> >>> ib_core(OVE) ib_addr(OVE) pps_core ib_netlink(OVE) intel_powerclamp coretemp
> >>> intel_rapl crc32_pclmul crc32c_intel ipmi_devintf(OVE) ghash_clmulni_intel
> >>> mlx4_core(OVE) aesni_intel kvm_intel(O) igb(OVE) lrw gf128mul i2c_algo_bit
> >>> glue_helper
> >>> [1548779.125577]  iTCO_wdt ablk_helper kvm(O) iTCO_vendor_support sg cryptd
> >>> compat(OVE) dca sb_edac i2c_i801 kbox_pci(OVE) i2c_core edac_core pcspkr
> >>> lpc_ich shpchp mfd_core ipmi_si(OVE) ipmi_msghandler(OVE) nf_conntrack_ipv4
> >>> nf_defrag_ipv4 vhost_net(O) tun(O) vhost(O) macvtap macvlan vfio_pci irqbypass
> >>> vfio_iommu_type1 vfio xt_sctp nf_conntrack_proto_sctp nf_nat_proto_sctp nf_nat
> >>> nf_conntrack sctp libcrc32c ip_tables ext3 mbcache jbd sd_mod crc_t10dif
> >>> crct10dif_generic ahci crct10dif_pclmul crct10dif_common libahci mpt2sas(OVE)
> >>> libata raid_class scsi_transport_sas dm_mod [last unloaded: kbox]
> >>> [1548779.178704] CPU: 35 PID: 33769 Comm: migration Tainted: G           OE
> >>> ----V-------   3.10.0-327.36.58.10_2.x86_64 #1
> >>> [1548779.189855] Hardware name: HUAWEI TECHNOLOGIES CO.,LTD.
> >>> CH80GPUB8/CH80GPUB8, BIOS GPUBV201 06/18/2015
> >>> [1548779.199585] task: ffff880123da0000 ti: ffff8801a8068000 task.ti:
> >>> ffff8801a8068000
> >>> [1548779.207596] RIP: 0010:[<ffffffffa04d9b0a>] [<ffffffffa04d9b0a>]
> >>> __rmap_clear_dirty+0x4a/0xf0 [kvm]
> >>> [1548779.217209] RSP: 0018:ffff8801a806bba0  EFLAGS: 00010206
> >>> [1548779.222879] RAX: 0000000000000000 RBX: 0000000000000000 RCX:
> >>> 0000000000000000
> >>> [1548779.230543] RDX: ffffc90059d64000 RSI: ffffc90059f2efd0 RDI:
> >>> 000000007f183000
> >>> [1548779.238211] RBP: ffff8801a806bbc0 R08: ffff8801a806bd60 R09:
> >>> 0000000000000100
> >>> [1548779.245878] R10: 0000000000003649 R11: 0000000000000006 R12:
> >>> ffff88027a7f0000
> >>> [1548779.253524] R13: ffffffffa04d9ac0 R14: ffff8801a806bfd8 R15:
> >>> 0000000000000000
> >>> [1548779.261125] FS:  00007f3dd6aff700(0000) GS:ffff883ffe3c0000(0000)
> >>> knlGS:0000000000000000
> >>> [1548779.269733] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> >>> [1548779.275795] CR2: 000000007f183000 CR3: 00000001f50ab000 CR4:
> >>> 00000000003427e0
> >>> [1548779.283444] DR0: 0000000000000000 DR1: 0000000000000000 DR2:
> >>> 0000000000000000
> >>> [1548779.291040] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7:
> >>> 0000000000000400
> >>> [1548779.298689] Stack:
> >>> [1548779.301045]  0000000000000000 0000000000000297 0000000009b989ac
> >>> 0000000000000000
> >>> [1548779.309033]  ffff8801a806bc48 ffffffffa04d9206 00ff8801a806bd10
> >>> ffff8801a806bd60
> >>> [1548779.317023]  0000000000000100 00000000000bffff 0000000100000001
> >>> 00000000000396fa
> >>> [1548779.325012] Call Trace:
> >>> [1548779.327856]  [<ffffffffa04d9206>] slot_handle_level_range+0x76/0x100 [kvm]
> >>> [1548779.335090]  [<ffffffffa04dac34>] kvm_mmu_slot_leaf_clear_dirty+0x54/0x80
> >>> [kvm]
> >>> [1548779.342922]  [<ffffffffa03f8967>] vmx_slot_enable_log_dirty+0x17/0x30
> >>> [kvm_intel]
> >>> [1548779.350943]  [<ffffffffa04d5d2f>] kvm_arch_commit_memory_region+0x5f/0xe0
> >>> [kvm]
> >>> [1548779.358776]  [<ffffffffa04be7c5>] __kvm_set_memory_region+0x665/0xa90 [kvm]
> >>> [1548779.366099]  [<ffffffffa04be160>] ? kvm_kvzalloc+0x30/0x30 [kvm]
> >>> [1548779.372457]  [<ffffffffa04bec1b>] kvm_set_memory_region+0x2b/0x40 [kvm]
> >>> [1548779.379423]  [<ffffffffa04bf084>] kvm_vm_ioctl+0x454/0x700 [kvm]
> >>> [1548779.385777]  [<ffffffff8112f083>] ? ftrace_ops_list_func+0x83/0x110
> >>> [1548779.392406]  [<ffffffffa04bec30>] ? kvm_set_memory_region+0x40/0x40 [kvm]
> >>> [1548779.399538]  [<ffffffff811fceb5>] do_vfs_ioctl+0x2e5/0x4c0
> >>> [1548779.405340]  [<ffffffff811fd131>] SyS_ioctl+0xa1/0xc0
> >>> [1548779.410736]  [<ffffffff81652809>] system_call_fastpath+0x16/0x1b
> >>> [1548779.417086] Code: c0 48 85 ff 0f 84 8f 00 00 00 40 f6 c7 01 0f 84 a5 00
> >>> 00 00 48 83 e7 fe c7 45 e8 00 00 00 00 48 89 7d e0 48 8b 3f 48 85 ff 74 6e
> >>> <48> 8b 17 f6 c2 01 74 5d 48 8b 05 c7 40 04 00 48 89 d1 31 db 48
> >>> [1548779.438015] RIP  [<ffffffffa04d9b0a>] __rmap_clear_dirty+0x4a/0xf0 [kvm]
> >>> [1548779.445109]  RSP <ffff8801a806bba0>
> >>> [1548779.448915] CR2: 000000007f183000
> >>> [1548779.453070] ---[ end trace d99f28b71290c64c ]---
> >>> [1548779.523937] Kernel panic - not syncing: Fatal exception
> >>> [1548779.523945] perf interrupt took too long (124114 > 17857), lowering
> >>> kernel.perf_event_max_sample_rate to 7000
> >>> [1548779.524826] perf interrupt took too long (123163 > 31250), lowering
> >>> kernel.perf_event_max_sample_rate to 4000
> >>> [1548779.525717] perf interrupt took too long (122216 > 62500), lowering
> >>> kernel.perf_event_max_sample_rate to 2000
> >>> [1548779.603824] die even has been record!
> >>
> >
> >
> >
> > .
> >
> 


^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [Qemu-devel] kvm bug in __rmap_clear_dirty during live migration
  2017-02-20  4:38       ` Han, Huaitong
@ 2017-02-22  3:08         ` Chris Friesen
  2017-02-22 11:15           ` Paolo Bonzini
  0 siblings, 1 reply; 39+ messages in thread
From: Chris Friesen @ 2017-02-22  3:08 UTC (permalink / raw)
  To: Han, Huaitong, hangaohuai
  Cc: kvm, fangying1, herongguang.he, xudong.hao, qemu-devel, pbonzini,
	wangxinxin.wang, kai.huang, rkrcmar, guangrong.xiao

On 02/19/2017 10:38 PM, Han, Huaitong wrote:
> Hi, Gaohuai
>
> I tried to debug the problem, and I found the indirect cause may be that
> the rmap value is not cleared when KVM mmu page is freed. I have read
> code without the root cause. Can you stable reproduce the the issue?
> Many guesses need to be verified.

In both cases it seems to have been triggered by repeatedly live-migrating a KVM 
virtual machine between two hypervisors with Broadwell CPUs running the latest 
CentOS 7.

It's a race of some sort; it doesn't happen every time.
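
For anyone trying to reproduce it, the test is essentially a back-and-forth
live migration in a loop. Below is a minimal sketch of such a loop using
libvirt's C API; the host names, domain name and iteration count are
placeholders, and error handling is trimmed, so treat it as illustration
rather than our actual test harness.

#include <stdio.h>
#include <libvirt/libvirt.h>

/* Ping-pong live migration between two hosts; adjust names as needed. */
int main(void)
{
	virConnectPtr src = virConnectOpen("qemu+ssh://hostA/system");
	virConnectPtr dst = virConnectOpen("qemu+ssh://hostB/system");
	virDomainPtr dom;
	int i;

	if (!src || !dst)
		return 1;

	dom = virDomainLookupByName(src, "testvm");
	if (!dom)
		return 1;

	for (i = 0; i < 1000; i++) {
		/* Send the domain to the "other" host each iteration. */
		virDomainPtr next = virDomainMigrate(dom, (i & 1) ? src : dst,
						     VIR_MIGRATE_LIVE,
						     NULL, NULL, 0);
		if (!next) {
			fprintf(stderr, "migration %d failed\n", i);
			break;
		}
		virDomainFree(dom);
		dom = next;
	}

	virDomainFree(dom);
	virConnectClose(src);
	virConnectClose(dst);
	return 0;
}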

Chris

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [Qemu-devel] kvm bug in __rmap_clear_dirty during live migration
  2017-02-22  3:08         ` Chris Friesen
@ 2017-02-22 11:15           ` Paolo Bonzini
  2017-02-22 13:31             ` Chris Friesen
  0 siblings, 1 reply; 39+ messages in thread
From: Paolo Bonzini @ 2017-02-22 11:15 UTC (permalink / raw)
  To: Chris Friesen, Han, Huaitong, hangaohuai
  Cc: kvm, fangying1, herongguang.he, xudong.hao, qemu-devel,
	wangxinxin.wang, kai.huang, rkrcmar, guangrong.xiao



On 22/02/2017 04:08, Chris Friesen wrote:
> On 02/19/2017 10:38 PM, Han, Huaitong wrote:
>> Hi, Gaohuai
>>
>> I tried to debug the problem, and I found the indirect cause may be that
>> the rmap value is not cleared when KVM mmu page is freed. I have read
>> code without the root cause. Can you stable reproduce the the issue?
>> Many guesses need to be verified.
> 
> In both cases it seems to have been triggered by repeatedly
> live-migrating a KVM virtual machine between two hypervisors with
> Broadwell CPUs running the latest CentOS 7.
> 
> It's a race of some sort, it doesn't happen every time.

Can you reproduce it with kernel 4.8+?  I'm suspecting commit
4e59516a12a6 ("kvm: vmx: ensure VMCS is current while enabling PML",
2016-07-14) to be the fix.
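
For context, the hazard that commit closes can be sketched roughly as
follows. This is illustrative only, not the actual vmx.c code or the
commit's diff: the function name is invented, and the field/helper names
are from memory of that era's vmx.c rather than from this thread.

/*
 * Illustrative sketch only -- not the real arch/x86/kvm/vmx.c.
 * VMWRITE always operates on the VMCS that is currently loaded on the
 * physical CPU, so per-vCPU PML fields must only be written while this
 * vCPU's VMCS is the current one.
 */
static void vmx_setup_pml_sketch(struct vcpu_vmx *vmx)
{
	/*
	 * If another VM's VMCS happens to be current here (this vCPU's
	 * preempt notifiers are not registered yet, so nothing reloads
	 * our VMCS after a reschedule), the writes below land in that
	 * other VMCS instead of ours.
	 */
	vmcs_write64(PML_ADDRESS, page_to_phys(vmx->pml_pg));
	vmcs_write16(GUEST_PML_INDEX, PML_ENTITY_NUM - 1);
	vmcs_write32(SECONDARY_VM_EXEC_CONTROL,
		     vmcs_read32(SECONDARY_VM_EXEC_CONTROL) |
		     SECONDARY_EXEC_ENABLE_PML);
}

The commit's point is to guarantee the vCPU's own VMCS is loaded around
exactly this kind of setup.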

Paolo

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [Qemu-devel] kvm bug in __rmap_clear_dirty during live migration
  2017-02-22 11:15           ` Paolo Bonzini
@ 2017-02-22 13:31             ` Chris Friesen
  2017-02-22 14:43               ` Paolo Bonzini
  0 siblings, 1 reply; 39+ messages in thread
From: Chris Friesen @ 2017-02-22 13:31 UTC (permalink / raw)
  To: Paolo Bonzini, Han, Huaitong, hangaohuai
  Cc: kvm, fangying1, herongguang.he, xudong.hao, qemu-devel,
	wangxinxin.wang, kai.huang, rkrcmar, guangrong.xiao

On 02/22/2017 05:15 AM, Paolo Bonzini wrote:
>
>
> On 22/02/2017 04:08, Chris Friesen wrote:
>> On 02/19/2017 10:38 PM, Han, Huaitong wrote:
>>> Hi, Gaohuai
>>>
>>> I tried to debug the problem, and I found the indirect cause may be that
>>> the rmap value is not cleared when KVM mmu page is freed. I have read
>>> code without the root cause. Can you stable reproduce the the issue?
>>> Many guesses need to be verified.
>>
>> In both cases it seems to have been triggered by repeatedly
>> live-migrating a KVM virtual machine between two hypervisors with
>> Broadwell CPUs running the latest CentOS 7.
>>
>> It's a race of some sort, it doesn't happen every time.
>
> Can you reproduce it with kernel 4.8+?  I'm suspecting commmit
> 4e59516a12a6 ("kvm: vmx: ensure VMCS is current while enabling PML",
> 2016-07-14) to be the fix.

I can't easily try with a newer kernel; the software package we're using has
kernel patches that would have to be ported.

I'm at a conference and don't really have time to set up a pair of test machines
from scratch with a custom kernel.

Chris

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [Qemu-devel] kvm bug in __rmap_clear_dirty during live migration
  2017-02-22 13:31             ` Chris Friesen
@ 2017-02-22 14:43               ` Paolo Bonzini
  2017-02-24  2:23                   ` Herongguang (Stephen)
  0 siblings, 1 reply; 39+ messages in thread
From: Paolo Bonzini @ 2017-02-22 14:43 UTC (permalink / raw)
  To: Chris Friesen, Han, Huaitong, hangaohuai
  Cc: kvm, fangying1, herongguang.he, xudong.hao, qemu-devel,
	wangxinxin.wang, kai.huang, rkrcmar, guangrong.xiao



On 22/02/2017 14:31, Chris Friesen wrote:
>>>
>>
>> Can you reproduce it with kernel 4.8+?  I'm suspecting commmit
>> 4e59516a12a6 ("kvm: vmx: ensure VMCS is current while enabling PML",
>> 2016-07-14) to be the fix.
> 
> I can't easily try with a newer kernel, the software package we're using
> has kernel patches that would have to be ported.
> 
> I'm at a conference, don't really have time to set up a pair of test
> machines from scratch with a custom kernel.

Hopefully Gaohuai and Rongguang can help with this too.

Paolo

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [Qemu-devel] kvm bug in __rmap_clear_dirty during live migration
  2017-02-22 14:43               ` Paolo Bonzini
@ 2017-02-24  2:23                   ` Herongguang (Stephen)
  0 siblings, 0 replies; 39+ messages in thread
From: Herongguang (Stephen) @ 2017-02-24  2:23 UTC (permalink / raw)
  To: Paolo Bonzini, Chris Friesen, Han, Huaitong, hangaohuai, stable
  Cc: kvm, fangying1, xudong.hao, qemu-devel, wangxinxin.wang,
	kai.huang, rkrcmar, guangrong.xiao, linux-kernel



On 2017/2/22 22:43, Paolo Bonzini wrote:
>
>
> On 22/02/2017 14:31, Chris Friesen wrote:
>>>>
>>>
>>> Can you reproduce it with kernel 4.8+?  I'm suspecting commmit
>>> 4e59516a12a6 ("kvm: vmx: ensure VMCS is current while enabling PML",
>>> 2016-07-14) to be the fix.
>>
>> I can't easily try with a newer kernel, the software package we're using
>> has kernel patches that would have to be ported.
>>
>> I'm at a conference, don't really have time to set up a pair of test
>> machines from scratch with a custom kernel.
>
> Hopefully Gaohuai and Rongguang can help with this too.
>
> Paolo
>
> .
>
Yes, we are looking into and testing this.

I think this can result in arbitrary memory corruption: if VM1 writes its
PML buffer address into VM2’s VMCS (since VM1’s sched_in/sched_out notifier
is not registered yet), and VM1 is then destroyed (so its PML buffer is
freed back to the kernel), then when VM2 later starts migration the CPU
logs VM2’s dirty GFNs into that freed memory, corrupting whatever the page
has since been reused for.

Given its severity, this commit (http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=4e59516a12a6ef6dcb660cb3a3f70c64bd60cfec)
is eligible for backporting to the stable kernels.
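
To connect this with the oops above: the faulting address 000000007f183000
is page-aligned, which is what a stale PML entry (the CPU logs page-aligned
guest-physical addresses) could leave behind if the freed PML page were
later reused for, say, a memslot's rmap array. __rmap_clear_dirty is the
function in the backtrace; the body below is only my approximation of it in
this kvm version, for illustration, not the exact source.

/* Approximate shape of the rmap walk that oopsed, not the exact code. */
static bool __rmap_clear_dirty(struct kvm *kvm, unsigned long *rmapp)
{
	u64 *sptep;
	struct rmap_iterator iter;
	bool flush = false;

	/*
	 * *rmapp should hold either an spte pointer or a pte_list_desc
	 * pointer.  If the page backing this rmap array used to be a
	 * freed PML buffer that the CPU kept logging into, it can hold
	 * a stale PML entry instead -- a page-aligned guest-physical
	 * address such as 0x7f183000 -- and dereferencing that faults
	 * just like the BUG above.
	 */
	for_each_rmap_spte(rmapp, &iter, sptep)
		flush |= spte_clear_dirty(kvm, sptep);

	return flush;
}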

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [Qemu-devel] kvm bug in __rmap_clear_dirty during live migration
@ 2017-02-24  2:23                   ` Herongguang (Stephen)
  0 siblings, 0 replies; 39+ messages in thread
From: Herongguang (Stephen) @ 2017-02-24  2:23 UTC (permalink / raw)
  To: Paolo Bonzini, Chris Friesen, Han, Huaitong, hangaohuai, stable
  Cc: kvm, fangying1, xudong.hao, qemu-devel, wangxinxin.wang,
	kai.huang, rkrcmar, guangrong.xiao, linux-kernel



On 2017/2/22 22:43, Paolo Bonzini wrote:
>
>
> On 22/02/2017 14:31, Chris Friesen wrote:
>>>>
>>>
>>> Can you reproduce it with kernel 4.8+?  I'm suspecting commmit
>>> 4e59516a12a6 ("kvm: vmx: ensure VMCS is current while enabling PML",
>>> 2016-07-14) to be the fix.
>>
>> I can't easily try with a newer kernel, the software package we're using
>> has kernel patches that would have to be ported.
>>
>> I'm at a conference, don't really have time to set up a pair of test
>> machines from scratch with a custom kernel.
>
> Hopefully Gaohuai and Rongguang can help with this too.
>
> Paolo
>
> .
>
Yes, we are looking into and testing this.

I think this can result in arbitrary memory corruption: if VM1 writes its
PML buffer address into VM2’s VMCS (since VM1’s sched_in/sched_out notifier
is not registered yet), and VM1 is then destroyed (so its PML buffer is
freed back to the kernel), then when VM2 later starts migration the CPU
logs VM2’s dirty GFNs into that freed memory, corrupting whatever the page
has since been reused for.

Given its severity, this commit (http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=4e59516a12a6ef6dcb660cb3a3f70c64bd60cfec)
is eligible for backporting to the stable kernels.

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [Qemu-devel] kvm bug in __rmap_clear_dirty during live migration
  2017-02-24  2:23                   ` Herongguang (Stephen)
@ 2017-02-24  9:35                     ` Herongguang (Stephen)
  -1 siblings, 0 replies; 39+ messages in thread
From: Herongguang (Stephen) @ 2017-02-24  9:35 UTC (permalink / raw)
  To: gregkh
  Cc: Paolo Bonzini, Chris Friesen, Han, Huaitong, hangaohuai, stable,
	kvm, fangying1, xudong.hao, qemu-devel, wangxinxin.wang,
	kai.huang, rkrcmar, guangrong.xiao, linux-kernel



On 2017/2/24 10:23, Herongguang (Stephen) wrote:
>
>
> On 2017/2/22 22:43, Paolo Bonzini wrote:
>>
>>
>> On 22/02/2017 14:31, Chris Friesen wrote:
>>>>>
>>>>
>>>> Can you reproduce it with kernel 4.8+?  I'm suspecting commmit
>>>> 4e59516a12a6 ("kvm: vmx: ensure VMCS is current while enabling PML",
>>>> 2016-07-14) to be the fix.
>>>
>>> I can't easily try with a newer kernel, the software package we're using
>>> has kernel patches that would have to be ported.
>>>
>>> I'm at a conference, don't really have time to set up a pair of test
>>> machines from scratch with a custom kernel.
>>
>> Hopefully Gaohuai and Rongguang can help with this too.
>>
>> Paolo
>>
>> .
>>
> Yes, we are looking into and testing this.
>
> I think this can result in any memory corruption, if VM1 writes its
> PML buffer into VM2’s VMCS (since sched_in/sched_out notifier of VM1
> is not registered yet), then VM1 is destroyed (hence its PML buffer
> is freed back to kernel), after that, VM2 starts migration, so CPU
> logs VM2’s dirty GFNS into a freed memory, results in any memory corruption.
>
> As its severity, this commit (http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=4e59516a12a6ef6dcb660cb3a3f70c64bd60cfec)
> is eligible to back port to kernel stable.

Hi, Greg, can you cherry pick commit 4e59516a12a6ef6dcb660cb3a3f70c64bd60cfec to 4.4-y?

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [Qemu-devel] kvm bug in __rmap_clear_dirty during live migration
@ 2017-02-24  9:35                     ` Herongguang (Stephen)
  0 siblings, 0 replies; 39+ messages in thread
From: Herongguang (Stephen) @ 2017-02-24  9:35 UTC (permalink / raw)
  To: gregkh
  Cc: Paolo Bonzini, Chris Friesen, Han, Huaitong, hangaohuai, stable,
	kvm, fangying1, xudong.hao, qemu-devel, wangxinxin.wang,
	kai.huang, rkrcmar, guangrong.xiao, linux-kernel



On 2017/2/24 10:23, Herongguang (Stephen) wrote:
>
>
> On 2017/2/22 22:43, Paolo Bonzini wrote:
>>
>>
>> On 22/02/2017 14:31, Chris Friesen wrote:
>>>>>
>>>>
>>>> Can you reproduce it with kernel 4.8+?  I'm suspecting commmit
>>>> 4e59516a12a6 ("kvm: vmx: ensure VMCS is current while enabling PML",
>>>> 2016-07-14) to be the fix.
>>>
>>> I can't easily try with a newer kernel, the software package we're using
>>> has kernel patches that would have to be ported.
>>>
>>> I'm at a conference, don't really have time to set up a pair of test
>>> machines from scratch with a custom kernel.
>>
>> Hopefully Gaohuai and Rongguang can help with this too.
>>
>> Paolo
>>
>> .
>>
> Yes, we are looking into and testing this.
>
> I think this can result in any memory corruption, if VM1 writes its
> PML buffer into VM2’s VMCS (since sched_in/sched_out notifier of VM1
> is not registered yet), then VM1 is destroyed (hence its PML buffer
> is freed back to kernel), after that, VM2 starts migration, so CPU
> logs VM2’s dirty GFNS into a freed memory, results in any memory corruption.
>
> As its severity, this commit (http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=4e59516a12a6ef6dcb660cb3a3f70c64bd60cfec)
> is eligible to back port to kernel stable.

Hi, Greg, can you cherry pick commit 4e59516a12a6ef6dcb660cb3a3f70c64bd60cfec to 4.4-y?

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [Qemu-devel] kvm bug in __rmap_clear_dirty during live migration
  2017-02-24  9:35                     ` Herongguang (Stephen)
  (?)
@ 2017-02-24  9:59                     ` Greg KH
  2017-02-24 10:00                       ` Paolo Bonzini
  -1 siblings, 1 reply; 39+ messages in thread
From: Greg KH @ 2017-02-24  9:59 UTC (permalink / raw)
  To: Herongguang (Stephen)
  Cc: Paolo Bonzini, Chris Friesen, Han, Huaitong, hangaohuai, stable,
	kvm, fangying1, xudong.hao, qemu-devel, wangxinxin.wang,
	kai.huang, rkrcmar, guangrong.xiao, linux-kernel

On Fri, Feb 24, 2017 at 05:35:17PM +0800, Herongguang (Stephen) wrote:
> 
> 
> On 2017/2/24 10:23, Herongguang (Stephen) wrote:
> > 
> > 
> > On 2017/2/22 22:43, Paolo Bonzini wrote:
> > > 
> > > 
> > > On 22/02/2017 14:31, Chris Friesen wrote:
> > > > > > 
> > > > > 
> > > > > Can you reproduce it with kernel 4.8+?  I'm suspecting commmit
> > > > > 4e59516a12a6 ("kvm: vmx: ensure VMCS is current while enabling PML",
> > > > > 2016-07-14) to be the fix.
> > > > 
> > > > I can't easily try with a newer kernel, the software package we're using
> > > > has kernel patches that would have to be ported.
> > > > 
> > > > I'm at a conference, don't really have time to set up a pair of test
> > > > machines from scratch with a custom kernel.
> > > 
> > > Hopefully Gaohuai and Rongguang can help with this too.
> > > 
> > > Paolo
> > > 
> > > .
> > > 
> > Yes, we are looking into and testing this.
> > 
> > I think this can result in any memory corruption, if VM1 writes its
> > PML buffer into VM2’s VMCS (since sched_in/sched_out notifier of VM1
> > is not registered yet), then VM1 is destroyed (hence its PML buffer
> > is freed back to kernel), after that, VM2 starts migration, so CPU
> > logs VM2’s dirty GFNS into a freed memory, results in any memory corruption.
> > 
> > As its severity, this commit (http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=4e59516a12a6ef6dcb660cb3a3f70c64bd60cfec)
> > is eligible to back port to kernel stable.
> 
> Hi, Greg, can you cherry pick commit 4e59516a12a6ef6dcb660cb3a3f70c64bd60cfec to 4.4-y?

If the KVM maintainers say it is ok to do so, yes, I will.

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [Qemu-devel] kvm bug in __rmap_clear_dirty during live migration
  2017-02-24  9:59                     ` Greg KH
@ 2017-02-24 10:00                       ` Paolo Bonzini
  2017-02-24 10:07                           ` Greg KH
  0 siblings, 1 reply; 39+ messages in thread
From: Paolo Bonzini @ 2017-02-24 10:00 UTC (permalink / raw)
  To: Greg KH, Herongguang (Stephen)
  Cc: Chris Friesen, Han, Huaitong, hangaohuai, stable, kvm, fangying1,
	xudong.hao, qemu-devel, wangxinxin.wang, kai.huang, rkrcmar,
	guangrong.xiao, linux-kernel



On 24/02/2017 10:59, Greg KH wrote:
> On Fri, Feb 24, 2017 at 05:35:17PM +0800, Herongguang (Stephen) wrote:
>>
>>
>> On 2017/2/24 10:23, Herongguang (Stephen) wrote:
>>>
>>>
>>> On 2017/2/22 22:43, Paolo Bonzini wrote:
>>>>
>>>>
>>>> On 22/02/2017 14:31, Chris Friesen wrote:
>>>>>>>
>>>>>>
>>>>>> Can you reproduce it with kernel 4.8+?  I'm suspecting commmit
>>>>>> 4e59516a12a6 ("kvm: vmx: ensure VMCS is current while enabling PML",
>>>>>> 2016-07-14) to be the fix.
>>>>>
>>>>> I can't easily try with a newer kernel, the software package we're using
>>>>> has kernel patches that would have to be ported.
>>>>>
>>>>> I'm at a conference, don't really have time to set up a pair of test
>>>>> machines from scratch with a custom kernel.
>>>>
>>>> Hopefully Gaohuai and Rongguang can help with this too.
>>>>
>>>> Paolo
>>>>
>>>> .
>>>>
>>> Yes, we are looking into and testing this.
>>>
>>> I think this can result in any memory corruption, if VM1 writes its
>>> PML buffer into VM2’s VMCS (since sched_in/sched_out notifier of VM1
>>> is not registered yet), then VM1 is destroyed (hence its PML buffer
>>> is freed back to kernel), after that, VM2 starts migration, so CPU
>>> logs VM2’s dirty GFNS into a freed memory, results in any memory corruption.
>>>
>>> As its severity, this commit (http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=4e59516a12a6ef6dcb660cb3a3f70c64bd60cfec)
>>> is eligible to back port to kernel stable.
>>
>> Hi, Greg, can you cherry pick commit 4e59516a12a6ef6dcb660cb3a3f70c64bd60cfec to 4.4-y?
> 
> If the KVM maintainers say it is ok to do so, yes, I will.

Yes, he beat me by minutes. :)

Paolo

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [Qemu-devel] kvm bug in __rmap_clear_dirty during live migration
  2017-02-24 10:00                       ` Paolo Bonzini
@ 2017-02-24 10:07                           ` Greg KH
  0 siblings, 0 replies; 39+ messages in thread
From: Greg KH @ 2017-02-24 10:07 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Herongguang (Stephen),
	Chris Friesen, Han, Huaitong, hangaohuai, stable, kvm, fangying1,
	xudong.hao, qemu-devel, wangxinxin.wang, kai.huang, rkrcmar,
	guangrong.xiao, linux-kernel

On Fri, Feb 24, 2017 at 11:00:32AM +0100, Paolo Bonzini wrote:
> 
> 
> On 24/02/2017 10:59, Greg KH wrote:
> > On Fri, Feb 24, 2017 at 05:35:17PM +0800, Herongguang (Stephen) wrote:
> >>
> >>
> >> On 2017/2/24 10:23, Herongguang (Stephen) wrote:
> >>>
> >>>
> >>> On 2017/2/22 22:43, Paolo Bonzini wrote:
> >>>>
> >>>>
> >>>> On 22/02/2017 14:31, Chris Friesen wrote:
> >>>>>>>
> >>>>>>
> >>>>>> Can you reproduce it with kernel 4.8+?  I'm suspecting commmit
> >>>>>> 4e59516a12a6 ("kvm: vmx: ensure VMCS is current while enabling PML",
> >>>>>> 2016-07-14) to be the fix.
> >>>>>
> >>>>> I can't easily try with a newer kernel, the software package we're using
> >>>>> has kernel patches that would have to be ported.
> >>>>>
> >>>>> I'm at a conference, don't really have time to set up a pair of test
> >>>>> machines from scratch with a custom kernel.
> >>>>
> >>>> Hopefully Gaohuai and Rongguang can help with this too.
> >>>>
> >>>> Paolo
> >>>>
> >>>> .
> >>>>
> >>> Yes, we are looking into and testing this.
> >>>
> >>> I think this can result in any memory corruption, if VM1 writes its
> >>> PML buffer into VM2’s VMCS (since sched_in/sched_out notifier of VM1
> >>> is not registered yet), then VM1 is destroyed (hence its PML buffer
> >>> is freed back to kernel), after that, VM2 starts migration, so CPU
> >>> logs VM2’s dirty GFNS into a freed memory, results in any memory corruption.
> >>>
> >>> As its severity, this commit (http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=4e59516a12a6ef6dcb660cb3a3f70c64bd60cfec)
> >>> is eligible to back port to kernel stable.
> >>
> >> Hi, Greg, can you cherry pick commit 4e59516a12a6ef6dcb660cb3a3f70c64bd60cfec to 4.4-y?
> > 
> > If the KVM maintainers say it is ok to do so, yes, I will.
> 
> Yes, he beat me by minutes. :)

Heh, ok, I'll go add it to the recently-announced 4.4.52-rc1 release.

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: kvm bug in __rmap_clear_dirty during live migration
@ 2017-02-24 10:07                           ` Greg KH
  0 siblings, 0 replies; 39+ messages in thread
From: Greg KH @ 2017-02-24 10:07 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: hangaohuai, guangrong.xiao, kvm, rkrcmar, wangxinxin.wang,
	qemu-devel, stable, linux-kernel, kai.huang, Han, Huaitong,
	fangying1, Chris Friesen, Herongguang (Stephen),
	xudong.hao

On Fri, Feb 24, 2017 at 11:00:32AM +0100, Paolo Bonzini wrote:
> 
> 
> On 24/02/2017 10:59, Greg KH wrote:
> > On Fri, Feb 24, 2017 at 05:35:17PM +0800, Herongguang (Stephen) wrote:
> >>
> >>
> >> On 2017/2/24 10:23, Herongguang (Stephen) wrote:
> >>>
> >>>
> >>> On 2017/2/22 22:43, Paolo Bonzini wrote:
> >>>>
> >>>>
> >>>> On 22/02/2017 14:31, Chris Friesen wrote:
> >>>>>>>
> >>>>>>
> >>>>>> Can you reproduce it with kernel 4.8+?  I'm suspecting commmit
> >>>>>> 4e59516a12a6 ("kvm: vmx: ensure VMCS is current while enabling PML",
> >>>>>> 2016-07-14) to be the fix.
> >>>>>
> >>>>> I can't easily try with a newer kernel, the software package we're using
> >>>>> has kernel patches that would have to be ported.
> >>>>>
> >>>>> I'm at a conference, don't really have time to set up a pair of test
> >>>>> machines from scratch with a custom kernel.
> >>>>
> >>>> Hopefully Gaohuai and Rongguang can help with this too.
> >>>>
> >>>> Paolo
> >>>>
> >>>> .
> >>>>
> >>> Yes, we are looking into and testing this.
> >>>
> >>> I think this can result in any memory corruption, if VM1 writes its
> >>> PML buffer into VM2’s VMCS (since sched_in/sched_out notifier of VM1
> >>> is not registered yet), then VM1 is destroyed (hence its PML buffer
> >>> is freed back to kernel), after that, VM2 starts migration, so CPU
> >>> logs VM2’s dirty GFNS into a freed memory, results in any memory corruption.
> >>>
> >>> As its severity, this commit (http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=4e59516a12a6ef6dcb660cb3a3f70c64bd60cfec)
> >>> is eligible to back port to kernel stable.
> >>
> >> Hi, Greg, can you cherry pick commit 4e59516a12a6ef6dcb660cb3a3f70c64bd60cfec to 4.4-y?
> > 
> > If the KVM maintainers say it is ok to do so, yes, I will.
> 
> Yes, he beat me by minutes. :)

Heh, ok, I'll go add it to the recently-announced 4.4.52-rc1 release.

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [Qemu-devel] kvm bug in __rmap_clear_dirty during live migration
  2017-02-24  2:23                   ` Herongguang (Stephen)
@ 2017-02-24 15:10                     ` Chris Friesen
  -1 siblings, 0 replies; 39+ messages in thread
From: Chris Friesen @ 2017-02-24 15:10 UTC (permalink / raw)
  To: Herongguang (Stephen), Paolo Bonzini, Han, Huaitong, hangaohuai, stable
  Cc: kvm, fangying1, xudong.hao, qemu-devel, wangxinxin.wang,
	kai.huang, rkrcmar, guangrong.xiao, linux-kernel

On 02/23/2017 08:23 PM, Herongguang (Stephen) wrote:

> On 2017/2/22 22:43, Paolo Bonzini wrote:

>> Hopefully Gaohuai and Rongguang can help with this too.
>>
>> Paolo
>>
>> .
>>
> Yes, we are looking into and testing this.
>
> I think this can result in any memory corruption, if VM1 writes its
> PML buffer into VM2’s VMCS (since sched_in/sched_out notifier of VM1
> is not registered yet), then VM1 is destroyed (hence its PML buffer
> is freed back to kernel), after that, VM2 starts migration, so CPU
> logs VM2’s dirty GFNS into a freed memory, results in any memory corruption.
>
> As its severity, this commit
> (http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=4e59516a12a6ef6dcb660cb3a3f70c64bd60cfec)
>
> is eligible to back port to kernel stable.
>

Are we expecting that fix to resolve the original issue, or is it a separate 
issue that needs fixing in stable?

Chris

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [Qemu-devel] kvm bug in __rmap_clear_dirty during live migration
@ 2017-02-24 15:10                     ` Chris Friesen
  0 siblings, 0 replies; 39+ messages in thread
From: Chris Friesen @ 2017-02-24 15:10 UTC (permalink / raw)
  To: Herongguang (Stephen), Paolo Bonzini, Han, Huaitong, hangaohuai, stable
  Cc: kvm, fangying1, xudong.hao, qemu-devel, wangxinxin.wang,
	kai.huang, rkrcmar, guangrong.xiao, linux-kernel

On 02/23/2017 08:23 PM, Herongguang (Stephen) wrote:

> On 2017/2/22 22:43, Paolo Bonzini wrote:

>> Hopefully Gaohuai and Rongguang can help with this too.
>>
>> Paolo
>>
>> .
>>
> Yes, we are looking into and testing this.
>
> I think this can result in any memory corruption, if VM1 writes its
> PML buffer into VM2’s VMCS (since sched_in/sched_out notifier of VM1
> is not registered yet), then VM1 is destroyed (hence its PML buffer
> is freed back to kernel), after that, VM2 starts migration, so CPU
> logs VM2’s dirty GFNS into a freed memory, results in any memory corruption.
>
> As its severity, this commit
> (http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=4e59516a12a6ef6dcb660cb3a3f70c64bd60cfec)
>
> is eligible to back port to kernel stable.
>

Are we expecting that fix to resolve the original issue, or is it a separate 
issue that needs fixing in stable?

Chris

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [Qemu-devel] kvm bug in __rmap_clear_dirty during live migration
  2017-02-24 15:10                     ` Chris Friesen
@ 2017-02-24 15:14                       ` Paolo Bonzini
  -1 siblings, 0 replies; 39+ messages in thread
From: Paolo Bonzini @ 2017-02-24 15:14 UTC (permalink / raw)
  To: Chris Friesen, Herongguang (Stephen), Han, Huaitong, hangaohuai, stable
  Cc: kvm, fangying1, xudong.hao, qemu-devel, wangxinxin.wang,
	kai.huang, rkrcmar, guangrong.xiao, linux-kernel



On 24/02/2017 16:10, Chris Friesen wrote:
> On 02/23/2017 08:23 PM, Herongguang (Stephen) wrote:
> 
>> On 2017/2/22 22:43, Paolo Bonzini wrote:
> 
>>> Hopefully Gaohuai and Rongguang can help with this too.
>>>
>>> Paolo
>>
>> Yes, we are looking into and testing this.
>>
>> I think this can result in any memory corruption, if VM1 writes its
>> PML buffer into VM2’s VMCS (since sched_in/sched_out notifier of VM1
>> is not registered yet), then VM1 is destroyed (hence its PML buffer
>> is freed back to kernel), after that, VM2 starts migration, so CPU
>> logs VM2’s dirty GFNS into a freed memory, results in any memory
>> corruption.
>>
>> As its severity, this commit
>> (http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=4e59516a12a6ef6dcb660cb3a3f70c64bd60cfec)
>>
>> is eligible to back port to kernel stable.
> 
> Are we expecting that fix to resolve the original issue, or is it a
> separate issue that needs fixing in stable?

It should be the original issue.

Paolo

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: kvm bug in __rmap_clear_dirty during live migration
@ 2017-02-24 15:14                       ` Paolo Bonzini
  0 siblings, 0 replies; 39+ messages in thread
From: Paolo Bonzini @ 2017-02-24 15:14 UTC (permalink / raw)
  To: Chris Friesen, Herongguang (Stephen), Han, Huaitong, hangaohuai, stable
  Cc: guangrong.xiao, kvm, rkrcmar, wangxinxin.wang, qemu-devel,
	linux-kernel, kai.huang, fangying1, xudong.hao



On 24/02/2017 16:10, Chris Friesen wrote:
> On 02/23/2017 08:23 PM, Herongguang (Stephen) wrote:
> 
>> On 2017/2/22 22:43, Paolo Bonzini wrote:
> 
>>> Hopefully Gaohuai and Rongguang can help with this too.
>>>
>>> Paolo
>>
>> Yes, we are looking into and testing this.
>>
>> I think this can result in any memory corruption, if VM1 writes its
>> PML buffer into VM2’s VMCS (since sched_in/sched_out notifier of VM1
>> is not registered yet), then VM1 is destroyed (hence its PML buffer
>> is freed back to kernel), after that, VM2 starts migration, so CPU
>> logs VM2’s dirty GFNS into a freed memory, results in any memory
>> corruption.
>>
>> As its severity, this commit
>> (http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=4e59516a12a6ef6dcb660cb3a3f70c64bd60cfec)
>>
>> is eligible to back port to kernel stable.
> 
> Are we expecting that fix to resolve the original issue, or is it a
> separate issue that needs fixing in stable?

It should be the original issue.

Paolo

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [Qemu-devel] kvm bug in __rmap_clear_dirty during live migration
  2017-02-24 15:14                       ` Paolo Bonzini
@ 2017-02-25  1:44                         ` Herongguang (Stephen)
  -1 siblings, 0 replies; 39+ messages in thread
From: Herongguang (Stephen) @ 2017-02-25  1:44 UTC (permalink / raw)
  To: Paolo Bonzini, Chris Friesen, Han, Huaitong, hangaohuai, stable
  Cc: kvm, fangying1, xudong.hao, qemu-devel, wangxinxin.wang,
	kai.huang, rkrcmar, guangrong.xiao, linux-kernel



On 2017/2/24 23:14, Paolo Bonzini wrote:
>
>
> On 24/02/2017 16:10, Chris Friesen wrote:
>> On 02/23/2017 08:23 PM, Herongguang (Stephen) wrote:
>>
>>> On 2017/2/22 22:43, Paolo Bonzini wrote:
>>
>>>> Hopefully Gaohuai and Rongguang can help with this too.
>>>>
>>>> Paolo
>>>
>>> Yes, we are looking into and testing this.
>>>
>>> I think this can result in any memory corruption, if VM1 writes its
>>> PML buffer into VM2’s VMCS (since sched_in/sched_out notifier of VM1
>>> is not registered yet), then VM1 is destroyed (hence its PML buffer
>>> is freed back to kernel), after that, VM2 starts migration, so CPU
>>> logs VM2’s dirty GFNS into a freed memory, results in any memory
>>> corruption.
>>>
>>> As its severity, this commit
>>> (http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=4e59516a12a6ef6dcb660cb3a3f70c64bd60cfec)
>>>
>>> is eligible to back port to kernel stable.
>>
>> Are we expecting that fix to resolve the original issue, or is it a
>> separate issue that needs fixing in stable?
>
> It should be the original issue.
>
> Paolo
>
> .
>
Yes, I agree, though we are still testing.

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [Qemu-devel] kvm bug in __rmap_clear_dirty during live migration
@ 2017-02-25  1:44                         ` Herongguang (Stephen)
  0 siblings, 0 replies; 39+ messages in thread
From: Herongguang (Stephen) @ 2017-02-25  1:44 UTC (permalink / raw)
  To: Paolo Bonzini, Chris Friesen, Han, Huaitong, hangaohuai, stable
  Cc: kvm, fangying1, xudong.hao, qemu-devel, wangxinxin.wang,
	kai.huang, rkrcmar, guangrong.xiao, linux-kernel



On 2017/2/24 23:14, Paolo Bonzini wrote:
>
>
> On 24/02/2017 16:10, Chris Friesen wrote:
>> On 02/23/2017 08:23 PM, Herongguang (Stephen) wrote:
>>
>>> On 2017/2/22 22:43, Paolo Bonzini wrote:
>>
>>>> Hopefully Gaohuai and Rongguang can help with this too.
>>>>
>>>> Paolo
>>>
>>> Yes, we are looking into and testing this.
>>>
>>> I think this can result in any memory corruption, if VM1 writes its
>>> PML buffer into VM2’s VMCS (since sched_in/sched_out notifier of VM1
>>> is not registered yet), then VM1 is destroyed (hence its PML buffer
>>> is freed back to kernel), after that, VM2 starts migration, so CPU
>>> logs VM2’s dirty GFNS into a freed memory, results in any memory
>>> corruption.
>>>
>>> As its severity, this commit
>>> (http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=4e59516a12a6ef6dcb660cb3a3f70c64bd60cfec)
>>>
>>> is eligible to back port to kernel stable.
>>
>> Are we expecting that fix to resolve the original issue, or is it a
>> separate issue that needs fixing in stable?
>
> It should be the original issue.
>
> Paolo
>
> .
>
Yes, I agree, though we are still testing.

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [Qemu-devel] kvm bug in __rmap_clear_dirty during live migration
  2017-02-25  1:44                         ` Herongguang (Stephen)
@ 2017-02-27  7:38                           ` Huang, Kai
  -1 siblings, 0 replies; 39+ messages in thread
From: Huang, Kai @ 2017-02-27  7:38 UTC (permalink / raw)
  To: Herongguang (Stephen),
	Paolo Bonzini, Chris Friesen, Han, Huaitong, hangaohuai, stable
  Cc: kvm, fangying1, xudong.hao, qemu-devel, wangxinxin.wang, rkrcmar,
	guangrong.xiao, linux-kernel



On 2/25/2017 2:44 PM, Herongguang (Stephen) wrote:
>
>
> On 2017/2/24 23:14, Paolo Bonzini wrote:
>>
>>
>> On 24/02/2017 16:10, Chris Friesen wrote:
>>> On 02/23/2017 08:23 PM, Herongguang (Stephen) wrote:
>>>
>>>> On 2017/2/22 22:43, Paolo Bonzini wrote:
>>>
>>>>> Hopefully Gaohuai and Rongguang can help with this too.
>>>>>
>>>>> Paolo
>>>>
>>>> Yes, we are looking into and testing this.
>>>>
>>>> I think this can result in any memory corruption, if VM1 writes its
>>>> PML buffer into VM2’s VMCS (since sched_in/sched_out notifier of VM1
>>>> is not registered yet), then VM1 is destroyed (hence its PML buffer
>>>> is freed back to kernel), after that, VM2 starts migration, so CPU
>>>> logs VM2’s dirty GFNS into a freed memory, results in any memory
>>>> corruption.
>>>>
>>>> As its severity, this commit
>>>> (http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=4e59516a12a6ef6dcb660cb3a3f70c64bd60cfec)
>>>>
>>>>
>>>> is eligible to back port to kernel stable.
>>>
>>> Are we expecting that fix to resolve the original issue, or is it a
>>> separate issue that needs fixing in stable?
>>
>> It should be the original issue.
>>
>> Paolo
>>
>> .
>>
> Yes, I agree, though we are still testing.
>
>
Hi Stephen,

Sorry for the late reply. I was taking the whole week off last week. How's
the test going?

Thanks,
-Kai

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: kvm bug in __rmap_clear_dirty during live migration
@ 2017-02-27  7:38                           ` Huang, Kai
  0 siblings, 0 replies; 39+ messages in thread
From: Huang, Kai @ 2017-02-27  7:38 UTC (permalink / raw)
  To: Herongguang (Stephen),
	Paolo Bonzini, Chris Friesen, Han, Huaitong, hangaohuai, stable
  Cc: guangrong.xiao, kvm, rkrcmar, wangxinxin.wang, qemu-devel,
	linux-kernel, fangying1, xudong.hao



On 2/25/2017 2:44 PM, Herongguang (Stephen) wrote:
>
>
> On 2017/2/24 23:14, Paolo Bonzini wrote:
>>
>>
>> On 24/02/2017 16:10, Chris Friesen wrote:
>>> On 02/23/2017 08:23 PM, Herongguang (Stephen) wrote:
>>>
>>>> On 2017/2/22 22:43, Paolo Bonzini wrote:
>>>
>>>>> Hopefully Gaohuai and Rongguang can help with this too.
>>>>>
>>>>> Paolo
>>>>
>>>> Yes, we are looking into and testing this.
>>>>
>>>> I think this can result in any memory corruption, if VM1 writes its
>>>> PML buffer into VM2’s VMCS (since sched_in/sched_out notifier of VM1
>>>> is not registered yet), then VM1 is destroyed (hence its PML buffer
>>>> is freed back to kernel), after that, VM2 starts migration, so CPU
>>>> logs VM2’s dirty GFNS into a freed memory, results in any memory
>>>> corruption.
>>>>
>>>> As its severity, this commit
>>>> (http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=4e59516a12a6ef6dcb660cb3a3f70c64bd60cfec)
>>>>
>>>>
>>>> is eligible to back port to kernel stable.
>>>
>>> Are we expecting that fix to resolve the original issue, or is it a
>>> separate issue that needs fixing in stable?
>>
>> It should be the original issue.
>>
>> Paolo
>>
>> .
>>
> Yes, I agree, though we are still testing.
>
>
Hi Stephen,

Sorry for the late reply. I was taking the whole week off last week. How's
the test going?

Thanks,
-Kai

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [Qemu-devel] kvm bug in __rmap_clear_dirty during live migration
  2017-02-27  7:38                           ` Huang, Kai
@ 2017-03-13 14:58                             ` fangying
  -1 siblings, 0 replies; 39+ messages in thread
From: fangying @ 2017-03-13 14:58 UTC (permalink / raw)
  To: Huang, Kai, Herongguang (Stephen),
	Paolo Bonzini, Chris Friesen, Han, Huaitong, hangaohuai, stable
  Cc: kvm, xudong.hao, qemu-devel, wangxinxin.wang, rkrcmar,
	guangrong.xiao, linux-kernel

Hi, Huang Kai

After weeks of intensive testing, we think the problem is solved and 
this issue can be closed.

On 2017/2/27 15:38, Huang, Kai wrote:
>
>
> On 2/25/2017 2:44 PM, Herongguang (Stephen) wrote:
>>
>>
>> On 2017/2/24 23:14, Paolo Bonzini wrote:
>>>
>>>
>>> On 24/02/2017 16:10, Chris Friesen wrote:
>>>> On 02/23/2017 08:23 PM, Herongguang (Stephen) wrote:
>>>>
>>>>> On 2017/2/22 22:43, Paolo Bonzini wrote:
>>>>
>>>>>> Hopefully Gaohuai and Rongguang can help with this too.
>>>>>>
>>>>>> Paolo
>>>>>
>>>>> Yes, we are looking into and testing this.
>>>>>
>>>>> I think this can result in any memory corruption, if VM1 writes its
>>>>> PML buffer into VM2’s VMCS (since sched_in/sched_out notifier of VM1
>>>>> is not registered yet), then VM1 is destroyed (hence its PML buffer
>>>>> is freed back to kernel), after that, VM2 starts migration, so CPU
>>>>> logs VM2’s dirty GFNS into a freed memory, results in any memory
>>>>> corruption.
>>>>>
>>>>> As its severity, this commit
>>>>> (http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=4e59516a12a6ef6dcb660cb3a3f70c64bd60cfec)
>>>>>
>>>>>
>>>>>
>>>>> is eligible to back port to kernel stable.
>>>>
>>>> Are we expecting that fix to resolve the original issue, or is it a
>>>> separate issue that needs fixing in stable?
>>>
>>> It should be the original issue.
>>>
>>> Paolo
>>>
>>> .
>>>
>> Yes, I agree, though we are still testing.
>>
>>
> Hi Stephen,
>
> Sorry for late reply. I was taking the whole week off last week. How's
> the test going?
>
> Thanks,
> -Kai
>
> .
>

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [Qemu-devel] kvm bug in __rmap_clear_dirty during live migration
@ 2017-03-13 14:58                             ` fangying
  0 siblings, 0 replies; 39+ messages in thread
From: fangying @ 2017-03-13 14:58 UTC (permalink / raw)
  To: Huang, Kai, Herongguang (Stephen),
	Paolo Bonzini, Chris Friesen, Han, Huaitong, hangaohuai, stable
  Cc: kvm, xudong.hao, qemu-devel, wangxinxin.wang, rkrcmar,
	guangrong.xiao, linux-kernel

Hi, Huang Kai

After weeks of intensive testing, we think the problem is solved and 
this issue can be closed.

On 2017/2/27 15:38, Huang, Kai wrote:
>
>
> On 2/25/2017 2:44 PM, Herongguang (Stephen) wrote:
>>
>>
>> On 2017/2/24 23:14, Paolo Bonzini wrote:
>>>
>>>
>>> On 24/02/2017 16:10, Chris Friesen wrote:
>>>> On 02/23/2017 08:23 PM, Herongguang (Stephen) wrote:
>>>>
>>>>> On 2017/2/22 22:43, Paolo Bonzini wrote:
>>>>
>>>>>> Hopefully Gaohuai and Rongguang can help with this too.
>>>>>>
>>>>>> Paolo
>>>>>
>>>>> Yes, we are looking into and testing this.
>>>>>
>>>>> I think this can result in any memory corruption, if VM1 writes its
>>>>> PML buffer into VM2’s VMCS (since sched_in/sched_out notifier of VM1
>>>>> is not registered yet), then VM1 is destroyed (hence its PML buffer
>>>>> is freed back to kernel), after that, VM2 starts migration, so CPU
>>>>> logs VM2’s dirty GFNS into a freed memory, results in any memory
>>>>> corruption.
>>>>>
>>>>> As its severity, this commit
>>>>> (http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=4e59516a12a6ef6dcb660cb3a3f70c64bd60cfec)
>>>>>
>>>>>
>>>>>
>>>>> is eligible to back port to kernel stable.
>>>>
>>>> Are we expecting that fix to resolve the original issue, or is it a
>>>> separate issue that needs fixing in stable?
>>>
>>> It should be the original issue.
>>>
>>> Paolo
>>>
>>> .
>>>
>> Yes, I agree, though we are still testing.
>>
>>
> Hi Stephen,
>
> Sorry for late reply. I was taking the whole week off last week. How's
> the test going?
>
> Thanks,
> -Kai
>
> .
>

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [Qemu-devel] kvm bug in __rmap_clear_dirty during live migration
  2017-03-13 14:58                             ` fangying
  (?)
@ 2017-03-13 16:57                             ` Paolo Bonzini
  2017-03-16  1:40                               ` Huang, Kai
  -1 siblings, 1 reply; 39+ messages in thread
From: Paolo Bonzini @ 2017-03-13 16:57 UTC (permalink / raw)
  To: fangying, Huang, Kai, Herongguang (Stephen),
	Chris Friesen, Han, Huaitong, hangaohuai, stable
  Cc: kvm, xudong.hao, qemu-devel, wangxinxin.wang, rkrcmar,
	guangrong.xiao, linux-kernel



On 13/03/2017 15:58, fangying wrote:
> Hi, Huang Kai
> 
> After weeks of intensive testing, we think the problem is solved and
> this issue can be closed.

Thanks for the update.  We got to the same conclusion.

Paolo

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [Qemu-devel] kvm bug in __rmap_clear_dirty during live migration
  2017-03-13 16:57                             ` Paolo Bonzini
@ 2017-03-16  1:40                               ` Huang, Kai
  0 siblings, 0 replies; 39+ messages in thread
From: Huang, Kai @ 2017-03-16  1:40 UTC (permalink / raw)
  To: Paolo Bonzini, fangying, Herongguang (Stephen),
	Chris Friesen, Han, Huaitong, hangaohuai, stable
  Cc: kvm, xudong.hao, qemu-devel, wangxinxin.wang, rkrcmar,
	guangrong.xiao, linux-kernel

Thanks!

Thanks,
-Kai

On 3/14/2017 5:57 AM, Paolo Bonzini wrote:
>
>
> On 13/03/2017 15:58, fangying wrote:
>> Hi, Huang Kai
>>
>> After weeks of intensive testing, we think the problem is solved and
>> this issue can be closed.
>
> Thanks for the update.  We got to the same conclusion.
>
> Paolo
>

^ permalink raw reply	[flat|nested] 39+ messages in thread

end of thread, other threads:[~2017-03-16  1:40 UTC | newest]

Thread overview: 39+ messages
2017-02-09 14:37 kvm bug in __rmap_clear_dirty during live migration Herongguang (Stephen)
2017-02-09 14:37 ` [Qemu-devel] " Herongguang (Stephen)
2017-02-10  8:50 ` Herongguang (Stephen)
2017-02-10  8:50   ` [Qemu-devel] " Herongguang (Stephen)
2017-02-10  9:11 ` Herongguang (Stephen)
2017-02-10  9:11   ` [Qemu-devel] " Herongguang (Stephen)
2017-02-10 15:28   ` Chris Friesen
2017-02-10 15:28     ` [Qemu-devel] " Chris Friesen
2017-02-10 16:14     ` Paolo Bonzini
2017-02-10 16:14       ` [Qemu-devel] " Paolo Bonzini
2017-02-13 13:19     ` hangaohuai
2017-02-13 13:19       ` hangaohuai
2017-02-20  2:17     ` hangaohuai
2017-02-20  2:17       ` [Qemu-devel] " hangaohuai
2017-02-20  4:38       ` Han, Huaitong
2017-02-22  3:08         ` Chris Friesen
2017-02-22 11:15           ` Paolo Bonzini
2017-02-22 13:31             ` Chris Friesen
2017-02-22 14:43               ` Paolo Bonzini
2017-02-24  2:23                 ` Herongguang (Stephen)
2017-02-24  2:23                   ` Herongguang (Stephen)
2017-02-24  9:35                   ` Herongguang (Stephen)
2017-02-24  9:35                     ` Herongguang (Stephen)
2017-02-24  9:59                     ` Greg KH
2017-02-24 10:00                       ` Paolo Bonzini
2017-02-24 10:07                         ` Greg KH
2017-02-24 10:07                           ` Greg KH
2017-02-24 15:10                   ` [Qemu-devel] " Chris Friesen
2017-02-24 15:10                     ` Chris Friesen
2017-02-24 15:14                     ` Paolo Bonzini
2017-02-24 15:14                       ` Paolo Bonzini
2017-02-25  1:44                       ` [Qemu-devel] " Herongguang (Stephen)
2017-02-25  1:44                         ` Herongguang (Stephen)
2017-02-27  7:38                         ` Huang, Kai
2017-02-27  7:38                           ` Huang, Kai
2017-03-13 14:58                           ` [Qemu-devel] " fangying
2017-03-13 14:58                             ` fangying
2017-03-13 16:57                             ` Paolo Bonzini
2017-03-16  1:40                               ` Huang, Kai
