From mboxrd@z Thu Jan 1 00:00:00 1970
From: "Zhanghaoyu (A)"
Subject: RE: [Qemu-devel] vm performance degradation after kvm live migration
 or save-restore with EPT enabled
Date: Tue, 30 Jul 2013 09:04:56 +0000
Message-ID:
References: <51DEA2FC02000048000DF593@novprvoes0310.provo.novell.com>
 <20130729234716.GA8136@amt.cnet>
Mime-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7BIT
Cc: Bruce Rogers , "paolo.bonzini@gmail.com" , qemu-devel ,
 "Michael S. Tsirkin" , KVM , Avi Kivity , "xiaoguangrong@linux.vnet.ibm.com" ,
 Gleb Natapov , Andreas Färber , Hanweidong , Luonengjun ,
 "Huangweidong (C)" , Zanghongyong , Xiejunyong , Xiahai , Yi Li , Xin Rong Fu
To: Marcelo Tosatti
Return-path:
Received: from szxga02-in.huawei.com ([119.145.14.65]:43084 "EHLO
 szxga02-in.huawei.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
 id S1758289Ab3G3JJK convert rfc822-to-8bit (ORCPT );
 Tue, 30 Jul 2013 05:09:10 -0400
In-Reply-To: <20130729234716.GA8136@amt.cnet>
Content-Language: en-US
Sender: kvm-owner@vger.kernel.org
List-ID:

>> >> hi all,
>> >>
>> >> I met a similar problem to these, while performing live migration or
>> >> save-restore tests on the kvm platform (qemu:1.4.0, host:suse11sp2,
>> >> guest:suse11sp2), running a tele-communication software suite in the
>> >> guest:
>> >> https://lists.gnu.org/archive/html/qemu-devel/2013-05/msg00098.html
>> >> http://comments.gmane.org/gmane.comp.emulators.kvm.devel/102506
>> >> http://thread.gmane.org/gmane.comp.emulators.kvm.devel/100592
>> >> https://bugzilla.kernel.org/show_bug.cgi?id=58771
>> >>
>> >> After live migration or virsh restore [savefile], one process's CPU
>> >> utilization went up by about 30%, resulting in throughput degradation
>> >> of this process.
>> >>
>> >> If EPT is disabled, this problem is gone.
>> >>
>> >> I suspect that the kvm hypervisor has something to do with this problem.
>> >> Based on the above suspicion, I want to find the two adjacent versions
>> >> of kvm-kmod, one of which triggers this problem and one of which does
>> >> not (e.g. 2.6.39, 3.0-rc1), analyze the differences between these two
>> >> versions, or apply the patches between these two versions by bisection,
>> >> and finally find the key patches.
>> >>
>> >> Any better ideas?
>> >>
>> >> Thanks,
>> >> Zhang Haoyu
>> >
>> >I've attempted to duplicate this on a number of machines that are as
>> >similar to yours as I am able to get my hands on, and so far have not
>> >been able to see any performance degradation. And from what I've read
>> >in the above links, huge pages do not seem to be part of the problem.
>> >
>> >So, if you are in a position to bisect the kernel changes, that would
>> >probably be the best avenue to pursue in my opinion.
>> >
>> >Bruce
>>
>> I found the first bad commit ([612819c3c6e67bac8fceaa7cc402f13b1b63f7e4]
>> KVM: propagate fault r/w information to gup(), allow read-only memory)
>> which triggers this problem, by git-bisecting the kvm kernel changes
>> (downloaded from https://git.kernel.org/pub/scm/virt/kvm/kvm.git).
>>
>> And,
>> git log 612819c3c6e67bac8fceaa7cc402f13b1b63f7e4 -n 1 -p > 612819c3c6e67bac8fceaa7cc402f13b1b63f7e4.log
>> git diff 612819c3c6e67bac8fceaa7cc402f13b1b63f7e4~1..612819c3c6e67bac8fceaa7cc402f13b1b63f7e4 > 612819c3c6e67bac8fceaa7cc402f13b1b63f7e4.diff
>>
>> Then I diffed 612819c3c6e67bac8fceaa7cc402f13b1b63f7e4.log and
>> 612819c3c6e67bac8fceaa7cc402f13b1b63f7e4.diff, and came to the conclusion
>> that all of the differences between
>> 612819c3c6e67bac8fceaa7cc402f13b1b63f7e4~1 and
>> 612819c3c6e67bac8fceaa7cc402f13b1b63f7e4 are contributed by no other
>> commit than 612819c3c6e67bac8fceaa7cc402f13b1b63f7e4, so this commit is
>> the one which directly or indirectly causes the degradation.
>>
>> Does the map_writable flag passed to the mmu_set_spte() function have an
>> effect on the PTE's PAT flag, or does it increase the VMEXITs induced by
>> the guest trying to write read-only memory?
>>
>> Thanks,
>> Zhang Haoyu
>>
>
>There should be no read-only memory maps backing guest RAM.
>
>Can you confirm map_writable = false is being passed to __direct_map?
>(this should not happen, for guest RAM).
>And if it is false, please capture the associated GFN.
>

I added the check and printk below at the start of __direct_map() at the
first bad commit version:

--- kvm-612819c3c6e67bac8fceaa7cc402f13b1b63f7e4/arch/x86/kvm/mmu.c	2013-07-26 18:44:05.000000000 +0800
+++ kvm-612819/arch/x86/kvm/mmu.c	2013-07-31 00:05:48.000000000 +0800
@@ -2223,6 +2223,9 @@ static int __direct_map(struct kvm_vcpu
 	int pt_write = 0;
 	gfn_t pseudo_gfn;
 
+	if (!map_writable)
+		printk(KERN_ERR "%s: %s: gfn = %llu \n", __FILE__, __func__, gfn);
+
 	for_each_shadow_entry(vcpu, (u64)gfn << PAGE_SHIFT, iterator) {
 		if (iterator.level == level) {
 			unsigned pte_access = ACC_ALL;

I virsh-saved the VM and then virsh-restored it; so many GFNs were printed
that you could absolutely describe it as flooding.

>It's probably an issue with an older get_user_pages variant (either in
>kvm-kmod or the older kernel). Is there any indication of a similar issue
>with the upstream kernel?

I will test the upstream kvm host (https://git.kernel.org/pub/scm/virt/kvm/kvm.git)
later; if the problem is still there, I will revert the first bad commit,
612819c3c6e67bac8fceaa7cc402f13b1b63f7e4, on the upstream, then test it again.
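Roughly, that upstream test would look something like the sketch below
(illustrative only: the directory layout and kernel build/install commands
are assumptions about my host setup, and reverting such an old commit on the
current tree will likely need manual conflict resolution):

# fetch the upstream kvm tree and first try to reproduce the problem on it as-is
git clone https://git.kernel.org/pub/scm/virt/kvm/kvm.git
cd kvm
# if the degradation still reproduces, revert the suspected commit and retest
git revert 612819c3c6e67bac8fceaa7cc402f13b1b63f7e4
# rebuild and install the kernel and modules, reboot, then rerun the virsh save/restore test
make oldconfig && make -j8 && make modules_install && make install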
And I collected the VMEXIT statistics in the pre-save and post-restore
periods at the first bad commit version.

pre-save:
COTS-F10S03:~ # perf stat -e "kvm:*" -a sleep 30

 Performance counter stats for 'sleep 30':

         1222318 kvm:kvm_entry
               0 kvm:kvm_hypercall
               0 kvm:kvm_hv_hypercall
          351755 kvm:kvm_pio
            6703 kvm:kvm_cpuid
          692502 kvm:kvm_apic
         1234173 kvm:kvm_exit
          223956 kvm:kvm_inj_virq
               0 kvm:kvm_inj_exception
           16028 kvm:kvm_page_fault
           59872 kvm:kvm_msr
               0 kvm:kvm_cr
          169596 kvm:kvm_pic_set_irq
           81455 kvm:kvm_apic_ipi
          245103 kvm:kvm_apic_accept_irq
               0 kvm:kvm_nested_vmrun
               0 kvm:kvm_nested_intercepts
               0 kvm:kvm_nested_vmexit
               0 kvm:kvm_nested_vmexit_inject
               0 kvm:kvm_nested_intr_vmexit
               0 kvm:kvm_invlpga
               0 kvm:kvm_skinit
          853020 kvm:kvm_emulate_insn
          171140 kvm:kvm_set_irq
          171534 kvm:kvm_ioapic_set_irq
               0 kvm:kvm_msi_set_irq
           99276 kvm:kvm_ack_irq
          971166 kvm:kvm_mmio
           33722 kvm:kvm_fpu
               0 kvm:kvm_age_page
               0 kvm:kvm_try_async_get_page
               0 kvm:kvm_async_pf_not_present
               0 kvm:kvm_async_pf_ready
               0 kvm:kvm_async_pf_completed
               0 kvm:kvm_async_pf_doublefault

    30.019069018 seconds time elapsed

post-restore:
COTS-F10S03:~ # perf stat -e "kvm:*" -a sleep 30

 Performance counter stats for 'sleep 30':

         1327880 kvm:kvm_entry
               0 kvm:kvm_hypercall
               0 kvm:kvm_hv_hypercall
          375189 kvm:kvm_pio
            6925 kvm:kvm_cpuid
          804414 kvm:kvm_apic
         1339352 kvm:kvm_exit
          245922 kvm:kvm_inj_virq
               0 kvm:kvm_inj_exception
           15856 kvm:kvm_page_fault
           39500 kvm:kvm_msr
               1 kvm:kvm_cr
          179150 kvm:kvm_pic_set_irq
           98436 kvm:kvm_apic_ipi
          247430 kvm:kvm_apic_accept_irq
               0 kvm:kvm_nested_vmrun
               0 kvm:kvm_nested_intercepts
               0 kvm:kvm_nested_vmexit
               0 kvm:kvm_nested_vmexit_inject
               0 kvm:kvm_nested_intr_vmexit
               0 kvm:kvm_invlpga
               0 kvm:kvm_skinit
          955410 kvm:kvm_emulate_insn
          182240 kvm:kvm_set_irq
          182562 kvm:kvm_ioapic_set_irq
               0 kvm:kvm_msi_set_irq
          105267 kvm:kvm_ack_irq
         1113999 kvm:kvm_mmio
           37789 kvm:kvm_fpu
               0 kvm:kvm_age_page
               0 kvm:kvm_try_async_get_page
               0 kvm:kvm_async_pf_not_present
               0 kvm:kvm_async_pf_ready
               0 kvm:kvm_async_pf_completed
               0 kvm:kvm_async_pf_doublefault

    30.000779718 seconds time elapsed

Thanks,
Zhang Haoyu
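P.S. In case it helps, a rough way to break those kvm_exit counts down by
exit reason (a sketch only: the output file name is arbitrary, and the
"reason" field layout of the kvm_exit tracepoint depends on the kernel
version):

# record 30 seconds of kvm_exit events system-wide, then count them per exit reason
perf record -a -e kvm:kvm_exit -o kvm_exit.data sleep 30
perf script -i kvm_exit.data | grep -o 'reason [A-Z_]*' | sort | uniq -c | sort -rn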