From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1752687AbcGGKdq (ORCPT ); Thu, 7 Jul 2016 06:33:46 -0400
Received: from mail-wm0-f67.google.com ([74.125.82.67]:34750 "EHLO
	mail-wm0-f67.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1751405AbcGGKdo (ORCPT ); Thu, 7 Jul 2016 06:33:44 -0400
Subject: Re: [PATCH v3 2/2] KVM: nVMX: Fix preemption timer bit set in vmcs02 even if L1 doesn't enable it
To: Wanpeng Li
References: <1467863216-5521-1-git-send-email-wanpeng.li@hotmail.com>
	<1467863216-5521-2-git-send-email-wanpeng.li@hotmail.com>
Cc: "linux-kernel@vger.kernel.org", kvm, Wanpeng Li, Radim Krčmář,
	Yunhong Jiang, Jan Kiszka, Haozhong Zhang
From: Paolo Bonzini
Message-ID:
Date: Thu, 7 Jul 2016 12:33:40 +0200
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101 Thunderbird/45.1.1
MIME-Version: 1.0
In-Reply-To:
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On 07/07/2016 10:31, Wanpeng Li wrote:
> 2016-07-07 16:10 GMT+08:00 Paolo Bonzini :
>>
>>
>> On 07/07/2016 05:46, Wanpeng Li wrote:
>>> From: Wanpeng Li
>>>
>>> We will go to vcpu_run() loop after L0 emulates VMRESUME which incurs
>>> kvm_sched_out and kvm_sched_in operations since cond_resched() will be
>>> called once need resched. Preemption timer will be reprogrammed if vCPU
>>> is scheduled to a different pCPU. Then the preemption timer bit of vmcs02
>>> will be set if L0 enable preemption timer to run L1 even if L1 doesn't
>>> enable preemption timer to run L2.
>>>
>>> This patch fix it by don't reprogram preemption timer of vmcs02 if L1's
>>> vCPU is scheduled on diffent pCPU when we are in the way to vmresume
>>> nested guest.
>>
>> Again, this is wrong.
>> There is no reason why L1's APIC timer cannot be
>> emulated through the vmcs12's preemption timer setting. The only issue
>> is getting the pin-based execution controls right.
>
> This patch doesn't intend to implement "L1 TSC deadline timer to
> trigger while L2 is running", it just solves why vmcs02 is set even if
>
> exec_control = vmcs12->pin_based_vm_exec_control;
> exec_control |= vmcs_config.pin_based_exec_ctrl;
> exec_control &= ~PIN_BASED_VMX_PREEMPTION_TIMER;
>
> We should set pin-based execution controls right to implement "L1 TSC
> deadline timer to trigger while L2 is running".

Ok, now I get it, but I still cannot understand the logic in your patch.
You write:

	if (!is_guest_mode(vcpu) && kvm_lapic_hv_timer_in_use(vcpu) &&
	    kvm_x86_ops->set_hv_timer(vcpu,
			kvm_get_lapic_tscdeadline_msr(vcpu)))
		kvm_lapic_switch_to_sw_timer(vcpu);

but this means that while L2 runs you miss L1's APIC timer interrupt.
Do you want this instead:

	if (kvm_lapic_hv_timer_in_use(vcpu) &&
	    (is_guest_mode(vcpu) ||
	     kvm_x86_ops->set_hv_timer(vcpu,
			kvm_get_lapic_tscdeadline_msr(vcpu))))
		kvm_lapic_switch_to_sw_timer(vcpu);

?

Thanks,

Paolo
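
To make the difference between the two conditions concrete, here is a
minimal sketch in plain C. It is not kernel code: struct timer_state and
both function names are hypothetical, and the three booleans only stand
in for is_guest_mode(), kvm_lapic_hv_timer_in_use() and a nonzero return
from set_hv_timer(), which I am assuming means that programming the
hardware timer failed, as the snippets above suggest.

```c
#include <stdbool.h>

/* Hypothetical model of the state both conditions look at. */
struct timer_state {
	bool guest_mode;         /* stands in for is_guest_mode(vcpu) */
	bool hv_timer_in_use;    /* stands in for kvm_lapic_hv_timer_in_use(vcpu) */
	bool set_hv_timer_fails; /* assumed: set_hv_timer() returned nonzero */
};

/* Condition as written in the patch: nothing happens in guest mode. */
static bool patch_switches_to_sw(const struct timer_state *s)
{
	return !s->guest_mode && s->hv_timer_in_use && s->set_hv_timer_fails;
}

/*
 * Condition proposed in the reply: in guest mode the hardware timer is
 * not reprogrammed (short-circuit), and the software timer takes over.
 */
static bool proposed_switches_to_sw(const struct timer_state *s)
{
	return s->hv_timer_in_use &&
	       (s->guest_mode || s->set_hv_timer_fails);
}
```

The two functions only disagree when guest_mode and hv_timer_in_use are
both true. There the patch's condition neither reprograms the hardware
timer nor falls back to the software timer, which is the missed L1 APIC
timer interrupt described above.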