From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1756199AbcIPBII (ORCPT ); Thu, 15 Sep 2016 21:08:08 -0400
Received: from mail-wm0-f66.google.com ([74.125.82.66]:35146 "EHLO
	mail-wm0-f66.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1751502AbcIPBH5 (ORCPT ); Thu, 15 Sep 2016 21:07:57 -0400
MIME-Version: 1.0
In-Reply-To: <1473153633-4725-1-git-send-email-wanpeng.li@hotmail.com>
References: <1473153633-4725-1-git-send-email-wanpeng.li@hotmail.com>
From: Wanpeng Li
Date: Fri, 16 Sep 2016 09:07:54 +0800
Message-ID:
Subject: Re: [PATCH] KVM: nVMX: Fix reload apic access page warning
To: "linux-kernel@vger.kernel.org", kvm
Cc: Wanpeng Li, Paolo Bonzini, =?UTF-8?B?UmFkaW0gS3LEjW3DocWZ?=, Yunhong Jiang
Content-Type: text/plain; charset=UTF-8
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Transfer-Encoding: 8bit
X-MIME-Autoconverted: from quoted-printable to 8bit by mail.home.local id u8G18Gme009428

Ping, :)

2016-09-06 17:20 GMT+08:00 Wanpeng Li :
> From: Wanpeng Li
> WARNING: CPU: 1 PID: 4230 at kernel/sched/core.c:7564 __might_sleep+0x7e/0x80
> do not call blocking ops when !TASK_RUNNING; state=1 set at [] prepare_to_swait+0x39/0xa0
> CPU: 1 PID: 4230 Comm: qemu-system-x86 Not tainted 4.8.0-rc5+ #47
> Call Trace:
>  dump_stack+0x99/0xd0
>  __warn+0xd1/0xf0
>  warn_slowpath_fmt+0x4f/0x60
>  ? prepare_to_swait+0x39/0xa0
>  ? prepare_to_swait+0x39/0xa0
>  __might_sleep+0x7e/0x80
>  __gfn_to_pfn_memslot+0x156/0x480 [kvm]
>  gfn_to_pfn+0x2a/0x30 [kvm]
>  gfn_to_page+0xe/0x20 [kvm]
>  kvm_vcpu_reload_apic_access_page+0x32/0xa0 [kvm]
>  nested_vmx_vmexit+0x765/0xca0 [kvm_intel]
>  ? _raw_spin_unlock_irqrestore+0x36/0x80
>  vmx_check_nested_events+0x49/0x1f0 [kvm_intel]
>  kvm_arch_vcpu_runnable+0x2d/0xe0 [kvm]
>  kvm_vcpu_check_block+0x12/0x60 [kvm]
>  kvm_vcpu_block+0x94/0x4c0 [kvm]
>  kvm_arch_vcpu_ioctl_run+0x619/0x1aa0 [kvm]
>  ? kvm_arch_vcpu_ioctl_run+0xdf1/0x1aa0 [kvm]
>  kvm_vcpu_ioctl+0x2d3/0x7c0 [kvm]
>
> ===============================
> [ INFO: suspicious RCU usage. ]
> 4.8.0-rc5+ #47 Not tainted
> -------------------------------
> ./include/linux/kvm_host.h:535 suspicious rcu_dereference_check() usage!
>
> other info that might help us debug this:
>
>
> rcu_scheduler_active = 1, debug_locks = 0
> 1 lock held by qemu-system-x86/4230:
>  #0: (&vcpu->mutex){+.+.+.}, at: [] vcpu_load+0x1c/0x60 [kvm]
>
> stack backtrace:
> CPU: 1 PID: 4230 Comm: qemu-system-x86 Not tainted 4.8.0-rc5+ #47
> Call Trace:
>  dump_stack+0x99/0xd0
>  lockdep_rcu_suspicious+0xe7/0x120
>  gfn_to_memslot+0x12a/0x140 [kvm]
>  gfn_to_pfn+0x12/0x30 [kvm]
>  gfn_to_page+0xe/0x20 [kvm]
>  kvm_vcpu_reload_apic_access_page+0x32/0xa0 [kvm]
>  nested_vmx_vmexit+0x765/0xca0 [kvm_intel]
>  ? _raw_spin_unlock_irqrestore+0x36/0x80
>  vmx_check_nested_events+0x49/0x1f0 [kvm_intel]
>  kvm_arch_vcpu_runnable+0x2d/0xe0 [kvm]
>  kvm_vcpu_check_block+0x12/0x60 [kvm]
>  kvm_vcpu_block+0x94/0x4c0 [kvm]
>  kvm_arch_vcpu_ioctl_run+0x619/0x1aa0 [kvm]
>  ? kvm_arch_vcpu_ioctl_run+0xdf1/0x1aa0 [kvm]
>  kvm_vcpu_ioctl+0x2d3/0x7c0 [kvm]
>  ? __fget+0xfd/0x210
>  ? __lock_is_held+0x54/0x70
>  do_vfs_ioctl+0x96/0x6a0
>  ? __fget+0x11c/0x210
>  ? __fget+0x5/0x210
>  SyS_ioctl+0x79/0x90
>  do_syscall_64+0x81/0x220
>  entry_SYSCALL64_slow_path+0x25/0x25
>
> These can be triggered by running kvm-unit-test: ./x86-run x86/vmx.flat
>
> The nested preemption timer is based on an hrtimer that is started on L2
> entry, stopped on L2 exit, and evaluated via the new check_nested_events
> hook. The current logic adds the vCPU to a simple waitqueue
> (TASK_INTERRUPTIBLE) when it needs to yield the pCPU, and it accesses
> memslots without holding the srcu read lock. Both can happen in the
> nested preemption timer evaluation path, which results in the warnings
> above.
>
> This patch fixes it by using a request bit to reload the APIC access
> page asynchronously before vmentry, instead of reloading it directly
> during nested preemption timer evaluation. This is safe because vmcs01
> is loaded and we are currently in a nested vmexit.
>
> Cc: Paolo Bonzini
> Cc: Radim Krčmář
> Cc: Yunhong Jiang
> Signed-off-by: Wanpeng Li
> ---
>  arch/x86/kvm/vmx.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
> index 5cede40..ee059ce 100644
> --- a/arch/x86/kvm/vmx.c
> +++ b/arch/x86/kvm/vmx.c
> @@ -10793,7 +10793,7 @@ static void nested_vmx_vmexit(struct kvm_vcpu *vcpu, u32 exit_reason,
> 	 * We are now running in L2, mmu_notifier will force to reload the
> 	 * page's hpa for L2 vmcs. Need to reload it for L1 before entering L1.
> 	 */
> -	kvm_vcpu_reload_apic_access_page(vcpu);
> +	kvm_make_request(KVM_REQ_APIC_PAGE_RELOAD, vcpu);
>
> 	/*
> 	 * Exiting from L2 to L1, we're now back to L1 which thinks it just
> --
> 1.9.1
>

-- 
Regards,
Wanpeng Li
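
For anyone skimming the thread, the deferral idea in the patch can be
sketched in plain userspace C. This is only a toy model under stated
assumptions, not the KVM code: every toy_* name, the TOY_REQ_APIC_PAGE_RELOAD
bit, and the may_sleep flag are hypothetical stand-ins for illustration. The
nested vmexit path only records a request bit, and the reload that may block
runs later, just before guest entry.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical request bit, standing in for KVM_REQ_APIC_PAGE_RELOAD. */
#define TOY_REQ_APIC_PAGE_RELOAD 0

/* Toy vCPU state; the field names are assumptions, not KVM's layout. */
struct toy_vcpu {
	unsigned long requests;	/* pending request bits */
	bool may_sleep;		/* false while queued on the waitqueue */
	int reload_count;	/* number of reloads performed */
};

/* Record that work is needed. Never blocks, so any context may call it. */
static void toy_make_request(int req, struct toy_vcpu *vcpu)
{
	vcpu->requests |= 1UL << req;
}

/* Test and clear a pending request bit. */
static bool toy_check_request(int req, struct toy_vcpu *vcpu)
{
	unsigned long mask = 1UL << req;

	if (!(vcpu->requests & mask))
		return false;
	vcpu->requests &= ~mask;
	return true;
}

/* Stand-in for the reload; the assert models the __might_sleep() check. */
static void toy_reload_apic_access_page(struct toy_vcpu *vcpu)
{
	assert(vcpu->may_sleep);
	vcpu->reload_count++;
}

/* Nested vmexit path: may run while the task is not TASK_RUNNING. */
static void toy_nested_vmexit(struct toy_vcpu *vcpu)
{
	toy_make_request(TOY_REQ_APIC_PAGE_RELOAD, vcpu);
}

/* Just before guest entry: a context where blocking is acceptable. */
static void toy_prepare_guest_entry(struct toy_vcpu *vcpu)
{
	vcpu->may_sleep = true;
	if (toy_check_request(TOY_REQ_APIC_PAGE_RELOAD, vcpu))
		toy_reload_apic_access_page(vcpu);
}
```

Calling toy_nested_vmexit() with may_sleep set to false leaves reload_count
at 0 and only sets the bit; the next toy_prepare_guest_entry() performs the
reload exactly once and clears the bit.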