All of lore.kernel.org
 help / color / mirror / Atom feed
From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	stable@vger.kernel.org, Paolo Bonzini <pbonzini@redhat.com>,
	"Zhang, Yang Z" <yang.z.zhang@intel.com>,
	"Liu, RongrongX" <rongrongx.liu@intel.com>,
	Felipe Reyes <freyes@suse.com>,
	Wanpeng Li <wanpeng.li@linux.intel.com>
Subject: [PATCH 3.10 30/55] KVM: nVMX: fix "acknowledge interrupt on exit" when APICv is in use
Date: Wed,  3 Sep 2014 15:05:20 -0700	[thread overview]
Message-ID: <20140903220458.776892223@linuxfoundation.org> (raw)
In-Reply-To: <20140903220453.653576908@linuxfoundation.org>

3.10-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Wanpeng Li <wanpeng.li@linux.intel.com>

commit 56cc2406d68c0f09505c389e276f27a99f495cbd upstream.

After commit 77b0f5d (KVM: nVMX: Ack and write vector info to intr_info
if L1 asks us to), "Acknowledge interrupt on exit" behavior can be
emulated. To do so, KVM will ask the APIC for the interrupt vector if
during a nested vmexit if VM_EXIT_ACK_INTR_ON_EXIT is set.  With APICv,
kvm_get_apic_interrupt would return -1 and give the following WARNING:

Call Trace:
 [<ffffffff81493563>] dump_stack+0x49/0x5e
 [<ffffffff8103f0eb>] warn_slowpath_common+0x7c/0x96
 [<ffffffffa059709a>] ? nested_vmx_vmexit+0xa4/0x233 [kvm_intel]
 [<ffffffff8103f11a>] warn_slowpath_null+0x15/0x17
 [<ffffffffa059709a>] nested_vmx_vmexit+0xa4/0x233 [kvm_intel]
 [<ffffffffa0594295>] ? nested_vmx_exit_handled+0x6a/0x39e [kvm_intel]
 [<ffffffffa0537931>] ? kvm_apic_has_interrupt+0x80/0xd5 [kvm]
 [<ffffffffa05972ec>] vmx_check_nested_events+0xc3/0xd3 [kvm_intel]
 [<ffffffffa051ebe9>] inject_pending_event+0xd0/0x16e [kvm]
 [<ffffffffa051efa0>] vcpu_enter_guest+0x319/0x704 [kvm]

To fix this, we cannot rely on the processor's virtual interrupt delivery,
because "acknowledge interrupt on exit" must only update the virtual
ISR/PPR/IRR registers (and SVI, which is just a cache of the virtual ISR)
but it should not deliver the interrupt through the IDT.  Thus, KVM has
to deliver the interrupt "by hand", similar to the treatment of EOI in
commit fc57ac2c9ca8 (KVM: lapic: sync highest ISR to hardware apic on
EOI, 2014-05-14).

The patch modifies kvm_cpu_get_interrupt to always acknowledge an
interrupt; there are only two callers, and the other is not affected
because it is never reached with kvm_apic_vid_enabled() == true.  Then it
modifies apic_set_isr and apic_clear_irr to update SVI and RVI in addition
to the registers.

Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Suggested-by: "Zhang, Yang Z" <yang.z.zhang@intel.com>
Tested-by: Liu, RongrongX <rongrongx.liu@intel.com>
Tested-by: Felipe Reyes <freyes@suse.com>
Fixes: 77b0f5d67ff2781f36831cba79674c3e97bd7acf
Signed-off-by: Wanpeng Li <wanpeng.li@linux.intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 arch/x86/kvm/irq.c   |    2 -
 arch/x86/kvm/lapic.c |   52 ++++++++++++++++++++++++++++++++++++++-------------
 2 files changed, 40 insertions(+), 14 deletions(-)

--- a/arch/x86/kvm/irq.c
+++ b/arch/x86/kvm/irq.c
@@ -108,7 +108,7 @@ int kvm_cpu_get_interrupt(struct kvm_vcp
 
 	vector = kvm_cpu_get_extint(v);
 
-	if (kvm_apic_vid_enabled(v->kvm) || vector != -1)
+	if (vector != -1)
 		return vector;			/* PIC */
 
 	return kvm_get_apic_interrupt(v);	/* APIC */
--- a/arch/x86/kvm/lapic.c
+++ b/arch/x86/kvm/lapic.c
@@ -362,25 +362,46 @@ static inline int apic_find_highest_irr(
 
 static inline void apic_clear_irr(int vec, struct kvm_lapic *apic)
 {
-	apic->irr_pending = false;
+	struct kvm_vcpu *vcpu;
+
+	vcpu = apic->vcpu;
+
 	apic_clear_vector(vec, apic->regs + APIC_IRR);
-	if (apic_search_irr(apic) != -1)
-		apic->irr_pending = true;
+	if (unlikely(kvm_apic_vid_enabled(vcpu->kvm)))
+		/* try to update RVI */
+		kvm_make_request(KVM_REQ_EVENT, vcpu);
+	else {
+		vec = apic_search_irr(apic);
+		apic->irr_pending = (vec != -1);
+	}
 }
 
 static inline void apic_set_isr(int vec, struct kvm_lapic *apic)
 {
-	/* Note that we never get here with APIC virtualization enabled.  */
+	struct kvm_vcpu *vcpu;
+
+	if (__apic_test_and_set_vector(vec, apic->regs + APIC_ISR))
+		return;
+
+	vcpu = apic->vcpu;
 
-	if (!__apic_test_and_set_vector(vec, apic->regs + APIC_ISR))
-		++apic->isr_count;
-	BUG_ON(apic->isr_count > MAX_APIC_VECTOR);
 	/*
-	 * ISR (in service register) bit is set when injecting an interrupt.
-	 * The highest vector is injected. Thus the latest bit set matches
-	 * the highest bit in ISR.
+	 * With APIC virtualization enabled, all caching is disabled
+	 * because the processor can modify ISR under the hood.  Instead
+	 * just set SVI.
 	 */
-	apic->highest_isr_cache = vec;
+	if (unlikely(kvm_apic_vid_enabled(vcpu->kvm)))
+		kvm_x86_ops->hwapic_isr_update(vcpu->kvm, vec);
+	else {
+		++apic->isr_count;
+		BUG_ON(apic->isr_count > MAX_APIC_VECTOR);
+		/*
+		 * ISR (in service register) bit is set when injecting an interrupt.
+		 * The highest vector is injected. Thus the latest bit set matches
+		 * the highest bit in ISR.
+		 */
+		apic->highest_isr_cache = vec;
+	}
 }
 
 static inline int apic_find_highest_isr(struct kvm_lapic *apic)
@@ -1641,11 +1662,16 @@ int kvm_get_apic_interrupt(struct kvm_vc
 	int vector = kvm_apic_has_interrupt(vcpu);
 	struct kvm_lapic *apic = vcpu->arch.apic;
 
-	/* Note that we never get here with APIC virtualization enabled.  */
-
 	if (vector == -1)
 		return -1;
 
+	/*
+	 * We get here even with APIC virtualization enabled, if doing
+	 * nested virtualization and L1 runs with the "acknowledge interrupt
+	 * on exit" mode.  Then we cannot inject the interrupt via RVI,
+	 * because the process would deliver it through the IDT.
+	 */
+
 	apic_set_isr(vector, apic);
 	apic_update_ppr(apic);
 	apic_clear_irr(vector, apic);



  parent reply	other threads:[~2014-09-03 23:43 UTC|newest]

Thread overview: 60+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-09-03 22:04 [PATCH 3.10 00/55] 3.10.54-stable review Greg Kroah-Hartman
2014-09-03 22:04 ` [PATCH 3.10 01/55] stable_kernel_rules: Add pointer to netdev-FAQ for network patches Greg Kroah-Hartman
2014-09-03 22:04 ` [PATCH 3.10 02/55] HID: logitech: perform bounds checking on device_id early enough Greg Kroah-Hartman
2014-09-03 22:04 ` [PATCH 3.10 03/55] HID: fix a couple of off-by-ones Greg Kroah-Hartman
2014-09-03 22:04 ` [PATCH 3.10 04/55] isofs: Fix unbounded recursion when processing relocated directories Greg Kroah-Hartman
2014-09-03 22:04 ` [PATCH 3.10 05/55] USB: OHCI: dont lose track of EDs when a controller dies Greg Kroah-Hartman
2014-09-03 22:04 ` [PATCH 3.10 06/55] USB: serial: ftdi_sio: Annotate the current Xsens PID assignments Greg Kroah-Hartman
2014-09-03 22:04 ` [PATCH 3.10 07/55] USB: serial: ftdi_sio: Add support for new Xsens devices Greg Kroah-Hartman
2014-09-03 22:04 ` [PATCH 3.10 08/55] USB: ehci-pci: USB host controller support for Intel Quark X1000 Greg Kroah-Hartman
2014-09-03 22:04 ` [PATCH 3.10 09/55] USB: Fix persist resume of some SS USB devices Greg Kroah-Hartman
2014-09-03 22:05 ` [PATCH 3.10 10/55] ALSA: hda - fix an external mic jack problem on a HP machine Greg Kroah-Hartman
2014-09-03 22:05 ` [PATCH 3.10 11/55] ALSA: virtuoso: add Xonar Essence STX II support Greg Kroah-Hartman
2014-09-03 22:05 ` [PATCH 3.10 12/55] ALSA: hda/ca0132 - Dont try loading firmware at resume when already failed Greg Kroah-Hartman
2014-09-03 22:05 ` [PATCH 3.10 13/55] ALSA: hda/realtek - Avoid setting wrong COEF on ALC269 & co Greg Kroah-Hartman
2014-09-03 22:05 ` [PATCH 3.10 14/55] mei: start disconnect request timer consistently Greg Kroah-Hartman
2014-09-03 22:05 ` [PATCH 3.10 16/55] drm: omapdrm: fix compiler errors Greg Kroah-Hartman
2014-09-03 22:05 ` [PATCH 3.10 17/55] hwmon: (sis5595) Prevent overflow problem when writing large limits Greg Kroah-Hartman
2014-09-03 22:05 ` [PATCH 3.10 18/55] hwmon: (lm78) Fix overflow problems seen when writing large temperature limits Greg Kroah-Hartman
2014-09-03 22:05 ` [PATCH 3.10 19/55] hwmon: (gpio-fan) Prevent overflow problem when writing large limits Greg Kroah-Hartman
2014-09-03 22:05 ` [PATCH 3.10 20/55] hwmon: (ads1015) Fix off-by-one for valid channel index checking Greg Kroah-Hartman
2014-09-03 22:05 ` [PATCH 3.10 21/55] hwmon: (lm85) Fix various errors on attribute writes Greg Kroah-Hartman
2014-09-03 22:05 ` [PATCH 3.10 22/55] hwmon: (ads1015) Fix out-of-bounds array access Greg Kroah-Hartman
2014-09-03 22:05 ` [PATCH 3.10 23/55] hwmon: (dme1737) Prevent overflow problem when writing large limits Greg Kroah-Hartman
2014-09-03 22:05 ` [PATCH 3.10 24/55] drivers/i2c/busses: use correct type for dma_map/unmap Greg Kroah-Hartman
2014-09-03 22:05 ` [PATCH 3.10 25/55] ext4: fix ext4_discard_allocated_blocks() if we cant allocate the pa struct Greg Kroah-Hartman
2014-09-03 22:05 ` [PATCH 3.10 26/55] serial: core: Preserve termios c_cflag for console resume Greg Kroah-Hartman
2014-09-03 22:05 ` [PATCH 3.10 27/55] crypto: ux500 - make interrupt mode plausible Greg Kroah-Hartman
2014-09-03 22:05 ` [PATCH 3.10 28/55] KVM: x86: Inter-privilege level ret emulation is not implemeneted Greg Kroah-Hartman
2014-09-03 22:05 ` [PATCH 3.10 29/55] KVM: x86: always exit on EOIs for interrupts listed in the IOAPIC redir table Greg Kroah-Hartman
2014-09-03 22:05 ` Greg Kroah-Hartman [this message]
2014-09-03 22:05 ` [PATCH 3.10 31/55] Revert "KVM: x86: Increase the number of fixed MTRR regs to 10" Greg Kroah-Hartman
2014-09-03 22:05 ` [PATCH 3.10 32/55] kvm: iommu: fix the third parameter of kvm_iommu_put_pages (CVE-2014-3601) Greg Kroah-Hartman
2014-09-03 22:05 ` [PATCH 3.10 33/55] ext4: fix BUG_ON in mb_free_blocks() Greg Kroah-Hartman
2014-09-03 22:05 ` [PATCH 3.10 34/55] drm/radeon: add additional SI pci ids Greg Kroah-Hartman
2014-09-03 22:05 ` [PATCH 3.10 35/55] x86: dont exclude low BIOS area when allocating address space for non-PCI cards Greg Kroah-Hartman
2014-09-03 22:05 ` [PATCH 3.10 36/55] x86_64/vsyscall: Fix warn_bad_vsyscall log output Greg Kroah-Hartman
2014-09-03 22:05 ` [PATCH 3.10 38/55] hpsa: fix bad -ENOMEM return value in hpsa_big_passthru_ioctl Greg Kroah-Hartman
2014-09-03 22:05 ` [PATCH 3.10 39/55] Btrfs: fix csum tree corruption, duplicate and outdated checksums Greg Kroah-Hartman
2014-09-03 22:05 ` [PATCH 3.10 40/55] mei: reset client state on queued connect request Greg Kroah-Hartman
2014-09-03 22:05 ` [PATCH 3.10 41/55] mei: nfc: fix memory leak in error path Greg Kroah-Hartman
2014-09-03 22:05 ` [PATCH 3.10 42/55] jbd2: fix infinite loop when recovering corrupt journal blocks Greg Kroah-Hartman
2014-09-03 22:05 ` [PATCH 3.10 43/55] Staging: speakup: Update __speakup_paste_selection() tty (ab)usage to match vt Greg Kroah-Hartman
2014-09-03 22:05 ` [PATCH 3.10 44/55] xhci: Treat not finding the event_seg on COMP_STOP the same as COMP_STOP_INVAL Greg Kroah-Hartman
2014-09-03 22:05 ` [PATCH 3.10 45/55] usb: xhci: amd chipset also needs short TX quirk Greg Kroah-Hartman
2014-09-03 22:05 ` [PATCH 3.10 46/55] ARM: OMAP2+: hwmod: Rearm wake-up interrupts for DT when MUSB is idled Greg Kroah-Hartman
2014-09-03 22:05 ` [PATCH 3.10 47/55] USB: ftdi_sio: add Basic Micro ATOM Nano USB2Serial PID Greg Kroah-Hartman
2014-09-03 22:05 ` [PATCH 3.10 49/55] USB: whiteheat: Added bounds checking for bulk command response Greg Kroah-Hartman
2014-09-03 22:05 ` [PATCH 3.10 50/55] usb: hub: Prevent hub autosuspend if usbcore.autosuspend is -1 Greg Kroah-Hartman
2014-09-03 22:05 ` [PATCH 3.10 51/55] NFSD: Decrease nfsd_users in nfsd_startup_generic fail Greg Kroah-Hartman
2014-09-03 22:05 ` [PATCH 3.10 52/55] svcrdma: Select NFSv4.1 backchannel transport based on forward channel Greg Kroah-Hartman
2014-09-03 22:05 ` [PATCH 3.10 53/55] NFSv3: Fix another acl regression Greg Kroah-Hartman
2014-09-03 22:05 ` [PATCH 3.10 54/55] NFSv4: Fix problems with close in the presence of a delegation Greg Kroah-Hartman
2014-09-03 22:05 ` [PATCH 3.10 55/55] vm_is_stack: use for_each_thread() rather then buggy while_each_thread() Greg Kroah-Hartman
2014-09-03 22:41 ` [PATCH 3.10 00/55] 3.10.54-stable review Greg Kroah-Hartman
2014-09-03 23:44 ` Greg Kroah-Hartman
2014-09-04  4:50   ` Guenter Roeck
2014-09-04 14:02     ` Greg Kroah-Hartman
2014-09-04 13:36 ` Shuah Khan
2014-09-04 13:52   ` Usyskin, Alexander
2014-09-04 13:52     ` Usyskin, Alexander

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20140903220458.776892223@linuxfoundation.org \
    --to=gregkh@linuxfoundation.org \
    --cc=freyes@suse.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=pbonzini@redhat.com \
    --cc=rongrongx.liu@intel.com \
    --cc=stable@vger.kernel.org \
    --cc=wanpeng.li@linux.intel.com \
    --cc=yang.z.zhang@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.