From mboxrd@z Thu Jan  1 00:00:00 1970
From: Steve Rutherford <srutherford@google.com>
Subject: Re: [RFC PATCH 2/4] KVM: x86: Add KVM exit for IOAPIC EOIs
Date: Thu, 28 May 2015 14:58:06 -0700
Message-ID: <20150528215806.GA330@google.com>
References: <1431481652-27268-1-git-send-email-srutherford@google.com>
 <1431481652-27268-2-git-send-email-srutherford@google.com>
 <5562004B.6010501@gmail.com>
 <20150527020635.GA26023@google.com>
 <556556D4.60303@gmail.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Cc: kvm@vger.kernel.org, ahonig@google.com
To: Avi Kivity <avi.kivity@gmail.com>
Return-path: <kvm-owner@vger.kernel.org>
Received: from mail-ie0-f178.google.com ([209.85.223.178]:35549 "EHLO
	mail-ie0-f178.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1754667AbbE1V6L (ORCPT <rfc822;kvm@vger.kernel.org>);
	Thu, 28 May 2015 17:58:11 -0400
Received: by iesa3 with SMTP id a3so49322524ies.2
        for <kvm@vger.kernel.org>; Thu, 28 May 2015 14:58:10 -0700 (PDT)
Content-Disposition: inline
In-Reply-To: <556556D4.60303@gmail.com>
Sender: kvm-owner@vger.kernel.org
List-ID: <kvm.vger.kernel.org>

On Wed, May 27, 2015 at 08:32:04AM +0300, Avi Kivity wrote:
> On 05/27/2015 05:06 AM, Steve Rutherford wrote:
> >On Sun, May 24, 2015 at 07:46:03PM +0300, Avi Kivity wrote:
> >>On 05/13/2015 04:47 AM, Steve Rutherford wrote:
> >>>Adds KVM_EXIT_IOAPIC_EOI which passes the interrupt vector up to
> >>>userspace.
> >>>
> >>>Uses a per VCPU exit bitmap to decide whether or not the IOAPIC needs
> >>>to be informed (which is identical to the EOI_EXIT_BITMAP field used
> >>>by modern x86 processors, but can also be used to elide kvm IOAPIC EOI
> >>>exits on older processors).
> >>>
> >>>[Note: A prototype using ResampleFDs found that decoupling the EOI
> >>>from the VCPU's thread made it possible for the VCPU to not see a
> >>>recent EOI after reentering the guest. This does not match real
> >>>hardware.]
> >>>
> >>>Compile tested for Intel x86.
> >>>
> >>>Signed-off-by: Steve Rutherford <srutherford@google.com>
> >>>---
> >>>  Documentation/virtual/kvm/api.txt | 10 ++++++++++
> >>>  arch/x86/include/asm/kvm_host.h   |  3 +++
> >>>  arch/x86/kvm/lapic.c              |  9 +++++++++
> >>>  arch/x86/kvm/x86.c                | 11 +++++++++++
> >>>  include/linux/kvm_host.h          |  1 +
> >>>  include/uapi/linux/kvm.h          |  5 +++++
> >>>  6 files changed, 39 insertions(+)
> >>>
> >>>diff --git a/Documentation/virtual/kvm/api.txt b/Documentation/virtual/kvm/api.txt
> >>>index 0744b4e..dd92996 100644
> >>>--- a/Documentation/virtual/kvm/api.txt
> >>>+++ b/Documentation/virtual/kvm/api.txt
> >>>@@ -3285,6 +3285,16 @@ Valid values for 'type' are:
> >>>  	 */
> >>>  	__u64 kvm_valid_regs;
> >>>  	__u64 kvm_dirty_regs;
> >>>+
> >>>+	/* KVM_EXIT_IOAPIC_EOI */
> >>>+        struct {
> >>>+	       __u8 vector;
> >>>+        } eoi;
> >>>+
> >>>+Indicates that an eoi of a level triggered IOAPIC interrupt on vector has
> >>>+occurred, which should be handled by the userspace IOAPIC. Triggers when
> >>>+the Irqchip has been split between userspace and the kernel.
> >>>+
> >>The ioapic is a global resource, so it doesn't make sense for
> >>information about it to be returned in a per-vcpu structure
> >EOI exits are a per-vcpu behavior, so this doesn't seem all that strange.
> >
> >>(or to block the vcpu while it is being processed).
> >Blocking doesn't feel clean, but doesn't seem all that bad, given
> >that these operations are relatively rare on modern configurations.
> 
> Agree, maybe the realtime people have an interest here.
> 
> >>The way I'd model it is to emulate the APIC bus that connects local
> >>APICs and the IOAPIC, using a socket pair.  When the user-space
> >>ioapic wants to inject an interrupt, it sends a message to the local
> >>APICs which then inject it, and when it's ack'ed the EOI is sent
> >>back on the same bus.
> >Although I'm not certain about this, it sounds to me like this would
> >require a kernel thread to be waiting (in some way) on this socket, which
> >seems rather heavy handed.
> 
> It's been a while since I did kernel programming, but I think you
> can queue a callback to be called when an I/O is ready, and not
> require a thread.  IIRC we do that with irqfd to cause an interrupt
> to be injected.
> 

This should be possible, but it's going to add a ton of complexity, and
I don't really see any compelling benefits. If there is a compelling
reason to switch to a socket-based interface, I'm definitely willing to
refactor.

Steve