All of lore.kernel.org
 help / color / mirror / Atom feed
From: Ladi Prosek <lprosek@redhat.com>
To: Peter Xu <peterx@redhat.com>
Cc: KVM list <kvm@vger.kernel.org>, Radim Krcmar <rkrcmar@redhat.com>
Subject: Re: [PATCH] KVM: x86: implement IOAPIC_REG_EOI for directed EOI support
Date: Wed, 12 Apr 2017 12:28:25 +0200	[thread overview]
Message-ID: <CABdb737CfM_1XH=eodQCC8Wd9WPUzo1c2+Z6iG1wiTpt=J=X9g@mail.gmail.com> (raw)
In-Reply-To: <CABdb735xBo01BvruS-ZrDsGZ6EL49suHbqGPivRdGsEpahbM-g@mail.gmail.com>

On Wed, Apr 12, 2017 at 11:37 AM, Ladi Prosek <lprosek@redhat.com> wrote:
> On Wed, Apr 12, 2017 at 11:06 AM, Peter Xu <peterx@redhat.com> wrote:
>> On Wed, Apr 12, 2017 at 09:36:58AM +0200, Ladi Prosek wrote:
>>> On Wed, Apr 12, 2017 at 8:40 AM, Peter Xu <peterx@redhat.com> wrote:
>>> > On Tue, Apr 11, 2017 at 04:11:15PM +0200, Ladi Prosek wrote:
>>> >> If the guest takes advantage of the directed EOI feature by setting
>>> >> APIC_SPIV_DIRECTED_EOI, it is expected to signal EOI by writing to
>>> >> the EOI register of the respective IOAPIC.
>>> >>
>>> >> From Intel's x2APIC Specification:
>>> >> "following the EOI to the local x2APIC unit for a level triggered
>>> >> interrupt, perform a directed EOI to the IOxAPIC generating the
>>> >> interrupt by writing to its EOI register."
>>> >>
>>> >> Commit fc61b800f950 ("KVM: Add Directed EOI support to APIC emulation")
>>> >> inhibited EOI on LAPIC EOI register write but didn't add the IOAPIC
>>> >> part. IOAPIC_REG_EOI writes were handled only on IA64 and the code
>>> >> was later removed with the rest of IA64 support.
>>> >>
>>> >> The bug has gone undetected for a long time because Linux writes to
>>> >> IOAPIC_REG_EOI only if the IOAPIC version is >=0x20. Windows doesn't
>>> >> seem to perform such a check.
>>> >
>>> > Hi, Ladi,
>>>
>>> Hi Peter,
>>>
>>> > Not sure I'm understanding it correctly... I see "direct EOI" is a
>>> > feature for IOAPIC version 0x20, while "suppress EOI-broadcast" is
>>> > another feature for APIC. They are not the same feature, so it may not
>>> > be required to have them all together. IIUC current x86 kvm is just
>>> > the case - it supports EOI broadcast suppression on APIC, but it does
>>> > not support direct EOI on kernel IOAPIC.
>>>
>>> Thanks, that makes perfect sense and explains why Linux behaves the
>>> way it does (__eoi_ioapic_pin in arch/x86/kernel/apic/io_apic.c).
>>>
>>> This document makes it look like suppress EOI-broadcast always implies
>>> directed EOI, though:
>>>
>>> http://www.intel.com/content/dam/doc/specification-update/64-architecture-x2apic-specification.pdf
>>>
>>> NB "The support for Directed EOI capability can be detected by means
>>> of bit 24 in the Local APIC Version Register. "
>>>
>>> There is no mention of APIC version or any other detection mechanism
>>> for directed EOI. Maybe the chip being x2APIC implies version >= 0x20
>>> but I don't see that in the document either.
>>>
>>> I suspect that Microsoft implemented EOI by following this same spec.
>>> Level-triggered interrupts don't work right on Windows Server 2016
>>> with Hyper-V enabled without this patch.
>>
>> Yes, the documents for IOAPIC is always hard to find, at least for
>> me...
>>
>> There is some pages mentioned IOAPIC in ICH9 manual on chap 13.5 here:
>> http://www.intel.com/content/dam/doc/datasheet/io-controller-hub-9-datasheet.pdf
>>
>> However I see nothing related to how the IOAPIC version is defined. In
>> that sense, the comment above __eoi_ioapic_pin() seems to be better. :)
>>
>>>
>>> > I think the problem is why the guest setup APIC_SPIV_DIRECTED_EOI even
>>> > if IOAPIC does not support direct EOI (the guest can know it by
>>> > probing IOAPIC version). Please correct if I'm wrong.
>>>
>>> Yes, I think that the guest is to blame here. We might add that to the
>>> commit message.
>>
>> Agreed.
>>
>>>
>>> >>
>>> >> This commit re-adds IOAPIC_REG_EOI and implements it in terms of
>>> >> __kvm_ioapic_update_eoi.
>>> >>
>>> >> Fixes: fc61b800f950 ("KVM: Add Directed EOI support to APIC emulation")
>>> >> Signed-off-by: Ladi Prosek <lprosek@redhat.com>
>>> >> ---
>>> >>  arch/x86/kvm/ioapic.c | 46 ++++++++++++++++++++++++++++------------------
>>> >>  arch/x86/kvm/ioapic.h |  1 +
>>> >>  2 files changed, 29 insertions(+), 18 deletions(-)
>>> >>
>>> >> diff --git a/arch/x86/kvm/ioapic.c b/arch/x86/kvm/ioapic.c
>>> >> index 289270a..8df1c6c 100644
>>> >> --- a/arch/x86/kvm/ioapic.c
>>> >> +++ b/arch/x86/kvm/ioapic.c
>>> >> @@ -415,14 +415,15 @@ static void kvm_ioapic_eoi_inject_work(struct work_struct *work)
>>> >>  #define IOAPIC_SUCCESSIVE_IRQ_MAX_COUNT 10000
>>> >>
>>> >>  static void __kvm_ioapic_update_eoi(struct kvm_vcpu *vcpu,
>>> >> -                     struct kvm_ioapic *ioapic, int vector, int trigger_mode)
>>> >> +                     struct kvm_ioapic *ioapic, int vector, int trigger_mode,
>>> >> +                     bool directed)
>>> >>  {
>>> >>       struct dest_map *dest_map = &ioapic->rtc_status.dest_map;
>>> >>       struct kvm_lapic *apic = vcpu->arch.apic;
>>> >>       int i;
>>> >>
>>> >>       /* RTC special handling */
>>> >> -     if (test_bit(vcpu->vcpu_id, dest_map->map) &&
>>> >> +     if (!directed && test_bit(vcpu->vcpu_id, dest_map->map) &&
>>> >>           vector == dest_map->vectors[vcpu->vcpu_id])
>>> >>               rtc_irq_eoi(ioapic, vcpu);
>>> >>
>>> >> @@ -432,21 +433,23 @@ static void __kvm_ioapic_update_eoi(struct kvm_vcpu *vcpu,
>>> >>               if (ent->fields.vector != vector)
>>> >>                       continue;
>>> >>
>>> >> -             /*
>>> >> -              * We are dropping lock while calling ack notifiers because ack
>>> >> -              * notifier callbacks for assigned devices call into IOAPIC
>>> >> -              * recursively. Since remote_irr is cleared only after call
>>> >> -              * to notifiers if the same vector will be delivered while lock
>>> >> -              * is dropped it will be put into irr and will be delivered
>>> >> -              * after ack notifier returns.
>>> >> -              */
>>> >> -             spin_unlock(&ioapic->lock);
>>> >> -             kvm_notify_acked_irq(ioapic->kvm, KVM_IRQCHIP_IOAPIC, i);
>>> >> -             spin_lock(&ioapic->lock);
>>> >> -
>>> >> -             if (trigger_mode != IOAPIC_LEVEL_TRIG ||
>>> >> -                 kvm_lapic_get_reg(apic, APIC_SPIV) & APIC_SPIV_DIRECTED_EOI)
>>> >> -                     continue;
>>> >> +             if (!directed) {
>>> >
>>> > Could I ask why we need to skip this if the EOI is sent via direct EOI
>>> > register of IOAPIC?
>>>
>>> Because it's already been done as part of the local EOI. With directed
>>> EOI we hit this function twice, first time when doing the local EOI
>>> and then the newly added code path for IOAPIC EOI with directed=true.
>>>
>>> I, again, followed the above mentioned document which explicitly
>>> dictates the sequence. And I mechanically split the function to the
>>> "local part' - what it had been doing up to the continue statement -
>>> and the "directed part" - what it had been skipping. I'll admit that
>>> my familiarity with this code is limited and there may be a better way
>>> to do this.
>>
>> Instead of the "!directed" flag (which is imho duplicated with what
>> APIC_SPIV_DIRECTED_EOI means), do you like below fix?
>>
>> -----8<-----
>> diff --git a/arch/x86/kvm/ioapic.c b/arch/x86/kvm/ioapic.c
>> index 6e219e5..78d3ec8 100644
>> --- a/arch/x86/kvm/ioapic.c
>> +++ b/arch/x86/kvm/ioapic.c
>> @@ -444,8 +444,7 @@ static void __kvm_ioapic_update_eoi(struct kvm_vcpu *vcpu,
>>                 kvm_notify_acked_irq(ioapic->kvm, KVM_IRQCHIP_IOAPIC, i);
>>                 spin_lock(&ioapic->lock);
>>
>> -               if (trigger_mode != IOAPIC_LEVEL_TRIG ||
>> -                   kvm_lapic_get_reg(apic, APIC_SPIV) & APIC_SPIV_DIRECTED_EOI)
>> +               if (trigger_mode != IOAPIC_LEVEL_TRIG)
>>                         continue;
>>
>>                 ASSERT(ent->fields.trig_mode == IOAPIC_LEVEL_TRIG);
>> @@ -473,10 +472,15 @@ static void __kvm_ioapic_update_eoi(struct kvm_vcpu *vcpu,
>>         }
>>  }
>>
>> +/* This should only be triggered by APIC EOI broadcast */
>>  void kvm_ioapic_update_eoi(struct kvm_vcpu *vcpu, int vector, int trigger_mode)
>>  {
>>         struct kvm_ioapic *ioapic = vcpu->kvm->arch.vioapic;
>>
>> +       /* If we'll be using direct EOI, skip broadcast */
>> +       if (kvm_lapic_get_reg(apic, APIC_SPIV) & APIC_SPIV_DIRECTED_EOI)
>> +               return;
>> +
>
> I've only seen the direct EOI sent for level irqs so I'm afraid that
> __kvm_ioapic_update_eoi needs to run for edge-triggered even if the
> APIC_SPIV_DIRECTED_EOI flag is set.
>
> Other than that it looks reasonable.

Although, wait, what if the guest uses APIC_SPIV_DIRECTED_EOI to
suppress the broadcast but then does EOI by writing to the IOAPIC
routing entry? You kind of indicated that this would be a valid use of
the feature. This is what __eoi_ioapic_pin does for version<0x20 and
on the host side we reset the remote_irr in ioapic_write_indirect if
I'm reading the code correctly. Wouldn't we want to deliver the
notification via kvm_notify_acked_irq in this case also?

Thanks!

>>         spin_lock(&ioapic->lock);
>>         __kvm_ioapic_update_eoi(vcpu, ioapic, vector, trigger_mode);
>>         spin_unlock(&ioapic->lock);
>> ---->8----
>>
>> This patch along will break kvm_notify_acked_irq() in some way I
>> guess, but if with your patch (though will possibly need to boost
>> IOAPIC version to 0x20 as well), it should work fine as long as guest
>> remembers to send the direct EOI.
>
> Not sure about the version boost, especially since we don't have a
> good spec to define what the version means. Maybe only if it helps
> Linux performance. In theory __eoi_ioapic_pin should be causing fewer
> vmexits with version>=0x20.
>
>> Thanks,
>>
>> --
>> Peter Xu

  reply	other threads:[~2017-04-12 10:28 UTC|newest]

Thread overview: 16+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-04-11 14:11 [PATCH] KVM: x86: implement IOAPIC_REG_EOI for directed EOI support Ladi Prosek
2017-04-12  6:40 ` Peter Xu
2017-04-12  7:36   ` Ladi Prosek
2017-04-12  9:06     ` Peter Xu
2017-04-12  9:37       ` Ladi Prosek
2017-04-12 10:28         ` Ladi Prosek [this message]
2017-04-12 11:57           ` Peter Xu
2017-04-12 12:20             ` Ladi Prosek
2017-04-13  8:32               ` Peter Xu
2017-04-13 14:09                 ` Radim Krcmar
2017-04-13 15:11                   ` Ladi Prosek
2017-04-14  5:28                   ` Paolo Bonzini
2017-04-12 20:52       ` Radim Krcmar
2017-04-13  6:36         ` Ladi Prosek
2017-04-13 14:15           ` Radim Krcmar
2017-04-14  5:27           ` Paolo Bonzini

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CABdb737CfM_1XH=eodQCC8Wd9WPUzo1c2+Z6iG1wiTpt=J=X9g@mail.gmail.com' \
    --to=lprosek@redhat.com \
    --cc=kvm@vger.kernel.org \
    --cc=peterx@redhat.com \
    --cc=rkrcmar@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.