From: Marcelo Tosatti <mtosatti@redhat.com>
To: Wanpeng Li <kernellwp@gmail.com>
Cc: "Paolo Bonzini" <pbonzini@redhat.com>,
	LKML <linux-kernel@vger.kernel.org>, kvm <kvm@vger.kernel.org>,
	"Radim Krčmář" <rkrcmar@redhat.com>
Subject: Re: [PATCH v4 2/5] KVM: LAPIC: inject lapic timer interrupt by posted interrupt
Date: Wed, 26 Jun 2019 13:44:08 -0300
Message-ID: <20190626164401.GA2211@amt.cnet>
In-Reply-To: <CANRm+CzmraRUNQfTWNZ3Bu5dJhjvL1eE9+=c2i_vwtYYT9ao2w@mail.gmail.com>

On Wed, Jun 26, 2019 at 07:02:13PM +0800, Wanpeng Li wrote:
> On Wed, 26 Jun 2019 at 03:03, Marcelo Tosatti <mtosatti@redhat.com> wrote:
> >
> > On Mon, Jun 24, 2019 at 04:53:53PM +0800, Wanpeng Li wrote:
> > > On Sat, 22 Jun 2019 at 06:11, Marcelo Tosatti <mtosatti@redhat.com> wrote:
> > > >
> > > > On Fri, Jun 21, 2019 at 09:42:39AM +0800, Wanpeng Li wrote:
> > > > > On Thu, 20 Jun 2019 at 05:04, Marcelo Tosatti <mtosatti@redhat.com> wrote:
> > > > > >
> > > > > > Hi Li,
> > > > > >
> > > > > > On Wed, Jun 19, 2019 at 08:36:06AM +0800, Wanpeng Li wrote:
> > > > > > > On Tue, 18 Jun 2019 at 21:36, Marcelo Tosatti <mtosatti@redhat.com> wrote:
> > > > > > > >
> > > > > > > > On Mon, Jun 17, 2019 at 07:24:44PM +0800, Wanpeng Li wrote:
> > > > > > > > > From: Wanpeng Li <wanpengli@tencent.com>
> > > > > > > > >
> > > > > > > > > Dedicated instances are currently disturbed by unnecessary jitter due
> > > > > > > > > to the emulated lapic timers fire on the same pCPUs which vCPUs resident.
> > > > > > > > > There is no hardware virtual timer on Intel for guest like ARM. Both
> > > > > > > > > programming timer in guest and the emulated timer fires incur vmexits.
> > > > > > > > > This patch tries to avoid vmexit which is incurred by the emulated
> > > > > > > > > timer fires in dedicated instance scenario.
> > > > > > > > >
> > > > > > > > > When nohz_full is enabled in dedicated instances scenario, the emulated
> > > > > > > > > timers can be offload to the nearest busy housekeeping cpus since APICv
> > > > > > > > > is really common in recent years. The guest timer interrupt is injected
> > > > > > > > > by posted-interrupt which is delivered by housekeeping cpu once the emulated
> > > > > > > > > timer fires.
> > > > > > > > >
> > > > > > > > > The host admin should fine tuned, e.g. dedicated instances scenario w/
> > > > > > > > > nohz_full cover the pCPUs which vCPUs resident, several pCPUs surplus
> > > > > > > > > for busy housekeeping, disable mwait/hlt/pause vmexits to keep in non-root
> > > > > > > > > mode, ~3% redis performance benefit can be observed on Skylake server.
> > > > > > > > >
> > > > > > > > > w/o patch:
> > > > > > > > >
> > > > > > > > >             VM-EXIT  Samples  Samples%  Time%   Min Time  Max Time   Avg time
> > > > > > > > >
> > > > > > > > > EXTERNAL_INTERRUPT    42916    49.43%   39.30%   0.47us   106.09us   0.71us ( +-   1.09% )
> > > > > > > > >
> > > > > > > > > w/ patch:
> > > > > > > > >
> > > > > > > > >             VM-EXIT  Samples  Samples%  Time%   Min Time  Max Time         Avg time
> > > > > > > > >
> > > > > > > > > EXTERNAL_INTERRUPT    6871     9.29%     2.96%   0.44us    57.88us   0.72us ( +-   4.02% )
> > > > > > > > >
> > > > > > > > > Cc: Paolo Bonzini <pbonzini@redhat.com>
> > > > > > > > > Cc: Radim Krčmář <rkrcmar@redhat.com>
> > > > > > > > > Cc: Marcelo Tosatti <mtosatti@redhat.com>
> > > > > > > > > Signed-off-by: Wanpeng Li <wanpengli@tencent.com>
> > > > > > > > > ---
> > > > > > > > >  arch/x86/kvm/lapic.c            | 33 ++++++++++++++++++++++++++-------
> > > > > > > > >  arch/x86/kvm/lapic.h            |  1 +
> > > > > > > > >  arch/x86/kvm/vmx/vmx.c          |  3 ++-
> > > > > > > > >  arch/x86/kvm/x86.c              |  5 +++++
> > > > > > > > >  arch/x86/kvm/x86.h              |  2 ++
> > > > > > > > >  include/linux/sched/isolation.h |  2 ++
> > > > > > > > >  kernel/sched/isolation.c        |  6 ++++++
> > > > > > > > >  7 files changed, 44 insertions(+), 8 deletions(-)
> > > > > > > > >
> > > > > > > > > diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
> > > > > > > > > index 87ecb56..9ceeee5 100644
> > > > > > > > > --- a/arch/x86/kvm/lapic.c
> > > > > > > > > +++ b/arch/x86/kvm/lapic.c
> > > > > > > > > @@ -122,6 +122,13 @@ static inline u32 kvm_x2apic_id(struct kvm_lapic *apic)
> > > > > > > > >       return apic->vcpu->vcpu_id;
> > > > > > > > >  }
> > > > > > > > >
> > > > > > > > > +bool posted_interrupt_inject_timer(struct kvm_vcpu *vcpu)
> > > > > > > > > +{
> > > > > > > > > +     return pi_inject_timer && kvm_vcpu_apicv_active(vcpu) &&
> > > > > > > > > +             kvm_hlt_in_guest(vcpu->kvm);
> > > > > > > > > +}
> > > > > > > > > +EXPORT_SYMBOL_GPL(posted_interrupt_inject_timer);
> > > > > > > >
> > > > > > > > Paolo, can you explain the reasoning behind this?
> > > > > > > >
> > > > > > > > Should not be necessary...
> > > > >
> > > > > https://lkml.org/lkml/2019/6/5/436  "Here you need to check
> > > > > kvm_halt_in_guest, not kvm_mwait_in_guest, because you need to go
> > > > > through kvm_apic_expired if the guest needs to be woken up from
> > > > > kvm_vcpu_block."
> > > >
> > > > Ah, i think he means that a sleeping vcpu (in kvm_vcpu_block) must
> > > > be woken up, if it receives a timer interrupt.
> > > >
> > > > But your patch will go through:
> > > >
> > > > kvm_apic_inject_pending_timer_irqs
> > > > __apic_accept_irq ->
> > > > vmx_deliver_posted_interrupt ->
> > > > kvm_vcpu_trigger_posted_interrupt returns false
> > > > (because vcpu->mode != IN_GUEST_MODE) ->
> > > > kvm_vcpu_kick
> > > >
> > > > Which will wakeup the vcpu.
> > >
> > > Hi Marcelo,
> > >
> > > >
> > > > Apart from this oops, which triggers when running:
> > > > taskset -c 1 ./cyclictest -D 3600 -p 99 -t 1 -h 30 -m -n  -i 50000 -b 40
> > >
> > > I try both host and guest use latest kvm/queue  w/ CONFIG_PREEMPT
> > > enabled, and expose mwait as your config, however, there is no oops.
> > > Can you reproduce steadily or encounter casually? Can you reproduce
> > > w/o the patchset?
> >
> > Hi Li,
> 
> Hi Marcelo,
> 
> >
> > Steadily.
> >
> > Do you have this as well:
> 
> With or without the diff below, testing on both the SKX and HSW servers
> I have on hand, I didn't see any oops. Does the oops disappear without
> the diff below? If so, the oops will not block merging the patchset,
> since Paolo prefers to add the kvm_hlt_in_guest() condition to
> guarantee that the vCPU is woken up from kvm_vcpu_block().

He agreed that it's not necessary. Removing the HLT-in-guest check widens
the scope of the patch greatly.

> For the
> exitless injection if the guest is running(DPDK style workloads that
> busy-spin on network card) scenarios, we can find a solution later.

What is the use-case for HLT in guest again?

I'll find the source of the oops (or confirm that I can't reproduce it
with kvm/queue RSN).
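
For reference, the delivery path quoted earlier in this message
(__apic_accept_irq -> vmx_deliver_posted_interrupt ->
kvm_vcpu_trigger_posted_interrupt -> kvm_vcpu_kick) can be modeled
roughly as below. This is only a userspace sketch, not the kernel code:
every toy_* name is hypothetical, and the struct fields are assumptions
that stand in for the real vcpu state. It is meant to show why a vcpu
sleeping in kvm_vcpu_block() still gets woken up when the posted
interrupt cannot be sent directly.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins for kernel state; illustrative names, not kernel definitions. */
enum toy_vcpu_mode { TOY_OUTSIDE_GUEST_MODE, TOY_IN_GUEST_MODE };

enum toy_delivery { TOY_POSTED_IPI, TOY_KICK_WAKEUP };

struct toy_vcpu {
	enum toy_vcpu_mode mode;
	bool blocked;		/* models a vcpu sleeping in kvm_vcpu_block() */
	bool timer_pending;	/* models the timer vector recorded for the guest */
};

/* Models kvm_vcpu_trigger_posted_interrupt(): succeeds only in guest mode. */
static bool toy_trigger_posted_interrupt(const struct toy_vcpu *vcpu)
{
	return vcpu->mode == TOY_IN_GUEST_MODE;
}

/* Models kvm_vcpu_kick(): a blocked vcpu is woken up. */
static void toy_vcpu_kick(struct toy_vcpu *vcpu)
{
	vcpu->blocked = false;
}

/* Models the tail of the chain: record the interrupt, then post it or kick. */
static enum toy_delivery toy_deliver_timer(struct toy_vcpu *vcpu)
{
	assert(vcpu != NULL);

	vcpu->timer_pending = true;

	if (toy_trigger_posted_interrupt(vcpu))
		return TOY_POSTED_IPI;	/* guest is running: no vmexit needed */

	toy_vcpu_kick(vcpu);		/* vcpu not in guest mode: wake it up */
	return TOY_KICK_WAKEUP;
}
```

In this model a vcpu that is blocked and outside guest mode always takes
the kick branch, which is the argument above for the wakeup not depending
on the kvm_hlt_in_guest() condition.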

