All of lore.kernel.org
 help / color / mirror / Atom feed
From: Christoffer Dall <christoffer.dall@linaro.org>
To: Marc Zyngier <marc.zyngier@arm.com>
Cc: Andre Przywara <andre.przywara@arm.com>,
	kvmarm@lists.cs.columbia.edu,
	linux-arm-kernel@lists.infradead.org, kvm@vger.kernel.org
Subject: Re: [PATCH v8 8/9] KVM: arm/arm64: Avoid work when userspace iqchips are not used
Date: Tue, 19 Dec 2017 15:18:57 +0100	[thread overview]
Message-ID: <20171219141857.GB5380@cbox> (raw)
In-Reply-To: <5e24dec5-613f-c10f-4577-df3787ad9d64@arm.com>

On Tue, Dec 19, 2017 at 01:55:25PM +0000, Marc Zyngier wrote:
> On 19/12/17 13:34, Christoffer Dall wrote:
> > On Wed, Dec 13, 2017 at 08:05:33PM +0000, Marc Zyngier wrote:
> >> On Wed, 13 Dec 2017 10:46:01 +0000,
> >> Christoffer Dall wrote:
> >>>
> >>> We currently check if the VM has a userspace irqchip on every exit from
> >>> the VCPU, and if so, we do some work to ensure correct timer behavior.
> >>> This is unfortunate, as we could avoid doing any work entirely, if we
> >>> didn't have to support irqchip in userspace.
> >>>
> >>> Realizing the userspace irqchip on ARM is mostly a developer or hobby
> >>> feature, and is unlikely to be used in servers or other scenarios where
> >>> performance is a priority, we can use a refcounted static key to only
> >>> check the irqchip configuration when we have at least one VM that uses
> >>> an irqchip in userspace.
> >>>
> >>> Reviewed-by: Eric Auger <eric.auger@redhat.com>
> >>> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
> >>
> >> On its own, this doesn't seem to be that useful. As far as I can see,
> >> it saves us a load from the kvm structure before giving up.
> > 
> > A load and a conditional.  But what I really wanted to also avoid was
> > the function call from the main run loop, which I neglected as well.  I
> > think I can achieve that with a static inline wrapper in the arch timer
> > header file which first evaluates the static key and then calls into the
> > arch timer code.
> > 
> > 
> >> I think it
> >> is more the cumulative effect of this load that could have an impact,
> >> but you're only dealing with it at a single location.
> >>
> >> How about making this a first class helper and redefine
> >> irqchip_in_kernel as such:
> >>
> >> static inline bool irqchip_in_kernel(struct kvm *kvm)
> >> {
> >> 	if (static_branch_unlikely(&userspace_irqchip_in_use) &&
> >> 	    unlikely(!irqchip_in_kernel(kvm)))
> >> 		return true;
> >>
> >> 	return false;
> >> }
> >>
> >> and move that static key to a more central location?
> >>
> > 
> > That's a neat idea.  The only problem is that creating a new VM would
> > then flip the static key, and then we'd have to flip it back when a vgic
> > is created on that VM, and I don't particularly like the idea of doing
> > this too often.
> 
> Fair enough.
> 
> > 
> > What I'd suggest then is to have two versions of the function:
> > irqchip_in_kernel() which is what it is today, and then
> > __irqchip_in_kernel() which can only be called from within the critical
> > path of the run loop, so that we can increment the static key on
> > kvm_vcpu_first_run_init() when we don't have a VGIC.
> > 
> > How does that sound?
> 
> OK, you only patch once per non-VGIC VM instead of twice per VGIC VM.
> But you now create a distinction between what can be used at runtime and
> what can be used at config time. The distinction is a bit annoying.
> 
> Also, does this actually show up on the radar?
> 

Honestly, I don't know for this particular version of the patch.

But when I did the VHE optimization work, which was before the userspace
irqchip support went in, getting rid of calling kvm_timer_sync_hwstate()
and the load+conditional in there (also prior to the level mapped
patches), was measurable, between 50 to 100 cycles.

Of course, that turned out to be buggy when rebooting VMs, so I never
actually included that in my measurements, but it left me wanting to get
rid of this.

It's a bit of a delicate balance.  On the one hand, it's silly to try to
over-optimize, but on the other hand it's exactly the cumulative effect
of optimizing every bit that managed to get us good results on VHE.

How about this:  I write up the patch in the complicated version as part
of the next version, and if you think it's too difficult to maintain, we
can just drop it an apply the series without it?

Thanks,
-Christoffer

WARNING: multiple messages have this Message-ID (diff)
From: christoffer.dall@linaro.org (Christoffer Dall)
To: linux-arm-kernel@lists.infradead.org
Subject: [PATCH v8 8/9] KVM: arm/arm64: Avoid work when userspace iqchips are not used
Date: Tue, 19 Dec 2017 15:18:57 +0100	[thread overview]
Message-ID: <20171219141857.GB5380@cbox> (raw)
In-Reply-To: <5e24dec5-613f-c10f-4577-df3787ad9d64@arm.com>

On Tue, Dec 19, 2017 at 01:55:25PM +0000, Marc Zyngier wrote:
> On 19/12/17 13:34, Christoffer Dall wrote:
> > On Wed, Dec 13, 2017 at 08:05:33PM +0000, Marc Zyngier wrote:
> >> On Wed, 13 Dec 2017 10:46:01 +0000,
> >> Christoffer Dall wrote:
> >>>
> >>> We currently check if the VM has a userspace irqchip on every exit from
> >>> the VCPU, and if so, we do some work to ensure correct timer behavior.
> >>> This is unfortunate, as we could avoid doing any work entirely, if we
> >>> didn't have to support irqchip in userspace.
> >>>
> >>> Realizing the userspace irqchip on ARM is mostly a developer or hobby
> >>> feature, and is unlikely to be used in servers or other scenarios where
> >>> performance is a priority, we can use a refcounted static key to only
> >>> check the irqchip configuration when we have at least one VM that uses
> >>> an irqchip in userspace.
> >>>
> >>> Reviewed-by: Eric Auger <eric.auger@redhat.com>
> >>> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
> >>
> >> On its own, this doesn't seem to be that useful. As far as I can see,
> >> it saves us a load from the kvm structure before giving up.
> > 
> > A load and a conditional.  But what I really wanted to also avoid was
> > the function call from the main run loop, which I neglected as well.  I
> > think I can achieve that with a static inline wrapper in the arch timer
> > header file which first evaluates the static key and then calls into the
> > arch timer code.
> > 
> > 
> >> I think it
> >> is more the cumulative effect of this load that could have an impact,
> >> but you're only dealing with it at a single location.
> >>
> >> How about making this a first class helper and redefine
> >> irqchip_in_kernel as such:
> >>
> >> static inline bool irqchip_in_kernel(struct kvm *kvm)
> >> {
> >> 	if (static_branch_unlikely(&userspace_irqchip_in_use) &&
> >> 	    unlikely(!irqchip_in_kernel(kvm)))
> >> 		return true;
> >>
> >> 	return false;
> >> }
> >>
> >> and move that static key to a more central location?
> >>
> > 
> > That's a neat idea.  The only problem is that creating a new VM would
> > then flip the static key, and then we'd have to flip it back when a vgic
> > is created on that VM, and I don't particularly like the idea of doing
> > this too often.
> 
> Fair enough.
> 
> > 
> > What I'd suggest then is to have two versions of the function:
> > irqchip_in_kernel() which is what it is today, and then
> > __irqchip_in_kernel() which can only be called from within the critical
> > path of the run loop, so that we can increment the static key on
> > kvm_vcpu_first_run_init() when we don't have a VGIC.
> > 
> > How does that sound?
> 
> OK, you only patch once per non-VGIC VM instead of twice per VGIC VM.
> But you now create a distinction between what can be used at runtime and
> what can be used at config time. The distinction is a bit annoying.
> 
> Also, does this actually show up on the radar?
> 

Honestly, I don't know for this particular version of the patch.

But when I did the VHE optimization work, which was before the userspace
irqchip support went in, getting rid of calling kvm_timer_sync_hwstate()
and the load+conditional in there (also prior to the level mapped
patches), was measurable, between 50 to 100 cycles.

Of course, that turned out to be buggy when rebooting VMs, so I never
actually included that in my measurements, but it left me wanting to get
rid of this.

It's a bit of a delicate balance.  On the one hand, it's silly to try to
over-optimize, but on the other hand it's exactly the cumulative effect
of optimizing every bit that managed to get us good results on VHE.

How about this:  I write up the patch in the complicated version as part
of the next version, and if you think it's too difficult to maintain, we
can just drop it an apply the series without it?

Thanks,
-Christoffer

  reply	other threads:[~2017-12-19 14:18 UTC|newest]

Thread overview: 44+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-12-13 10:45 [PATCH v8 0/9] Handle forwarded level-triggered interrupts Christoffer Dall
2017-12-13 10:45 ` Christoffer Dall
2017-12-13 10:45 ` [PATCH v8 1/9] KVM: arm/arm64: Remove redundant preemptible checks Christoffer Dall
2017-12-13 10:45   ` Christoffer Dall
2017-12-13 10:45 ` [PATCH v8 2/9] KVM: arm/arm64: Factor out functionality to get vgic mmio requester_vcpu Christoffer Dall
2017-12-13 10:45   ` Christoffer Dall
2017-12-13 10:45 ` [PATCH v8 3/9] KVM: arm/arm64: Don't cache the timer IRQ level Christoffer Dall
2017-12-13 10:45   ` Christoffer Dall
2017-12-13 19:38   ` Marc Zyngier
2017-12-13 19:38     ` Marc Zyngier
2017-12-19 14:17   ` Julien Thierry
2017-12-19 14:17     ` Julien Thierry
2017-12-19 20:35     ` Christoffer Dall
2017-12-19 20:35       ` Christoffer Dall
2017-12-13 10:45 ` [PATCH v8 4/9] KVM: arm/arm64: vgic: Support level-triggered mapped interrupts Christoffer Dall
2017-12-13 10:45   ` Christoffer Dall
2017-12-13 10:45 ` [PATCH v8 5/9] KVM: arm/arm64: Support a vgic interrupt line level sample function Christoffer Dall
2017-12-13 10:45   ` Christoffer Dall
2017-12-13 10:45 ` [PATCH v8 6/9] KVM: arm/arm64: Support VGIC dist pend/active changes for mapped IRQs Christoffer Dall
2017-12-13 10:45   ` Christoffer Dall
2017-12-13 10:46 ` [PATCH v8 7/9] KVM: arm/arm64: Provide a get_input_level for the arch timer Christoffer Dall
2017-12-13 10:46   ` Christoffer Dall
2017-12-13 19:45   ` Marc Zyngier
2017-12-13 19:45     ` Marc Zyngier
2017-12-13 10:46 ` [PATCH v8 8/9] KVM: arm/arm64: Avoid work when userspace iqchips are not used Christoffer Dall
2017-12-13 10:46   ` Christoffer Dall
2017-12-13 20:05   ` Marc Zyngier
2017-12-13 20:05     ` Marc Zyngier
2017-12-19 13:34     ` Christoffer Dall
2017-12-19 13:34       ` Christoffer Dall
2017-12-19 13:55       ` Marc Zyngier
2017-12-19 13:55         ` Marc Zyngier
2017-12-19 14:18         ` Christoffer Dall [this message]
2017-12-19 14:18           ` Christoffer Dall
2017-12-19 14:32           ` Marc Zyngier
2017-12-19 14:32             ` Marc Zyngier
2017-12-13 10:46 ` [PATCH v8 9/9] KVM: arm/arm64: Update timer and forwarded irq documentation Christoffer Dall
2017-12-13 10:46   ` Christoffer Dall
2017-12-13 20:15   ` Marc Zyngier
2017-12-13 20:15     ` Marc Zyngier
2017-12-19 20:29     ` Christoffer Dall
2017-12-19 20:29       ` Christoffer Dall
2017-12-19 20:35       ` Marc Zyngier
2017-12-19 20:35         ` Marc Zyngier

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20171219141857.GB5380@cbox \
    --to=christoffer.dall@linaro.org \
    --cc=andre.przywara@arm.com \
    --cc=kvm@vger.kernel.org \
    --cc=kvmarm@lists.cs.columbia.edu \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=marc.zyngier@arm.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.