* [PATCH v8 0/9] Handle forwarded level-triggered interrupts
@ 2017-12-13 10:45 ` Christoffer Dall
  0 siblings, 0 replies; 44+ messages in thread
From: Christoffer Dall @ 2017-12-13 10:45 UTC (permalink / raw)
  To: kvmarm, linux-arm-kernel; +Cc: Marc Zyngier, Andre Przywara, kvm

This series is an alternative approach to Eric Auger's direct EOI setup
patches [1] in terms of the KVM VGIC support.

The idea is to maintain existing semantics for the VGIC for mapped
level-triggered IRQs and also support the timer using mapped IRQs with
the same VGIC support as VFIO interrupts.

Based on v4.15-rc1.

Also available at:
git://git.kernel.org/pub/scm/linux/kernel/git/cdall/linux.git level-mapped-v8

Changes since v7:
 - Cleaned up stale commentary
 - Updated documentation (patch 9/9 is new in this version)
 - Added Eric's reviewed-by tags

Changes since v6:
 - Removed double semi-colon
 - Changed another confusing conditional in patch 6
 - Fixed typos in commit message and comments

Changes since v5:
 - Rebased on v4.15-rc1
 - Changed comment on preemption code as suggested by Andre
 - Fixed white space and confusing conditionals as suggested by Drew

Changes since v4:
 - Rebased on the timer optimization series merged in the v4.15 merge
   window, which caused a fair amount of modifications to patch 3.
 - Added a static key to disable the sync operations when no VMs are
   using userspace irqchips to further optimize the performance
 - Fixed extra semicolon in vgic-mmio.c
 - Added commentary as requested during review
 - Dropped what was patch 4, because it was merged as part of GICv4
   support.
 - Factored out the VGIC input level function change as separate patch
   (helps bisect and debugging), before providing a function for the
   timer.

Changes since v3:
 - Added a number of patches and moved patches around a bit.
 - Check for uaccesses in the mmio handler functions
 - Fixed bugs in the mmio handler functions

Changes since v2:
 - Removed patch 5 from v2 and integrated the changes in what's now
   patch 5 to make it easier to reuse code when adding VFIO integration.
 - Changed the virtual distributor MMIO handling to use the
   pending_latch and more closely match the semantics of SPENDR and
   CPENDR for both level and edge mapped interrupts.

Changes since v1:
 - Added necessary changes to the timer (Patch 1)
 - Added handling of guest MMIO accesses to the virtual distributor
   (Patch 4)
 - Addressed Marc's comments from the initial RFC (mostly renames)

Thanks,
-Christoffer

[1]: https://lists.cs.columbia.edu/pipermail/kvmarm/2017-June/026072.html

Christoffer Dall (9):
  KVM: arm/arm64: Remove redundant preemptible checks
  KVM: arm/arm64: Factor out functionality to get vgic mmio
    requester_vcpu
  KVM: arm/arm64: Don't cache the timer IRQ level
  KVM: arm/arm64: vgic: Support level-triggered mapped interrupts
  KVM: arm/arm64: Support a vgic interrupt line level sample function
  KVM: arm/arm64: Support VGIC dist pend/active changes for mapped IRQs
  KVM: arm/arm64: Provide a get_input_level for the arch timer
  KVM: arm/arm64: Avoid work when userspace irqchips are not used
  KVM: arm/arm64: Update timer and forwarded irq documentation

 Documentation/virtual/kvm/arm/vgic-mapped-irqs.txt |  50 +++++----
 include/kvm/arm_arch_timer.h                       |   2 +
 include/kvm/arm_vgic.h                             |  13 ++-
 virt/kvm/arm/arch_timer.c                          | 112 ++++++++++----------
 virt/kvm/arm/arm.c                                 |   2 -
 virt/kvm/arm/vgic/vgic-mmio.c                      | 115 +++++++++++++++++----
 virt/kvm/arm/vgic/vgic-v2.c                        |  29 ++++++
 virt/kvm/arm/vgic/vgic-v3.c                        |  29 ++++++
 virt/kvm/arm/vgic/vgic.c                           |  41 +++++++-
 virt/kvm/arm/vgic/vgic.h                           |   8 ++
 10 files changed, 292 insertions(+), 109 deletions(-)

-- 
2.14.2

* [PATCH v8 1/9] KVM: arm/arm64: Remove redundant preemptible checks
  2017-12-13 10:45 ` Christoffer Dall
@ 2017-12-13 10:45   ` Christoffer Dall
  -1 siblings, 0 replies; 44+ messages in thread
From: Christoffer Dall @ 2017-12-13 10:45 UTC (permalink / raw)
  To: kvmarm, linux-arm-kernel; +Cc: Marc Zyngier, Andre Przywara, kvm

The __this_cpu_read() and __this_cpu_write() functions already implement
checks for the required preemption levels when using
CONFIG_DEBUG_PREEMPT, which gives you nice error messages and such.
Therefore there is no need to explicitly check this using a BUG_ON() in
the code (which we don't do for other uses of per-CPU variables either).
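
For illustration only, the reasoning can be sketched in user space as
below.  All model_* names are hypothetical stand-ins, not the kernel's
definitions; the point is merely that when the accessor carries the
debug check itself, the caller needs no explicit check of its own.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for kernel state; not the kernel's definitions. */
static int model_preempt_count;    /* > 0 means preemption is disabled */
static int model_violations;       /* accesses made while preemptible */
static void *model_running_vcpu;   /* stands in for the per-CPU slot */

bool model_preemptible(void)
{
	return model_preempt_count == 0;
}

/* Models an accessor that already carries the debug check itself. */
void *model_this_cpu_read(void)
{
	if (model_preemptible())
		model_violations++;  /* a debug kernel would report this */
	return model_running_vcpu;
}

/* No explicit check in the caller: the accessor above covers it. */
void *model_get_running_vcpu(void)
{
	return model_this_cpu_read();
}
```

In this model a call made with model_preempt_count at zero is still
recorded as a violation, even though the caller never checks.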

Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Reviewed-by: Andre Przywara <andre.przywara@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
---
 virt/kvm/arm/arm.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/virt/kvm/arm/arm.c b/virt/kvm/arm/arm.c
index a6524ff27de4..859ff7e3a1eb 100644
--- a/virt/kvm/arm/arm.c
+++ b/virt/kvm/arm/arm.c
@@ -71,7 +71,6 @@ static DEFINE_PER_CPU(unsigned char, kvm_arm_hardware_enabled);
 
 static void kvm_arm_set_running_vcpu(struct kvm_vcpu *vcpu)
 {
-	BUG_ON(preemptible());
 	__this_cpu_write(kvm_arm_running_vcpu, vcpu);
 }
 
@@ -81,7 +80,6 @@ static void kvm_arm_set_running_vcpu(struct kvm_vcpu *vcpu)
  */
 struct kvm_vcpu *kvm_arm_get_running_vcpu(void)
 {
-	BUG_ON(preemptible());
 	return __this_cpu_read(kvm_arm_running_vcpu);
 }
 
-- 
2.14.2

* [PATCH v8 2/9] KVM: arm/arm64: Factor out functionality to get vgic mmio requester_vcpu
  2017-12-13 10:45 ` Christoffer Dall
@ 2017-12-13 10:45   ` Christoffer Dall
  -1 siblings, 0 replies; 44+ messages in thread
From: Christoffer Dall @ 2017-12-13 10:45 UTC (permalink / raw)
  To: kvmarm, linux-arm-kernel; +Cc: Marc Zyngier, Andre Przywara, kvm

We are about to distinguish between userspace accesses and mmio traps
for a number of the mmio handlers.  When the requester vcpu is NULL, it
means we are handling a userspace access.

Factor out the functionality to get the requester vcpu into its own
function, mostly so we have a common place to document the semantics of
the return value.

Also take the chance to move the functionality outside of holding a
spinlock and instead explicitly disable and enable preemption.  This
supports PREEMPT_RT kernels as well.
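
As a rough user-space sketch of the semantics described above (the
model_* names are hypothetical and only mimic the shape of the kernel
helpers): the per-CPU slot is read inside a short preemption-disabled
window, and a NULL result is taken to mean a userspace access.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model types and state; not the kernel's definitions. */
struct model_vcpu {
	int id;
};

static struct model_vcpu *model_running_vcpu;  /* models the per-CPU slot */
static int model_preempt_depth;                /* models the preempt count */

void model_preempt_disable(void)
{
	model_preempt_depth++;
}

void model_preempt_enable(void)
{
	model_preempt_depth--;
}

/* Read the slot inside a short preemption-disabled window. */
struct model_vcpu *model_get_requester_vcpu(void)
{
	struct model_vcpu *vcpu;

	model_preempt_disable();
	vcpu = model_running_vcpu;
	model_preempt_enable();
	return vcpu;
}

/* NULL means no VCPU trapped here, i.e. the access came from userspace. */
bool model_is_uaccess(void)
{
	return model_get_requester_vcpu() == NULL;
}
```

The pointer is used after the window closes, which mirrors the argument
in the code comment of this patch that the value stays the same even if
the thread migrates.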

Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Reviewed-by: Andre Przywara <andre.przywara@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
---
 virt/kvm/arm/vgic/vgic-mmio.c | 44 +++++++++++++++++++++++++++----------------
 1 file changed, 28 insertions(+), 16 deletions(-)

diff --git a/virt/kvm/arm/vgic/vgic-mmio.c b/virt/kvm/arm/vgic/vgic-mmio.c
index deb51ee16a3d..fdad95f62fa3 100644
--- a/virt/kvm/arm/vgic/vgic-mmio.c
+++ b/virt/kvm/arm/vgic/vgic-mmio.c
@@ -122,6 +122,27 @@ unsigned long vgic_mmio_read_pending(struct kvm_vcpu *vcpu,
 	return value;
 }
 
+/*
+ * This function will return the VCPU that performed the MMIO access and
+ * trapped from within the VM, and will return NULL if this is a userspace
+ * access.
+ *
+ * We can disable preemption locally around accessing the per-CPU variable,
+ * and use the resolved vcpu pointer after enabling preemption again, because
+ * even if the current thread is migrated to another CPU, reading the per-CPU
+ * value later will give us the same value as we update the per-CPU variable
+ * in the preempt notifier handlers.
+ */
+static struct kvm_vcpu *vgic_get_mmio_requester_vcpu(void)
+{
+	struct kvm_vcpu *vcpu;
+
+	preempt_disable();
+	vcpu = kvm_arm_get_running_vcpu();
+	preempt_enable();
+	return vcpu;
+}
+
 void vgic_mmio_write_spending(struct kvm_vcpu *vcpu,
 			      gpa_t addr, unsigned int len,
 			      unsigned long val)
@@ -184,24 +205,10 @@ unsigned long vgic_mmio_read_active(struct kvm_vcpu *vcpu,
 static void vgic_mmio_change_active(struct kvm_vcpu *vcpu, struct vgic_irq *irq,
 				    bool new_active_state)
 {
-	struct kvm_vcpu *requester_vcpu;
 	unsigned long flags;
-	spin_lock_irqsave(&irq->irq_lock, flags);
+	struct kvm_vcpu *requester_vcpu = vgic_get_mmio_requester_vcpu();
 
-	/*
-	 * The vcpu parameter here can mean multiple things depending on how
-	 * this function is called; when handling a trap from the kernel it
-	 * depends on the GIC version, and these functions are also called as
-	 * part of save/restore from userspace.
-	 *
-	 * Therefore, we have to figure out the requester in a reliable way.
-	 *
-	 * When accessing VGIC state from user space, the requester_vcpu is
-	 * NULL, which is fine, because we guarantee that no VCPUs are running
-	 * when accessing VGIC state from user space so irq->vcpu->cpu is
-	 * always -1.
-	 */
-	requester_vcpu = kvm_arm_get_running_vcpu();
+	spin_lock_irqsave(&irq->irq_lock, flags);
 
 	/*
 	 * If this virtual IRQ was written into a list register, we
@@ -213,6 +220,11 @@ static void vgic_mmio_change_active(struct kvm_vcpu *vcpu, struct vgic_irq *irq,
 	 * vgic_change_active_prepare)  and still has to sync back this IRQ,
 	 * so we release and re-acquire the spin_lock to let the other thread
 	 * sync back the IRQ.
+	 *
+	 * When accessing VGIC state from user space, requester_vcpu is
+	 * NULL, which is fine, because we guarantee that no VCPUs are running
+	 * when accessing VGIC state from user space so irq->vcpu->cpu is
+	 * always -1.
 	 */
 	while (irq->vcpu && /* IRQ may have state in an LR somewhere */
 	       irq->vcpu != requester_vcpu && /* Current thread is not the VCPU thread */
-- 
2.14.2

* [PATCH v8 3/9] KVM: arm/arm64: Don't cache the timer IRQ level
  2017-12-13 10:45 ` Christoffer Dall
@ 2017-12-13 10:45   ` Christoffer Dall
  -1 siblings, 0 replies; 44+ messages in thread
From: Christoffer Dall @ 2017-12-13 10:45 UTC (permalink / raw)
  To: kvmarm, linux-arm-kernel; +Cc: Marc Zyngier, Andre Przywara, kvm

The timer was built around a strict idea of modelling an interrupt line
level in software, meaning that only transitions in the level needed to
be reported to the VGIC.  This works well for the timer, because the
arch timer code is in complete control of the device and can track the
transitions of the line.

However, as we are about to support using the HW bit in the VGIC not
just for the timer, but also for VFIO which cannot track transitions of
the interrupt line, we have to decide on an interface for level
triggered mapped interrupts to the GIC, which both the timer and VFIO
can use.

VFIO only sees an asserting transition of the physical interrupt line,
and tells the VGIC when that happens.  That means that part of the
interrupt flow is offloaded to the hardware.

To use the same interface for VFIO devices and the timer, we therefore
have to change the timer (we cannot change VFIO because it doesn't know
the details of the device it is assigning to a VM).

Luckily, changing the timer is simple: we just need to stop 'caching'
the line level and instead let the VGIC know the state of the timer
every time there is a potential change in the line level, and when the
line level should be asserted from the timer ISR.  The VGIC can ignore
extra notifications using its validate mechanism.
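
A minimal user-space sketch of that idea follows, assuming a validation
step that only accepts a level-triggered input when it changes the
recorded line level.  The model_* names are hypothetical and this is not
the VGIC code itself.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of a level-triggered virtual interrupt. */
struct model_vgic_irq {
	bool line_level;  /* level last accepted by the model */
	int queued;       /* how many times the interrupt was queued */
};

/* A level input only matters if it changes the recorded level. */
bool model_validate_injection(const struct model_vgic_irq *irq, bool level)
{
	return irq->line_level != level;
}

void model_inject(struct model_vgic_irq *irq, bool level)
{
	if (!model_validate_injection(irq, level))
		return;  /* redundant notification, ignored */

	irq->line_level = level;
	if (level)
		irq->queued++;
}

/* Timer side: always report the computed level, keep no cached copy. */
void model_timer_update(struct model_vgic_irq *irq, bool should_fire)
{
	model_inject(irq, should_fire);
}
```

Reporting the same asserted level twice queues the interrupt only once
in this model, so the timer side can notify unconditionally.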

Reviewed-by: Andre Przywara <andre.przywara@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
---
 virt/kvm/arm/arch_timer.c | 20 +++++++++++++-------
 1 file changed, 13 insertions(+), 7 deletions(-)

diff --git a/virt/kvm/arm/arch_timer.c b/virt/kvm/arm/arch_timer.c
index 4151250ce8da..dd5aca05c500 100644
--- a/virt/kvm/arm/arch_timer.c
+++ b/virt/kvm/arm/arch_timer.c
@@ -99,11 +99,9 @@ static irqreturn_t kvm_arch_timer_handler(int irq, void *dev_id)
 	}
 	vtimer = vcpu_vtimer(vcpu);
 
-	if (!vtimer->irq.level) {
-		vtimer->cnt_ctl = read_sysreg_el0(cntv_ctl);
-		if (kvm_timer_irq_can_fire(vtimer))
-			kvm_timer_update_irq(vcpu, true, vtimer);
-	}
+	vtimer->cnt_ctl = read_sysreg_el0(cntv_ctl);
+	if (kvm_timer_irq_can_fire(vtimer))
+		kvm_timer_update_irq(vcpu, true, vtimer);
 
 	if (unlikely(!irqchip_in_kernel(vcpu->kvm)))
 		kvm_vtimer_update_mask_user(vcpu);
@@ -324,12 +322,20 @@ static void kvm_timer_update_state(struct kvm_vcpu *vcpu)
 	struct arch_timer_cpu *timer = &vcpu->arch.timer_cpu;
 	struct arch_timer_context *vtimer = vcpu_vtimer(vcpu);
 	struct arch_timer_context *ptimer = vcpu_ptimer(vcpu);
+	bool level;
 
 	if (unlikely(!timer->enabled))
 		return;
 
-	if (kvm_timer_should_fire(vtimer) != vtimer->irq.level)
-		kvm_timer_update_irq(vcpu, !vtimer->irq.level, vtimer);
+	/*
+	 * The vtimer virtual interrupt is a 'mapped' interrupt, meaning part
+	 * of its lifecycle is offloaded to the hardware, and we therefore may
+	 * not have lowered the irq.level value before having to signal a new
+	 * interrupt, but have to signal an interrupt every time the level is
+	 * asserted.
+	 */
+	level = kvm_timer_should_fire(vtimer);
+	kvm_timer_update_irq(vcpu, level, vtimer);
 
 	if (kvm_timer_should_fire(ptimer) != ptimer->irq.level)
 		kvm_timer_update_irq(vcpu, !ptimer->irq.level, ptimer);
-- 
2.14.2

* [PATCH v8 4/9] KVM: arm/arm64: vgic: Support level-triggered mapped interrupts
  2017-12-13 10:45 ` Christoffer Dall
@ 2017-12-13 10:45   ` Christoffer Dall
  -1 siblings, 0 replies; 44+ messages in thread
From: Christoffer Dall @ 2017-12-13 10:45 UTC (permalink / raw)
  To: kvmarm, linux-arm-kernel; +Cc: Marc Zyngier, Andre Przywara, kvm

Level-triggered mapped IRQs are special because we only observe rising
edges as input to the VGIC, and we don't set the EOI flag and therefore
are not told when the level goes down, which is what would let us
re-queue a new interrupt when the level goes up again.

One way to solve this problem is to side-step the logic of the VGIC and
special case the validation in the injection path, but it has the
unfortunate drawback of having to peek into the physical GIC state
whenever we want to know if the interrupt is pending on the virtual
distributor.

Instead, we can maintain the current semantics of a level triggered
interrupt by sort of treating it as an edge-triggered interrupt,
following from the fact that we only observe an asserting edge.  This
requires us to be a bit careful when populating the LRs and when folding
the state back in though:

 * We lower the line level when populating the LR, so that when
   subsequently observing an asserting edge, the VGIC will do the right
   thing.

 * If the guest never acked the interrupt while running (for example if
   it had masked interrupts at the CPU level while running), we have
   to preserve the pending state of the LR and move it back to the
   line_level field of the struct irq when folding LR state.

   If the guest never acked the interrupt while running, but changed the
   device state and lowered the line (again with interrupts masked) then
   we need to observe this change in the line_level.

   Both of the above situations are solved by sampling the physical line
   and setting the line level when folding the LR back.

 * Finally, if the guest never acked the interrupt while running and
   sampling the line reveals that the device state has changed and the
   line has been lowered, we must clear the physical active state, since
   we will otherwise never be told when the interrupt becomes asserted
   again.
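
The three cases above can be sketched as a small user-space model.  The
struct and function names are hypothetical, and the LR is reduced to a
single pending flag, so this is only an approximation of the flow, not
the patch's implementation.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of one mapped, level-triggered interrupt. */
struct model_irq {
	bool line_level;    /* software view of the input line */
	bool phys_pending;  /* what sampling the physical line returns */
	bool phys_active;   /* physical active state */
};

/* Entering the guest: the pending state moves into the LR. */
bool model_populate_lr(struct model_irq *irq)
{
	bool lr_pending = irq->line_level;

	/* Lower the level so a later rising edge is treated as new. */
	if (lr_pending)
		irq->line_level = false;
	return lr_pending;
}

/* Leaving the guest: fold the LR state back into the software model. */
void model_fold_lr(struct model_irq *irq, bool lr_still_pending)
{
	if (!lr_still_pending)
		return;  /* the guest acked it; nothing to resample here */

	/* Never acked: sample the physical line to recover the level. */
	irq->line_level = irq->phys_pending;

	/* Line lowered meanwhile: clear active so a new edge is reported. */
	if (!irq->line_level)
		irq->phys_active = false;
}
```

If the guest never acks and the device keeps the line high, the level is
restored on fold; if the device lowered the line, both the level and the
physical active state end up cleared.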

This has the added benefit of making the timer optimization patches
(https://lists.cs.columbia.edu/pipermail/kvmarm/2017-July/026343.html) a
bit simpler, because the timer code doesn't have to clear the active
state on the sync anymore.  It also potentially improves the performance
of the timer implementation because the GIC knows the state of the LR
and only needs to clear the active state when the pending bit in the LR
is still set, whereas the timer has to always clear it when returning
from running the guest with an injected timer interrupt.

Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
---
 virt/kvm/arm/vgic/vgic-v2.c | 29 +++++++++++++++++++++++++++++
 virt/kvm/arm/vgic/vgic-v3.c | 29 +++++++++++++++++++++++++++++
 virt/kvm/arm/vgic/vgic.c    | 23 +++++++++++++++++++++++
 virt/kvm/arm/vgic/vgic.h    |  7 +++++++
 4 files changed, 88 insertions(+)

diff --git a/virt/kvm/arm/vgic/vgic-v2.c b/virt/kvm/arm/vgic/vgic-v2.c
index 80897102da26..c32d7b93ffd1 100644
--- a/virt/kvm/arm/vgic/vgic-v2.c
+++ b/virt/kvm/arm/vgic/vgic-v2.c
@@ -105,6 +105,26 @@ void vgic_v2_fold_lr_state(struct kvm_vcpu *vcpu)
 				irq->pending_latch = false;
 		}
 
+		/*
+		 * Level-triggered mapped IRQs are special because we only
+		 * observe rising edges as input to the VGIC.
+		 *
+		 * If the guest never acked the interrupt we have to sample
+		 * the physical line and set the line level, because the
+		 * device state could have changed or we simply need to
+		 * process the still pending interrupt later.
+		 *
+		 * If this causes us to lower the level, we have to also clear
+		 * the physical active state, since we will otherwise never be
+		 * told when the interrupt becomes asserted again.
+		 */
+		if (vgic_irq_is_mapped_level(irq) && (val & GICH_LR_PENDING_BIT)) {
+			irq->line_level = vgic_get_phys_line_level(irq);
+
+			if (!irq->line_level)
+				vgic_irq_set_phys_active(irq, false);
+		}
+
 		spin_unlock_irqrestore(&irq->irq_lock, flags);
 		vgic_put_irq(vcpu->kvm, irq);
 	}
@@ -162,6 +182,15 @@ void vgic_v2_populate_lr(struct kvm_vcpu *vcpu, struct vgic_irq *irq, int lr)
 			val |= GICH_LR_EOI;
 	}
 
+	/*
+	 * Level-triggered mapped IRQs are special because we only observe
+	 * rising edges as input to the VGIC.  We therefore lower the line
+	 * level here, so that we can take new virtual IRQs.  See
+	 * vgic_v2_fold_lr_state for more info.
+	 */
+	if (vgic_irq_is_mapped_level(irq) && (val & GICH_LR_PENDING_BIT))
+		irq->line_level = false;
+
 	/* The GICv2 LR only holds five bits of priority. */
 	val |= (irq->priority >> 3) << GICH_LR_PRIORITY_SHIFT;
 
diff --git a/virt/kvm/arm/vgic/vgic-v3.c b/virt/kvm/arm/vgic/vgic-v3.c
index 2f05f732d3fd..a14423a0d383 100644
--- a/virt/kvm/arm/vgic/vgic-v3.c
+++ b/virt/kvm/arm/vgic/vgic-v3.c
@@ -96,6 +96,26 @@ void vgic_v3_fold_lr_state(struct kvm_vcpu *vcpu)
 				irq->pending_latch = false;
 		}
 
+		/*
+		 * Level-triggered mapped IRQs are special because we only
+		 * observe rising edges as input to the VGIC.
+		 *
+		 * If the guest never acked the interrupt we have to sample
+		 * the physical line and set the line level, because the
+		 * device state could have changed or we simply need to
+		 * process the still pending interrupt later.
+		 *
+		 * If this causes us to lower the level, we have to also clear
+		 * the physical active state, since we will otherwise never be
+		 * told when the interrupt becomes asserted again.
+		 */
+		if (vgic_irq_is_mapped_level(irq) && (val & ICH_LR_PENDING_BIT)) {
+			irq->line_level = vgic_get_phys_line_level(irq);
+
+			if (!irq->line_level)
+				vgic_irq_set_phys_active(irq, false);
+		}
+
 		spin_unlock_irqrestore(&irq->irq_lock, flags);
 		vgic_put_irq(vcpu->kvm, irq);
 	}
@@ -145,6 +165,15 @@ void vgic_v3_populate_lr(struct kvm_vcpu *vcpu, struct vgic_irq *irq, int lr)
 			val |= ICH_LR_EOI;
 	}
 
+	/*
+	 * Level-triggered mapped IRQs are special because we only observe
+	 * rising edges as input to the VGIC.  We therefore lower the line
+	 * level here, so that we can take new virtual IRQs.  See
+	 * vgic_v3_fold_lr_state for more info.
+	 */
+	if (vgic_irq_is_mapped_level(irq) && (val & ICH_LR_PENDING_BIT))
+		irq->line_level = false;
+
 	/*
 	 * We currently only support Group1 interrupts, which is a
 	 * known defect. This needs to be addressed at some point.
diff --git a/virt/kvm/arm/vgic/vgic.c b/virt/kvm/arm/vgic/vgic.c
index b168a328a9e0..607cbbc27a1c 100644
--- a/virt/kvm/arm/vgic/vgic.c
+++ b/virt/kvm/arm/vgic/vgic.c
@@ -144,6 +144,29 @@ void vgic_put_irq(struct kvm *kvm, struct vgic_irq *irq)
 	kfree(irq);
 }
 
+/* Get the input level of a mapped IRQ directly from the physical GIC */
+bool vgic_get_phys_line_level(struct vgic_irq *irq)
+{
+	bool line_level;
+
+	BUG_ON(!irq->hw);
+
+	WARN_ON(irq_get_irqchip_state(irq->host_irq,
+				      IRQCHIP_STATE_PENDING,
+				      &line_level));
+	return line_level;
+}
+
+/* Set/Clear the physical active state */
+void vgic_irq_set_phys_active(struct vgic_irq *irq, bool active)
+{
+
+	BUG_ON(!irq->hw);
+	WARN_ON(irq_set_irqchip_state(irq->host_irq,
+				      IRQCHIP_STATE_ACTIVE,
+				      active));
+}
+
 /**
  * kvm_vgic_target_oracle - compute the target vcpu for an irq
  *
diff --git a/virt/kvm/arm/vgic/vgic.h b/virt/kvm/arm/vgic/vgic.h
index efbcf8f96f9c..d0787983a357 100644
--- a/virt/kvm/arm/vgic/vgic.h
+++ b/virt/kvm/arm/vgic/vgic.h
@@ -104,6 +104,11 @@ static inline bool irq_is_pending(struct vgic_irq *irq)
 		return irq->pending_latch || irq->line_level;
 }
 
+static inline bool vgic_irq_is_mapped_level(struct vgic_irq *irq)
+{
+	return irq->config == VGIC_CONFIG_LEVEL && irq->hw;
+}
+
 /*
  * This struct provides an intermediate representation of the fields contained
  * in the GICH_VMCR and ICH_VMCR registers, such that code exporting the GIC
@@ -140,6 +145,8 @@ vgic_get_mmio_region(struct kvm_vcpu *vcpu, struct vgic_io_device *iodev,
 struct vgic_irq *vgic_get_irq(struct kvm *kvm, struct kvm_vcpu *vcpu,
 			      u32 intid);
 void vgic_put_irq(struct kvm *kvm, struct vgic_irq *irq);
+bool vgic_get_phys_line_level(struct vgic_irq *irq);
+void vgic_irq_set_phys_active(struct vgic_irq *irq, bool active);
 bool vgic_queue_irq_unlock(struct kvm *kvm, struct vgic_irq *irq,
 			   unsigned long flags);
 void vgic_kick_vcpus(struct kvm *kvm);
-- 
2.14.2

^ permalink raw reply related	[flat|nested] 44+ messages in thread

* [PATCH v8 4/9] KVM: arm/arm64: vgic: Support level-triggered mapped interrupts
@ 2017-12-13 10:45   ` Christoffer Dall
  0 siblings, 0 replies; 44+ messages in thread
From: Christoffer Dall @ 2017-12-13 10:45 UTC (permalink / raw)
  To: linux-arm-kernel

Level-triggered mapped IRQs are special because we only observe rising
edges as input to the VGIC.  Since we don't set the EOI flag, we are not
told when the level goes down, which we would need to know in order to
re-queue a new interrupt when the level goes up again.

One way to solve this problem is to side-step the logic of the VGIC and
special case the validation in the injection path, but it has the
unfortunate drawback of having to peek into the physical GIC state
whenever we want to know if the interrupt is pending on the virtual
distributor.

Instead, we can maintain the current semantics of a level triggered
interrupt by sort of treating it as an edge-triggered interrupt,
following from the fact that we only observe an asserting edge.  This
requires us to be a bit careful when populating the LRs and when folding
the state back in though:

 * We lower the line level when populating the LR, so that when
   subsequently observing an asserting edge, the VGIC will do the right
   thing.

 * If the guest never acked the interrupt while running (for example if
   it had masked interrupts at the CPU level while running), we have
   to preserve the pending state of the LR and move it back to the
   line_level field of the struct irq when folding LR state.

   If the guest never acked the interrupt while running, but changed the
   device state and lowered the line (again with interrupts masked) then
   we need to observe this change in the line_level.

   Both of the above situations are solved by sampling the physical line
   and setting the line level when folding the LR back.

 * Finally, if the guest never acked the interrupt while running and
   sampling the line reveals that the device state has changed and the
   line has been lowered, we must clear the physical active state, since
   we will otherwise never be told when the interrupt becomes asserted
   again.

This has the added benefit of making the timer optimization patches
(https://lists.cs.columbia.edu/pipermail/kvmarm/2017-July/026343.html) a
bit simpler, because the timer code doesn't have to clear the active
state on the sync anymore.  It also potentially improves the performance
of the timer implementation because the GIC knows the state of the LR
and only needs to clear the active state when the pending bit in the LR
is still set, where the timer has to always clear it when returning from
running the guest with an injected timer interrupt.
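
As a rough illustration of the populate/fold rules listed above, here is
a small userspace model.  It is only a sketch: struct lvl_irq, the
phys_line and phys_active fields and the lvl_* helpers are hypothetical
stand-ins for the VGIC state and the irqchip accessors, and locking is
left out entirely.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified stand-in; not the kernel's struct vgic_irq. */
struct lvl_irq {
	bool hw;		/* mapped to a physical interrupt */
	bool level;		/* configured as level-triggered */
	bool line_level;	/* the VGIC's view of the input line */
	bool phys_line;		/* what sampling the physical line returns */
	bool phys_active;	/* active state on the physical side */
};

static bool lvl_is_mapped_level(const struct lvl_irq *irq)
{
	return irq->level && irq->hw;
}

static bool lvl_get_phys_line_level(const struct lvl_irq *irq)
{
	assert(irq->hw);	/* plays the role of BUG_ON(!irq->hw) */
	return irq->phys_line;
}

/* Entering the guest: the pending state moves into the (modelled) LR. */
static bool lvl_populate_lr(struct lvl_irq *irq)
{
	bool lr_pending = irq->line_level;

	/* Lower the line so that a later rising edge is seen as new. */
	if (lvl_is_mapped_level(irq) && lr_pending)
		irq->line_level = false;

	return lr_pending;
}

/* Leaving the guest: fold the LR's pending bit back into the model. */
static void lvl_fold_lr_state(struct lvl_irq *irq, bool lr_pending)
{
	if (lvl_is_mapped_level(irq) && lr_pending) {
		/* The guest never acked it: resample the physical line. */
		irq->line_level = lvl_get_phys_line_level(irq);

		/* The line dropped meanwhile: clear the physical active state. */
		if (!irq->line_level)
			irq->phys_active = false;
	}
}
```

In this model, an interrupt that is still pending in the LR on exit and
whose device line is still high ends up with line_level set again, while
one whose line was lowered ends up with both line_level and phys_active
cleared.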

Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
---
 virt/kvm/arm/vgic/vgic-v2.c | 29 +++++++++++++++++++++++++++++
 virt/kvm/arm/vgic/vgic-v3.c | 29 +++++++++++++++++++++++++++++
 virt/kvm/arm/vgic/vgic.c    | 23 +++++++++++++++++++++++
 virt/kvm/arm/vgic/vgic.h    |  7 +++++++
 4 files changed, 88 insertions(+)

diff --git a/virt/kvm/arm/vgic/vgic-v2.c b/virt/kvm/arm/vgic/vgic-v2.c
index 80897102da26..c32d7b93ffd1 100644
--- a/virt/kvm/arm/vgic/vgic-v2.c
+++ b/virt/kvm/arm/vgic/vgic-v2.c
@@ -105,6 +105,26 @@ void vgic_v2_fold_lr_state(struct kvm_vcpu *vcpu)
 				irq->pending_latch = false;
 		}
 
+		/*
+		 * Level-triggered mapped IRQs are special because we only
+		 * observe rising edges as input to the VGIC.
+		 *
+		 * If the guest never acked the interrupt we have to sample
+		 * the physical line and set the line level, because the
+		 * device state could have changed or we simply need to
+		 * process the still pending interrupt later.
+		 *
+		 * If this causes us to lower the level, we have to also clear
+		 * the physical active state, since we will otherwise never be
+		 * told when the interrupt becomes asserted again.
+		 */
+		if (vgic_irq_is_mapped_level(irq) && (val & GICH_LR_PENDING_BIT)) {
+			irq->line_level = vgic_get_phys_line_level(irq);
+
+			if (!irq->line_level)
+				vgic_irq_set_phys_active(irq, false);
+		}
+
 		spin_unlock_irqrestore(&irq->irq_lock, flags);
 		vgic_put_irq(vcpu->kvm, irq);
 	}
@@ -162,6 +182,15 @@ void vgic_v2_populate_lr(struct kvm_vcpu *vcpu, struct vgic_irq *irq, int lr)
 			val |= GICH_LR_EOI;
 	}
 
+	/*
+	 * Level-triggered mapped IRQs are special because we only observe
+	 * rising edges as input to the VGIC.  We therefore lower the line
+	 * level here, so that we can take new virtual IRQs.  See
+	 * vgic_v2_fold_lr_state for more info.
+	 */
+	if (vgic_irq_is_mapped_level(irq) && (val & GICH_LR_PENDING_BIT))
+		irq->line_level = false;
+
 	/* The GICv2 LR only holds five bits of priority. */
 	val |= (irq->priority >> 3) << GICH_LR_PRIORITY_SHIFT;
 
diff --git a/virt/kvm/arm/vgic/vgic-v3.c b/virt/kvm/arm/vgic/vgic-v3.c
index 2f05f732d3fd..a14423a0d383 100644
--- a/virt/kvm/arm/vgic/vgic-v3.c
+++ b/virt/kvm/arm/vgic/vgic-v3.c
@@ -96,6 +96,26 @@ void vgic_v3_fold_lr_state(struct kvm_vcpu *vcpu)
 				irq->pending_latch = false;
 		}
 
+		/*
+		 * Level-triggered mapped IRQs are special because we only
+		 * observe rising edges as input to the VGIC.
+		 *
+		 * If the guest never acked the interrupt we have to sample
+		 * the physical line and set the line level, because the
+		 * device state could have changed or we simply need to
+		 * process the still pending interrupt later.
+		 *
+		 * If this causes us to lower the level, we have to also clear
+		 * the physical active state, since we will otherwise never be
+		 * told when the interrupt becomes asserted again.
+		 */
+		if (vgic_irq_is_mapped_level(irq) && (val & ICH_LR_PENDING_BIT)) {
+			irq->line_level = vgic_get_phys_line_level(irq);
+
+			if (!irq->line_level)
+				vgic_irq_set_phys_active(irq, false);
+		}
+
 		spin_unlock_irqrestore(&irq->irq_lock, flags);
 		vgic_put_irq(vcpu->kvm, irq);
 	}
@@ -145,6 +165,15 @@ void vgic_v3_populate_lr(struct kvm_vcpu *vcpu, struct vgic_irq *irq, int lr)
 			val |= ICH_LR_EOI;
 	}
 
+	/*
+	 * Level-triggered mapped IRQs are special because we only observe
+	 * rising edges as input to the VGIC.  We therefore lower the line
+	 * level here, so that we can take new virtual IRQs.  See
+	 * vgic_v3_fold_lr_state for more info.
+	 */
+	if (vgic_irq_is_mapped_level(irq) && (val & ICH_LR_PENDING_BIT))
+		irq->line_level = false;
+
 	/*
 	 * We currently only support Group1 interrupts, which is a
 	 * known defect. This needs to be addressed at some point.
diff --git a/virt/kvm/arm/vgic/vgic.c b/virt/kvm/arm/vgic/vgic.c
index b168a328a9e0..607cbbc27a1c 100644
--- a/virt/kvm/arm/vgic/vgic.c
+++ b/virt/kvm/arm/vgic/vgic.c
@@ -144,6 +144,29 @@ void vgic_put_irq(struct kvm *kvm, struct vgic_irq *irq)
 	kfree(irq);
 }
 
+/* Get the input level of a mapped IRQ directly from the physical GIC */
+bool vgic_get_phys_line_level(struct vgic_irq *irq)
+{
+	bool line_level;
+
+	BUG_ON(!irq->hw);
+
+	WARN_ON(irq_get_irqchip_state(irq->host_irq,
+				      IRQCHIP_STATE_PENDING,
+				      &line_level));
+	return line_level;
+}
+
+/* Set/Clear the physical active state */
+void vgic_irq_set_phys_active(struct vgic_irq *irq, bool active)
+{
+
+	BUG_ON(!irq->hw);
+	WARN_ON(irq_set_irqchip_state(irq->host_irq,
+				      IRQCHIP_STATE_ACTIVE,
+				      active));
+}
+
 /**
  * kvm_vgic_target_oracle - compute the target vcpu for an irq
  *
diff --git a/virt/kvm/arm/vgic/vgic.h b/virt/kvm/arm/vgic/vgic.h
index efbcf8f96f9c..d0787983a357 100644
--- a/virt/kvm/arm/vgic/vgic.h
+++ b/virt/kvm/arm/vgic/vgic.h
@@ -104,6 +104,11 @@ static inline bool irq_is_pending(struct vgic_irq *irq)
 		return irq->pending_latch || irq->line_level;
 }
 
+static inline bool vgic_irq_is_mapped_level(struct vgic_irq *irq)
+{
+	return irq->config == VGIC_CONFIG_LEVEL && irq->hw;
+}
+
 /*
  * This struct provides an intermediate representation of the fields contained
  * in the GICH_VMCR and ICH_VMCR registers, such that code exporting the GIC
@@ -140,6 +145,8 @@ vgic_get_mmio_region(struct kvm_vcpu *vcpu, struct vgic_io_device *iodev,
 struct vgic_irq *vgic_get_irq(struct kvm *kvm, struct kvm_vcpu *vcpu,
 			      u32 intid);
 void vgic_put_irq(struct kvm *kvm, struct vgic_irq *irq);
+bool vgic_get_phys_line_level(struct vgic_irq *irq);
+void vgic_irq_set_phys_active(struct vgic_irq *irq, bool active);
 bool vgic_queue_irq_unlock(struct kvm *kvm, struct vgic_irq *irq,
 			   unsigned long flags);
 void vgic_kick_vcpus(struct kvm *kvm);
-- 
2.14.2

^ permalink raw reply related	[flat|nested] 44+ messages in thread

* [PATCH v8 5/9] KVM: arm/arm64: Support a vgic interrupt line level sample function
  2017-12-13 10:45 ` Christoffer Dall
@ 2017-12-13 10:45   ` Christoffer Dall
  -1 siblings, 0 replies; 44+ messages in thread
From: Christoffer Dall @ 2017-12-13 10:45 UTC (permalink / raw)
  To: kvmarm, linux-arm-kernel; +Cc: Marc Zyngier, Andre Przywara, kvm

The GIC sometimes needs to sample the physical line of a mapped
interrupt.  As we know this to be notoriously slow, provide a callback
function for devices (such as the timer) which can do this much faster
than talking to the distributor, for example by comparing a few
in-memory values.  Fall back to the good old method of poking the
physical GIC if no callback is provided.
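
The fallback behaviour described above can be sketched in userspace as
follows.  This is a simplified model under stated assumptions: struct
cb_irq, cb_read_phys_gic() and the timer-like callback with its cb_now
and cb_cval values are made up for illustration and do not correspond to
kernel interfaces.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the relevant bits of struct vgic_irq. */
struct cb_irq {
	bool hw;
	int intid;
	bool gic_pending;	/* models the physical GIC pending state */
	bool (*get_input_level)(int vintid);
};

static int cb_slow_reads;	/* how often the slow path was taken */

/* Models the slow path of asking the physical GIC. */
static bool cb_read_phys_gic(const struct cb_irq *irq)
{
	cb_slow_reads++;
	return irq->gic_pending;
}

static bool cb_get_phys_line_level(const struct cb_irq *irq)
{
	assert(irq->hw);	/* plays the role of BUG_ON(!irq->hw) */

	/* Prefer the device callback when one was registered. */
	if (irq->get_input_level)
		return irq->get_input_level(irq->intid);

	return cb_read_phys_gic(irq);
}

/* Hypothetical device callback that only compares in-memory values. */
static unsigned long long cb_now = 100, cb_cval = 90;

static bool cb_timer_level(int vintid)
{
	(void)vintid;
	return cb_now >= cb_cval;
}
```

Registering cb_timer_level as the callback means the modelled GIC read
is skipped; with a NULL callback every sample goes through
cb_read_phys_gic().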

Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
---
 include/kvm/arm_vgic.h    | 13 ++++++++++++-
 virt/kvm/arm/arch_timer.c |  3 ++-
 virt/kvm/arm/vgic/vgic.c  | 13 +++++++++----
 3 files changed, 23 insertions(+), 6 deletions(-)

diff --git a/include/kvm/arm_vgic.h b/include/kvm/arm_vgic.h
index 8c896540a72c..cdbd142ca7f2 100644
--- a/include/kvm/arm_vgic.h
+++ b/include/kvm/arm_vgic.h
@@ -130,6 +130,17 @@ struct vgic_irq {
 	u8 priority;
 	enum vgic_irq_config config;	/* Level or edge */
 
+	/*
+	 * Callback function pointer to in-kernel devices that can tell us the
+	 * state of the input level of mapped level-triggered IRQ faster than
+	 * peeking into the physical GIC.
+	 *
+	 * Always called in non-preemptible section and the functions can use
+	 * kvm_arm_get_running_vcpu() to get the vcpu pointer for private
+	 * IRQs.
+	 */
+	bool (*get_input_level)(int vintid);
+
 	void *owner;			/* Opaque pointer to reserve an interrupt
 					   for in-kernel devices. */
 };
@@ -331,7 +342,7 @@ void kvm_vgic_init_cpu_hardware(void);
 int kvm_vgic_inject_irq(struct kvm *kvm, int cpuid, unsigned int intid,
 			bool level, void *owner);
 int kvm_vgic_map_phys_irq(struct kvm_vcpu *vcpu, unsigned int host_irq,
-			  u32 vintid);
+			  u32 vintid, bool (*get_input_level)(int vintid));
 int kvm_vgic_unmap_phys_irq(struct kvm_vcpu *vcpu, unsigned int vintid);
 bool kvm_vgic_map_is_active(struct kvm_vcpu *vcpu, unsigned int vintid);
 
diff --git a/virt/kvm/arm/arch_timer.c b/virt/kvm/arm/arch_timer.c
index dd5aca05c500..e78ba5e20f74 100644
--- a/virt/kvm/arm/arch_timer.c
+++ b/virt/kvm/arm/arch_timer.c
@@ -840,7 +840,8 @@ int kvm_timer_enable(struct kvm_vcpu *vcpu)
 		return -EINVAL;
 	}
 
-	ret = kvm_vgic_map_phys_irq(vcpu, host_vtimer_irq, vtimer->irq.irq);
+	ret = kvm_vgic_map_phys_irq(vcpu, host_vtimer_irq, vtimer->irq.irq,
+				    NULL);
 	if (ret)
 		return ret;
 
diff --git a/virt/kvm/arm/vgic/vgic.c b/virt/kvm/arm/vgic/vgic.c
index 607cbbc27a1c..80c5c609385a 100644
--- a/virt/kvm/arm/vgic/vgic.c
+++ b/virt/kvm/arm/vgic/vgic.c
@@ -144,13 +144,15 @@ void vgic_put_irq(struct kvm *kvm, struct vgic_irq *irq)
 	kfree(irq);
 }
 
-/* Get the input level of a mapped IRQ directly from the physical GIC */
 bool vgic_get_phys_line_level(struct vgic_irq *irq)
 {
 	bool line_level;
 
 	BUG_ON(!irq->hw);
 
+	if (irq->get_input_level)
+		return irq->get_input_level(irq->intid);
+
 	WARN_ON(irq_get_irqchip_state(irq->host_irq,
 				      IRQCHIP_STATE_PENDING,
 				      &line_level));
@@ -436,7 +438,8 @@ int kvm_vgic_inject_irq(struct kvm *kvm, int cpuid, unsigned int intid,
 
 /* @irq->irq_lock must be held */
 static int kvm_vgic_map_irq(struct kvm_vcpu *vcpu, struct vgic_irq *irq,
-			    unsigned int host_irq)
+			    unsigned int host_irq,
+			    bool (*get_input_level)(int vintid))
 {
 	struct irq_desc *desc;
 	struct irq_data *data;
@@ -456,6 +459,7 @@ static int kvm_vgic_map_irq(struct kvm_vcpu *vcpu, struct vgic_irq *irq,
 	irq->hw = true;
 	irq->host_irq = host_irq;
 	irq->hwintid = data->hwirq;
+	irq->get_input_level = get_input_level;
 	return 0;
 }
 
@@ -464,10 +468,11 @@ static inline void kvm_vgic_unmap_irq(struct vgic_irq *irq)
 {
 	irq->hw = false;
 	irq->hwintid = 0;
+	irq->get_input_level = NULL;
 }
 
 int kvm_vgic_map_phys_irq(struct kvm_vcpu *vcpu, unsigned int host_irq,
-			  u32 vintid)
+			  u32 vintid, bool (*get_input_level)(int vintid))
 {
 	struct vgic_irq *irq = vgic_get_irq(vcpu->kvm, vcpu, vintid);
 	unsigned long flags;
@@ -476,7 +481,7 @@ int kvm_vgic_map_phys_irq(struct kvm_vcpu *vcpu, unsigned int host_irq,
 	BUG_ON(!irq);
 
 	spin_lock_irqsave(&irq->irq_lock, flags);
-	ret = kvm_vgic_map_irq(vcpu, irq, host_irq);
+	ret = kvm_vgic_map_irq(vcpu, irq, host_irq, get_input_level);
 	spin_unlock_irqrestore(&irq->irq_lock, flags);
 	vgic_put_irq(vcpu->kvm, irq);
 
-- 
2.14.2

^ permalink raw reply related	[flat|nested] 44+ messages in thread

* [PATCH v8 6/9] KVM: arm/arm64: Support VGIC dist pend/active changes for mapped IRQs
  2017-12-13 10:45 ` Christoffer Dall
@ 2017-12-13 10:45   ` Christoffer Dall
  -1 siblings, 0 replies; 44+ messages in thread
From: Christoffer Dall @ 2017-12-13 10:45 UTC (permalink / raw)
  To: kvmarm, linux-arm-kernel; +Cc: Marc Zyngier, Andre Przywara, kvm

For mapped IRQs (with the HW bit set in the LR) we have to follow some
rules of the architecture.  One of these rules is that the VM must not
be allowed to deactivate a virtual interrupt with the HW bit set unless
the physical interrupt is also active.

This works fine when injecting mapped interrupts, because we leave it up
to the injector to either set EOImode==1 or manually set the active
state of the physical interrupt.

However, the guest can set a virtual interrupt to be pending or active
by writing to the virtual distributor, which could lead to deactivating
a virtual interrupt with the HW bit set without the physical interrupt
being active.

We could set the physical interrupt to active whenever we are about to
enter the VM with a HW interrupt either pending or active, but that
would be really slow, especially on GICv2.  So we take the long way
around and do the hard work when needed, which is expected to be
extremely rare.

When the VM sets the pending state for a HW interrupt on the virtual
distributor we set the active state on the physical distributor, because
the virtual interrupt can become active and then the guest can
deactivate it.

When the VM clears the pending state we also clear it on the physical
side, because the injector might otherwise raise the interrupt.  We also
clear the physical active state when the virtual interrupt is not
active, since otherwise a SPEND/CPEND sequence from the guest would
prevent signaling of future interrupts.

Changing the state of mapped interrupts from userspace is not supported,
and it's expected that userspace unmaps devices from VFIO before
attempting to set the interrupt state, because the interrupt state is
driven by hardware.
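
To make the rules above concrete, here is a small userspace model of the
guest-visible pending/active writes for mapped interrupts.  It is a
sketch only: struct dist_irq, its phys_pending and phys_active fields
and the dist_* helpers are hypothetical, and queueing and locking are
not modelled.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the state touched by the MMIO handlers. */
struct dist_irq {
	bool hw;		/* mapped to a physical interrupt */
	bool pending_latch;
	bool active;
	bool phys_pending;	/* pending state on the physical side */
	bool phys_active;	/* active state on the physical side */
};

/* Models a write to the set-pending register for one interrupt. */
static void dist_write_spending(struct dist_irq *irq, bool is_uaccess)
{
	if (irq->hw) {
		if (is_uaccess)
			return;	/* userspace writes to mapped IRQs are ignored */
		irq->pending_latch = true;
		/* The guest may later deactivate it, so mark it active. */
		irq->phys_active = true;
		return;
	}
	irq->pending_latch = true;
}

/* Models a write to the clear-pending register for one interrupt. */
static void dist_write_cpending(struct dist_irq *irq, bool is_uaccess)
{
	if (irq->hw) {
		if (is_uaccess)
			return;
		irq->pending_latch = false;
		irq->phys_pending = false;
		/* Avoid a SPEND/CPEND pair masking the physical interrupt. */
		if (!irq->active)
			irq->phys_active = false;
		return;
	}
	irq->pending_latch = false;
}

/* Models a guest write changing the active state of one interrupt. */
static void dist_change_active(struct dist_irq *irq, bool active,
			       bool is_uaccess)
{
	if (irq->hw) {
		if (is_uaccess)
			return;
		irq->active = active;
		irq->phys_active = active;
		return;
	}
	irq->active = active;
}
```

A SPEND followed by a CPEND on an inactive mapped interrupt leaves
phys_active cleared in this model, so the physical interrupt can fire
again, while the same sequence on an active interrupt leaves phys_active
set.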

Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
---
 virt/kvm/arm/vgic/vgic-mmio.c | 71 +++++++++++++++++++++++++++++++++++++++----
 virt/kvm/arm/vgic/vgic.c      |  7 +++++
 virt/kvm/arm/vgic/vgic.h      |  1 +
 3 files changed, 73 insertions(+), 6 deletions(-)

diff --git a/virt/kvm/arm/vgic/vgic-mmio.c b/virt/kvm/arm/vgic/vgic-mmio.c
index fdad95f62fa3..83d82bd7dc4e 100644
--- a/virt/kvm/arm/vgic/vgic-mmio.c
+++ b/virt/kvm/arm/vgic/vgic-mmio.c
@@ -16,6 +16,7 @@
 #include <linux/kvm.h>
 #include <linux/kvm_host.h>
 #include <kvm/iodev.h>
+#include <kvm/arm_arch_timer.h>
 #include <kvm/arm_vgic.h>
 
 #include "vgic.h"
@@ -143,10 +144,22 @@ static struct kvm_vcpu *vgic_get_mmio_requester_vcpu(void)
 	return vcpu;
 }
 
+/* Must be called with irq->irq_lock held */
+static void vgic_hw_irq_spending(struct kvm_vcpu *vcpu, struct vgic_irq *irq,
+				 bool is_uaccess)
+{
+	if (is_uaccess)
+		return;
+
+	irq->pending_latch = true;
+	vgic_irq_set_phys_active(irq, true);
+}
+
 void vgic_mmio_write_spending(struct kvm_vcpu *vcpu,
 			      gpa_t addr, unsigned int len,
 			      unsigned long val)
 {
+	bool is_uaccess = !vgic_get_mmio_requester_vcpu();
 	u32 intid = VGIC_ADDR_TO_INTID(addr, 1);
 	int i;
 	unsigned long flags;
@@ -155,17 +168,45 @@ void vgic_mmio_write_spending(struct kvm_vcpu *vcpu,
 		struct vgic_irq *irq = vgic_get_irq(vcpu->kvm, vcpu, intid + i);
 
 		spin_lock_irqsave(&irq->irq_lock, flags);
-		irq->pending_latch = true;
-
+		if (irq->hw)
+			vgic_hw_irq_spending(vcpu, irq, is_uaccess);
+		else
+			irq->pending_latch = true;
 		vgic_queue_irq_unlock(vcpu->kvm, irq, flags);
 		vgic_put_irq(vcpu->kvm, irq);
 	}
 }
 
+/* Must be called with irq->irq_lock held */
+static void vgic_hw_irq_cpending(struct kvm_vcpu *vcpu, struct vgic_irq *irq,
+				 bool is_uaccess)
+{
+	if (is_uaccess)
+		return;
+
+	irq->pending_latch = false;
+
+	/*
+	 * We don't want the guest to effectively mask the physical
+	 * interrupt by doing a write to SPENDR followed by a write to
+	 * CPENDR for HW interrupts, so we clear the active state on
+	 * the physical side if the virtual interrupt is not active.
+	 * This may lead to taking an additional interrupt on the
+	 * host, but that should not be a problem as the worst that
+	 * can happen is an additional vgic injection.  We also clear
+	 * the pending state to maintain proper semantics for edge HW
+	 * interrupts.
+	 */
+	vgic_irq_set_phys_pending(irq, false);
+	if (!irq->active)
+		vgic_irq_set_phys_active(irq, false);
+}
+
 void vgic_mmio_write_cpending(struct kvm_vcpu *vcpu,
 			      gpa_t addr, unsigned int len,
 			      unsigned long val)
 {
+	bool is_uaccess = !vgic_get_mmio_requester_vcpu();
 	u32 intid = VGIC_ADDR_TO_INTID(addr, 1);
 	int i;
 	unsigned long flags;
@@ -175,7 +216,10 @@ void vgic_mmio_write_cpending(struct kvm_vcpu *vcpu,
 
 		spin_lock_irqsave(&irq->irq_lock, flags);
 
-		irq->pending_latch = false;
+		if (irq->hw)
+			vgic_hw_irq_cpending(vcpu, irq, is_uaccess);
+		else
+			irq->pending_latch = false;
 
 		spin_unlock_irqrestore(&irq->irq_lock, flags);
 		vgic_put_irq(vcpu->kvm, irq);
@@ -202,8 +246,19 @@ unsigned long vgic_mmio_read_active(struct kvm_vcpu *vcpu,
 	return value;
 }
 
+/* Must be called with irq->irq_lock held */
+static void vgic_hw_irq_change_active(struct kvm_vcpu *vcpu, struct vgic_irq *irq,
+				      bool active, bool is_uaccess)
+{
+	if (is_uaccess)
+		return;
+
+	irq->active = active;
+	vgic_irq_set_phys_active(irq, active);
+}
+
 static void vgic_mmio_change_active(struct kvm_vcpu *vcpu, struct vgic_irq *irq,
-				    bool new_active_state)
+				    bool active)
 {
 	unsigned long flags;
 	struct kvm_vcpu *requester_vcpu = vgic_get_mmio_requester_vcpu();
@@ -231,8 +286,12 @@ static void vgic_mmio_change_active(struct kvm_vcpu *vcpu, struct vgic_irq *irq,
 	       irq->vcpu->cpu != -1) /* VCPU thread is running */
 		cond_resched_lock(&irq->irq_lock);
 
-	irq->active = new_active_state;
-	if (new_active_state)
+	if (irq->hw)
+		vgic_hw_irq_change_active(vcpu, irq, active, !requester_vcpu);
+	else
+		irq->active = active;
+
+	if (irq->active)
 		vgic_queue_irq_unlock(vcpu->kvm, irq, flags);
 	else
 		spin_unlock_irqrestore(&irq->irq_lock, flags);
diff --git a/virt/kvm/arm/vgic/vgic.c b/virt/kvm/arm/vgic/vgic.c
index 80c5c609385a..870cdacd9e81 100644
--- a/virt/kvm/arm/vgic/vgic.c
+++ b/virt/kvm/arm/vgic/vgic.c
@@ -144,6 +144,13 @@ void vgic_put_irq(struct kvm *kvm, struct vgic_irq *irq)
 	kfree(irq);
 }
 
+void vgic_irq_set_phys_pending(struct vgic_irq *irq, bool pending)
+{
+	WARN_ON(irq_set_irqchip_state(irq->host_irq,
+				      IRQCHIP_STATE_PENDING,
+				      pending));
+}
+
 bool vgic_get_phys_line_level(struct vgic_irq *irq)
 {
 	bool line_level;
diff --git a/virt/kvm/arm/vgic/vgic.h b/virt/kvm/arm/vgic/vgic.h
index d0787983a357..12c37b89f7a3 100644
--- a/virt/kvm/arm/vgic/vgic.h
+++ b/virt/kvm/arm/vgic/vgic.h
@@ -146,6 +146,7 @@ struct vgic_irq *vgic_get_irq(struct kvm *kvm, struct kvm_vcpu *vcpu,
 			      u32 intid);
 void vgic_put_irq(struct kvm *kvm, struct vgic_irq *irq);
 bool vgic_get_phys_line_level(struct vgic_irq *irq);
+void vgic_irq_set_phys_pending(struct vgic_irq *irq, bool pending);
 void vgic_irq_set_phys_active(struct vgic_irq *irq, bool active);
 bool vgic_queue_irq_unlock(struct kvm *kvm, struct vgic_irq *irq,
 			   unsigned long flags);
-- 
2.14.2

^ permalink raw reply related	[flat|nested] 44+ messages in thread

* [PATCH v8 6/9] KVM: arm/arm64: Support VGIC dist pend/active changes for mapped IRQs
@ 2017-12-13 10:45   ` Christoffer Dall
  0 siblings, 0 replies; 44+ messages in thread
From: Christoffer Dall @ 2017-12-13 10:45 UTC (permalink / raw)
  To: linux-arm-kernel

For mapped IRQs (with the HW bit set in the LR) we have to follow some
rules of the architecture.  One of these rules is that the VM must not
be allowed to deactivate a virtual interrupt with the HW bit set unless
the physical interrupt is also active.

This works fine when injecting mapped interrupts, because we leave it up
to the injector to either set EOImode==1 or manually set the active
state of the physical interrupt.

However, the guest can set a virtual interrupt to be pending or active
by writing to the virtual distributor, which could lead to deactivating
a virtual interrupt with the HW bit set without the physical interrupt
being active.

We could set the physical interrupt to active whenever we are about to
enter the VM with a HW interrupt either pending or active, but that
would be really slow, especially on GICv2.  So we take the long way
around and do the hard work when needed, which is expected to be
extremely rare.

When the VM sets the pending state for a HW interrupt on the virtual
distributor we set the active state on the physical distributor, because
the virtual interrupt can become active and then the guest can
deactivate it.

When the VM clears the pending state we also clear it on the physical
side, because the injector might otherwise raise the interrupt.  We also
clear the physical active state when the virtual interrupt is not
active, since otherwise a SPEND/CPEND sequence from the guest would
prevent signaling of future interrupts.

Changing the state of mapped interrupts from userspace is not supported,
and it's expected that userspace unmaps devices from VFIO before
attempting to set the interrupt state, because the interrupt state is
driven by hardware.

Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
---
 virt/kvm/arm/vgic/vgic-mmio.c | 71 +++++++++++++++++++++++++++++++++++++++----
 virt/kvm/arm/vgic/vgic.c      |  7 +++++
 virt/kvm/arm/vgic/vgic.h      |  1 +
 3 files changed, 73 insertions(+), 6 deletions(-)

diff --git a/virt/kvm/arm/vgic/vgic-mmio.c b/virt/kvm/arm/vgic/vgic-mmio.c
index fdad95f62fa3..83d82bd7dc4e 100644
--- a/virt/kvm/arm/vgic/vgic-mmio.c
+++ b/virt/kvm/arm/vgic/vgic-mmio.c
@@ -16,6 +16,7 @@
 #include <linux/kvm.h>
 #include <linux/kvm_host.h>
 #include <kvm/iodev.h>
+#include <kvm/arm_arch_timer.h>
 #include <kvm/arm_vgic.h>
 
 #include "vgic.h"
@@ -143,10 +144,22 @@ static struct kvm_vcpu *vgic_get_mmio_requester_vcpu(void)
 	return vcpu;
 }
 
+/* Must be called with irq->irq_lock held */
+static void vgic_hw_irq_spending(struct kvm_vcpu *vcpu, struct vgic_irq *irq,
+				 bool is_uaccess)
+{
+	if (is_uaccess)
+		return;
+
+	irq->pending_latch = true;
+	vgic_irq_set_phys_active(irq, true);
+}
+
 void vgic_mmio_write_spending(struct kvm_vcpu *vcpu,
 			      gpa_t addr, unsigned int len,
 			      unsigned long val)
 {
+	bool is_uaccess = !vgic_get_mmio_requester_vcpu();
 	u32 intid = VGIC_ADDR_TO_INTID(addr, 1);
 	int i;
 	unsigned long flags;
@@ -155,17 +168,45 @@ void vgic_mmio_write_spending(struct kvm_vcpu *vcpu,
 		struct vgic_irq *irq = vgic_get_irq(vcpu->kvm, vcpu, intid + i);
 
 		spin_lock_irqsave(&irq->irq_lock, flags);
-		irq->pending_latch = true;
-
+		if (irq->hw)
+			vgic_hw_irq_spending(vcpu, irq, is_uaccess);
+		else
+			irq->pending_latch = true;
 		vgic_queue_irq_unlock(vcpu->kvm, irq, flags);
 		vgic_put_irq(vcpu->kvm, irq);
 	}
 }
 
+/* Must be called with irq->irq_lock held */
+static void vgic_hw_irq_cpending(struct kvm_vcpu *vcpu, struct vgic_irq *irq,
+				 bool is_uaccess)
+{
+	if (is_uaccess)
+		return;
+
+	irq->pending_latch = false;
+
+	/*
+	 * We don't want the guest to effectively mask the physical
+	 * interrupt by doing a write to SPENDR followed by a write to
+	 * CPENDR for HW interrupts, so we clear the active state on
+	 * the physical side if the virtual interrupt is not active.
+	 * This may lead to taking an additional interrupt on the
+	 * host, but that should not be a problem as the worst that
+	 * can happen is an additional vgic injection.  We also clear
+	 * the pending state to maintain proper semantics for edge HW
+	 * interrupts.
+	 */
+	vgic_irq_set_phys_pending(irq, false);
+	if (!irq->active)
+		vgic_irq_set_phys_active(irq, false);
+}
+
 void vgic_mmio_write_cpending(struct kvm_vcpu *vcpu,
 			      gpa_t addr, unsigned int len,
 			      unsigned long val)
 {
+	bool is_uaccess = !vgic_get_mmio_requester_vcpu();
 	u32 intid = VGIC_ADDR_TO_INTID(addr, 1);
 	int i;
 	unsigned long flags;
@@ -175,7 +216,10 @@ void vgic_mmio_write_cpending(struct kvm_vcpu *vcpu,
 
 		spin_lock_irqsave(&irq->irq_lock, flags);
 
-		irq->pending_latch = false;
+		if (irq->hw)
+			vgic_hw_irq_cpending(vcpu, irq, is_uaccess);
+		else
+			irq->pending_latch = false;
 
 		spin_unlock_irqrestore(&irq->irq_lock, flags);
 		vgic_put_irq(vcpu->kvm, irq);
@@ -202,8 +246,19 @@ unsigned long vgic_mmio_read_active(struct kvm_vcpu *vcpu,
 	return value;
 }
 
+/* Must be called with irq->irq_lock held */
+static void vgic_hw_irq_change_active(struct kvm_vcpu *vcpu, struct vgic_irq *irq,
+				      bool active, bool is_uaccess)
+{
+	if (is_uaccess)
+		return;
+
+	irq->active = active;
+	vgic_irq_set_phys_active(irq, active);
+}
+
 static void vgic_mmio_change_active(struct kvm_vcpu *vcpu, struct vgic_irq *irq,
-				    bool new_active_state)
+				    bool active)
 {
 	unsigned long flags;
 	struct kvm_vcpu *requester_vcpu = vgic_get_mmio_requester_vcpu();
@@ -231,8 +286,12 @@ static void vgic_mmio_change_active(struct kvm_vcpu *vcpu, struct vgic_irq *irq,
 	       irq->vcpu->cpu != -1) /* VCPU thread is running */
 		cond_resched_lock(&irq->irq_lock);
 
-	irq->active = new_active_state;
-	if (new_active_state)
+	if (irq->hw)
+		vgic_hw_irq_change_active(vcpu, irq, active, !requester_vcpu);
+	else
+		irq->active = active;
+
+	if (irq->active)
 		vgic_queue_irq_unlock(vcpu->kvm, irq, flags);
 	else
 		spin_unlock_irqrestore(&irq->irq_lock, flags);
diff --git a/virt/kvm/arm/vgic/vgic.c b/virt/kvm/arm/vgic/vgic.c
index 80c5c609385a..870cdacd9e81 100644
--- a/virt/kvm/arm/vgic/vgic.c
+++ b/virt/kvm/arm/vgic/vgic.c
@@ -144,6 +144,13 @@ void vgic_put_irq(struct kvm *kvm, struct vgic_irq *irq)
 	kfree(irq);
 }
 
+void vgic_irq_set_phys_pending(struct vgic_irq *irq, bool pending)
+{
+	WARN_ON(irq_set_irqchip_state(irq->host_irq,
+				      IRQCHIP_STATE_PENDING,
+				      pending));
+}
+
 bool vgic_get_phys_line_level(struct vgic_irq *irq)
 {
 	bool line_level;
diff --git a/virt/kvm/arm/vgic/vgic.h b/virt/kvm/arm/vgic/vgic.h
index d0787983a357..12c37b89f7a3 100644
--- a/virt/kvm/arm/vgic/vgic.h
+++ b/virt/kvm/arm/vgic/vgic.h
@@ -146,6 +146,7 @@ struct vgic_irq *vgic_get_irq(struct kvm *kvm, struct kvm_vcpu *vcpu,
 			      u32 intid);
 void vgic_put_irq(struct kvm *kvm, struct vgic_irq *irq);
 bool vgic_get_phys_line_level(struct vgic_irq *irq);
+void vgic_irq_set_phys_pending(struct vgic_irq *irq, bool pending);
 void vgic_irq_set_phys_active(struct vgic_irq *irq, bool active);
 bool vgic_queue_irq_unlock(struct kvm *kvm, struct vgic_irq *irq,
 			   unsigned long flags);
-- 
2.14.2

^ permalink raw reply related	[flat|nested] 44+ messages in thread

* [PATCH v8 7/9] KVM: arm/arm64: Provide a get_input_level for the arch timer
  2017-12-13 10:45 ` Christoffer Dall
@ 2017-12-13 10:46   ` Christoffer Dall
  -1 siblings, 0 replies; 44+ messages in thread
From: Christoffer Dall @ 2017-12-13 10:46 UTC (permalink / raw)
  To: kvmarm, linux-arm-kernel; +Cc: Marc Zyngier, Andre Przywara, kvm

The VGIC can now support the life-cycle of mapped level-triggered
interrupts, and we no longer have to read back the timer state on every
exit from the VM if we had an asserted timer interrupt signal, because
the VGIC already knows if we hit the unlikely case where the guest
disables the timer without ACKing the virtual timer interrupt.

This means we rework a bit of the code to factor out the functionality
to snapshot the timer state from vtimer_save_state(), and we can reuse
this functionality in the sync path when we have an irqchip in
userspace, and also to support our implementation of the
get_input_level() function for the timer.

This change also means that we can no longer rely on the timer's view of
the interrupt line to set the active state, because we no longer
maintain this state for mapped interrupts when exiting from the guest.
Instead, we only set the active state if the virtual interrupt is
active, and otherwise we simply let the timer fire again and raise the
virtual interrupt from the ISR.
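
The shape of the get_input_level() callback can be sketched in plain C
as below.  This is a simplified model under stated assumptions, not the
code in this patch: struct model_timer and struct model_hw are
hypothetical stand-ins for struct arch_timer_context and the CNTV_*
system registers, and the counter offset (CNTVOFF) is ignored.  The two
control bits follow the ARM Generic Timer definition (ENABLE is bit 0,
IMASK is bit 1).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* CNTV_CTL bits as defined by the ARM Generic Timer architecture. */
#define MODEL_CTL_ENABLE (1u << 0)
#define MODEL_CTL_IMASK  (1u << 1)

/* Hypothetical stand-in for struct arch_timer_context. */
struct model_timer {
	bool loaded;       /* state currently lives in the hardware */
	uint32_t cnt_ctl;  /* cached control register */
	uint64_t cnt_cval; /* cached compare value */
};

/* Hypothetical stand-in for the hardware timer registers. */
struct model_hw {
	uint32_t ctl;
	uint64_t cval;
	uint64_t now;      /* current counter value */
};

/* Copy the live register values into the cached timer context. */
void model_snapshot_state(struct model_timer *t, const struct model_hw *hw)
{
	t->cnt_ctl = hw->ctl;
	t->cnt_cval = hw->cval;
}

/* The line is high when the timer is enabled, unmasked and expired. */
bool model_should_fire(const struct model_timer *t, uint64_t now)
{
	bool can_fire = (t->cnt_ctl & MODEL_CTL_ENABLE) &&
			!(t->cnt_ctl & MODEL_CTL_IMASK);

	return can_fire && t->cnt_cval <= now;
}

/* Refresh the cached state only while the timer is loaded on the CPU. */
bool model_get_input_level(struct model_timer *t, const struct model_hw *hw)
{
	assert(t && hw);

	if (t->loaded)
		model_snapshot_state(t, hw);

	return model_should_fire(t, hw->now);
}
```

The point of the split is that the same snapshot helper serves the save
path, the userspace-irqchip sync path and the input-level query, while
an unloaded timer is answered from the cached copy alone.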

Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
---
 include/kvm/arm_arch_timer.h |  2 ++
 virt/kvm/arm/arch_timer.c    | 82 +++++++++++++++++++-------------------------
 2 files changed, 38 insertions(+), 46 deletions(-)

diff --git a/include/kvm/arm_arch_timer.h b/include/kvm/arm_arch_timer.h
index 01ee473517e2..f57f795d704c 100644
--- a/include/kvm/arm_arch_timer.h
+++ b/include/kvm/arm_arch_timer.h
@@ -90,6 +90,8 @@ void kvm_timer_vcpu_put(struct kvm_vcpu *vcpu);
 
 void kvm_timer_init_vhe(void);
 
+bool kvm_arch_timer_get_input_level(int vintid);
+
 #define vcpu_vtimer(v)	(&(v)->arch.timer_cpu.vtimer)
 #define vcpu_ptimer(v)	(&(v)->arch.timer_cpu.ptimer)
 
diff --git a/virt/kvm/arm/arch_timer.c b/virt/kvm/arm/arch_timer.c
index e78ba5e20f74..f8d09665ddce 100644
--- a/virt/kvm/arm/arch_timer.c
+++ b/virt/kvm/arm/arch_timer.c
@@ -343,6 +343,12 @@ static void kvm_timer_update_state(struct kvm_vcpu *vcpu)
 	phys_timer_emulate(vcpu);
 }
 
+static void __timer_snapshot_state(struct arch_timer_context *timer)
+{
+	timer->cnt_ctl = read_sysreg_el0(cntv_ctl);
+	timer->cnt_cval = read_sysreg_el0(cntv_cval);
+}
+
 static void vtimer_save_state(struct kvm_vcpu *vcpu)
 {
 	struct arch_timer_cpu *timer = &vcpu->arch.timer_cpu;
@@ -354,10 +360,8 @@ static void vtimer_save_state(struct kvm_vcpu *vcpu)
 	if (!vtimer->loaded)
 		goto out;
 
-	if (timer->enabled) {
-		vtimer->cnt_ctl = read_sysreg_el0(cntv_ctl);
-		vtimer->cnt_cval = read_sysreg_el0(cntv_cval);
-	}
+	if (timer->enabled)
+		__timer_snapshot_state(vtimer);
 
 	/* Disable the virtual timer */
 	write_sysreg_el0(0, cntv_ctl);
@@ -454,8 +458,7 @@ static void kvm_timer_vcpu_load_vgic(struct kvm_vcpu *vcpu)
 	bool phys_active;
 	int ret;
 
-	phys_active = vtimer->irq.level ||
-		      kvm_vgic_map_is_active(vcpu, vtimer->irq.irq);
+	phys_active = kvm_vgic_map_is_active(vcpu, vtimer->irq.irq);
 
 	ret = irq_set_irqchip_state(host_vtimer_irq,
 				    IRQCHIP_STATE_ACTIVE,
@@ -541,54 +544,25 @@ void kvm_timer_vcpu_put(struct kvm_vcpu *vcpu)
 	set_cntvoff(0);
 }
 
-static void unmask_vtimer_irq(struct kvm_vcpu *vcpu)
+/*
+ * With a userspace irqchip we have to check if the guest de-asserted the
+ * timer and if so, unmask the timer irq signal on the host interrupt
+ * controller to ensure that we see future timer signals.
+ */
+static void unmask_vtimer_irq_user(struct kvm_vcpu *vcpu)
 {
 	struct arch_timer_context *vtimer = vcpu_vtimer(vcpu);
 
 	if (unlikely(!irqchip_in_kernel(vcpu->kvm))) {
-		kvm_vtimer_update_mask_user(vcpu);
-		return;
-	}
-
-	/*
-	 * If the guest disabled the timer without acking the interrupt, then
-	 * we must make sure the physical and virtual active states are in
-	 * sync by deactivating the physical interrupt, because otherwise we
-	 * wouldn't see the next timer interrupt in the host.
-	 */
-	if (!kvm_vgic_map_is_active(vcpu, vtimer->irq.irq)) {
-		int ret;
-		ret = irq_set_irqchip_state(host_vtimer_irq,
-					    IRQCHIP_STATE_ACTIVE,
-					    false);
-		WARN_ON(ret);
+		__timer_snapshot_state(vtimer);
+		if (!kvm_timer_should_fire(vtimer))
+			kvm_vtimer_update_mask_user(vcpu);
 	}
 }
 
-/**
- * kvm_timer_sync_hwstate - sync timer state from cpu
- * @vcpu: The vcpu pointer
- *
- * Check if any of the timers have expired while we were running in the guest,
- * and inject an interrupt if that was the case.
- */
 void kvm_timer_sync_hwstate(struct kvm_vcpu *vcpu)
 {
-	struct arch_timer_context *vtimer = vcpu_vtimer(vcpu);
-
-	/*
-	 * If we entered the guest with the vtimer output asserted we have to
-	 * check if the guest has modified the timer so that we should lower
-	 * the line at this point.
-	 */
-	if (vtimer->irq.level) {
-		vtimer->cnt_ctl = read_sysreg_el0(cntv_ctl);
-		vtimer->cnt_cval = read_sysreg_el0(cntv_cval);
-		if (!kvm_timer_should_fire(vtimer)) {
-			kvm_timer_update_irq(vcpu, false, vtimer);
-			unmask_vtimer_irq(vcpu);
-		}
-	}
+	unmask_vtimer_irq_user(vcpu);
 }
 
 int kvm_timer_vcpu_reset(struct kvm_vcpu *vcpu)
@@ -819,6 +793,22 @@ static bool timer_irqs_are_valid(struct kvm_vcpu *vcpu)
 	return true;
 }
 
+bool kvm_arch_timer_get_input_level(int vintid)
+{
+	struct kvm_vcpu *vcpu = kvm_arm_get_running_vcpu();
+	struct arch_timer_context *timer;
+
+	if (vintid == vcpu_vtimer(vcpu)->irq.irq)
+		timer = vcpu_vtimer(vcpu);
+	else
+		BUG(); /* We only map the vtimer so far */
+
+	if (timer->loaded)
+		__timer_snapshot_state(timer);
+
+	return kvm_timer_should_fire(timer);
+}
+
 int kvm_timer_enable(struct kvm_vcpu *vcpu)
 {
 	struct arch_timer_cpu *timer = &vcpu->arch.timer_cpu;
@@ -841,7 +831,7 @@ int kvm_timer_enable(struct kvm_vcpu *vcpu)
 	}
 
 	ret = kvm_vgic_map_phys_irq(vcpu, host_vtimer_irq, vtimer->irq.irq,
-				    NULL);
+				    kvm_arch_timer_get_input_level);
 	if (ret)
 		return ret;
 
-- 
2.14.2

^ permalink raw reply related	[flat|nested] 44+ messages in thread

* [PATCH v8 8/9] KVM: arm/arm64: Avoid work when userspace iqchips are not used
  2017-12-13 10:45 ` Christoffer Dall
@ 2017-12-13 10:46   ` Christoffer Dall
  -1 siblings, 0 replies; 44+ messages in thread
From: Christoffer Dall @ 2017-12-13 10:46 UTC (permalink / raw)
  To: kvmarm, linux-arm-kernel; +Cc: Marc Zyngier, Andre Przywara, kvm

We currently check if the VM has a userspace irqchip on every exit from
the VCPU, and if so, we do some work to ensure correct timer behavior.
This is unfortunate, as we could avoid doing any work entirely, if we
didn't have to support irqchip in userspace.

Realizing that the userspace irqchip on ARM is mostly a developer or hobby
feature, and is unlikely to be used in servers or other scenarios where
performance is a priority, we can use a refcounted static key to only
check the irqchip configuration when we have at least one VM that uses
an irqchip in userspace.
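
The counting scheme can be modelled in user space as follows.  This is
only a sketch: a real static key patches the branch in the instruction
stream, whereas this model uses an ordinary integer, and the model_*
names are hypothetical.  It also leaves out the timer->enabled check
that the real terminate path performs.

```c
#include <assert.h>
#include <stdbool.h>

/* Models the static key's reference count (hypothetical name). */
static int model_userspace_irqchip_users;

/* Called when a VCPU's timer is enabled. */
void model_timer_enable(bool irqchip_in_kernel)
{
	if (!irqchip_in_kernel)
		model_userspace_irqchip_users++;
}

/* Called when a VCPU's timer is torn down. */
void model_timer_terminate(bool irqchip_in_kernel)
{
	if (!irqchip_in_kernel) {
		assert(model_userspace_irqchip_users > 0);
		model_userspace_irqchip_users--;
	}
}

/*
 * Models the per-exit sync path.  Returns true when the slow-path work
 * for a userspace irqchip would run for this VCPU.
 */
bool model_sync_hwstate(bool irqchip_in_kernel)
{
	/* Fast path: no VM uses a userspace irqchip. */
	if (model_userspace_irqchip_users == 0)
		return false;

	/* Slow path: only now look at this VM's configuration. */
	return !irqchip_in_kernel;
}
```

With the count at zero, every exit takes the early return without
looking at the VM's irqchip configuration at all, which is the saving
described above.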

Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
---
 virt/kvm/arm/arch_timer.c | 11 +++++++++--
 1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/virt/kvm/arm/arch_timer.c b/virt/kvm/arm/arch_timer.c
index f8d09665ddce..73d262c4712b 100644
--- a/virt/kvm/arm/arch_timer.c
+++ b/virt/kvm/arm/arch_timer.c
@@ -51,6 +51,8 @@ static void kvm_timer_update_irq(struct kvm_vcpu *vcpu, bool new_level,
 				 struct arch_timer_context *timer_ctx);
 static bool kvm_timer_should_fire(struct arch_timer_context *timer_ctx);
 
+static DEFINE_STATIC_KEY_FALSE(userspace_irqchip_in_use);
+
 u64 kvm_phys_timer_read(void)
 {
 	return timecounter->cc->read(timecounter->cc);
@@ -562,7 +564,8 @@ static void unmask_vtimer_irq_user(struct kvm_vcpu *vcpu)
 
 void kvm_timer_sync_hwstate(struct kvm_vcpu *vcpu)
 {
-	unmask_vtimer_irq_user(vcpu);
+	if (static_branch_unlikely(&userspace_irqchip_in_use))
+		unmask_vtimer_irq_user(vcpu);
 }
 
 int kvm_timer_vcpu_reset(struct kvm_vcpu *vcpu)
@@ -767,6 +770,8 @@ void kvm_timer_vcpu_terminate(struct kvm_vcpu *vcpu)
 	soft_timer_cancel(&timer->bg_timer, &timer->expired);
 	soft_timer_cancel(&timer->phys_timer, NULL);
 	kvm_vgic_unmap_phys_irq(vcpu, vtimer->irq.irq);
+	if (timer->enabled && !irqchip_in_kernel(vcpu->kvm))
+		static_branch_dec(&userspace_irqchip_in_use);
 }
 
 static bool timer_irqs_are_valid(struct kvm_vcpu *vcpu)
@@ -819,8 +824,10 @@ int kvm_timer_enable(struct kvm_vcpu *vcpu)
 		return 0;
 
 	/* Without a VGIC we do not map virtual IRQs to physical IRQs */
-	if (!irqchip_in_kernel(vcpu->kvm))
+	if (!irqchip_in_kernel(vcpu->kvm)) {
+		static_branch_inc(&userspace_irqchip_in_use);
 		goto no_vgic;
+	}
 
 	if (!vgic_initialized(vcpu->kvm))
 		return -ENODEV;
-- 
2.14.2

^ permalink raw reply related	[flat|nested] 44+ messages in thread

* [PATCH v8 9/9] KVM: arm/arm64: Update timer and forwarded irq documentation
  2017-12-13 10:45 ` Christoffer Dall
@ 2017-12-13 10:46   ` Christoffer Dall
  -1 siblings, 0 replies; 44+ messages in thread
From: Christoffer Dall @ 2017-12-13 10:46 UTC (permalink / raw)
  To: kvmarm, linux-arm-kernel; +Cc: Marc Zyngier, Andre Przywara, kvm

Now that we've reworked how mapped level-triggered interrupts are
processed for the timer interrupts, we update the documentation
correspondingly.

Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
---
 Documentation/virtual/kvm/arm/vgic-mapped-irqs.txt | 50 ++++++++++------------
 1 file changed, 23 insertions(+), 27 deletions(-)

diff --git a/Documentation/virtual/kvm/arm/vgic-mapped-irqs.txt b/Documentation/virtual/kvm/arm/vgic-mapped-irqs.txt
index 38bca2835278..f68c7d95a341 100644
--- a/Documentation/virtual/kvm/arm/vgic-mapped-irqs.txt
+++ b/Documentation/virtual/kvm/arm/vgic-mapped-irqs.txt
@@ -7,9 +7,10 @@ allowing software to inject virtual interrupts to a VM, which the guest
 OS sees as regular interrupts.  The code is famously known as the VGIC.
 
 Some of these virtual interrupts, however, correspond to physical
-interrupts from real physical devices.  One example could be the
-architected timer, which itself supports virtualization, and therefore
-lets a guest OS program the hardware device directly to raise an
+interrupts from real physical devices.  One example could be the ARM
+Generic Timers (also known as the "architected timers"), which are
+directly assigned to a VM while it's running, and therefore
+makes it possible for guest OSes to program the timers directly to raise an
 interrupt at some point in time.  When such an interrupt is raised, the
 host OS initially handles the interrupt and must somehow signal this
 event as a virtual interrupt to the guest.  Another example could be a
@@ -37,7 +38,7 @@ inactive.
 
 The LRs include an extra bit, called the HW bit.  When this bit is set,
 KVM must also program an additional field in the LR, the physical IRQ
-number, to link the virtual with the physical IRQ.
+number, to link the virtual and physical IRQs together.
 
 When the HW bit is set, KVM must EITHER set the Pending OR the Active
 bit, never both at the same time.
@@ -59,21 +60,21 @@ The state of forwarded physical interrupts is managed in the following way:
   - LR.Pending will stay set as long as the guest has not acked the interrupt.
   - LR.Pending transitions to LR.Active on the guest read of the IAR, as
     expected.
-  - On guest EOI, the *physical distributor* active bit gets cleared,
+  - On guest deactivate, the *physical distributor* active bit gets cleared,
     but the LR.Active is left untouched (set).
   - KVM clears the LR on VM exits when the physical distributor
     active state has been cleared.
 
 (*): The host handling is slightly more complicated.  For some forwarded
-interrupts (shared), KVM directly sets the active state on the physical
-distributor before entering the guest, because the interrupt is never actually
-handled on the host (see details on the timer as an example below).  For other
-forwarded interrupts (non-shared) the host does not deactivate the interrupt
-when the host ISR completes, but leaves the interrupt active until the guest
-deactivates it.  Leaving the interrupt active is allowed, because Linux
-configures the physical GIC with EOIMode=1, which causes EOI operations to
-perform a priority drop allowing the GIC to receive other interrupts of the
-default priority.
+interrupts (shared), in some cases, KVM directly sets the active state
+on the physical distributor before entering the guest, because the
+interrupt is never actually handled on the host (see details on the
+timer as an example below).  In other cases, the host does not
+deactivate the interrupt when the host ISR completes, but leaves the
+interrupt active until the guest deactivates it.  Leaving the interrupt
+active is allowed, because Linux configures the physical GIC with
+EOIMode=1, which causes EOI operations to perform a priority drop
+allowing the GIC to receive other interrupts of the default priority.
 
 
 Forwarded Edge and Level Triggered PPIs and SPIs
@@ -170,18 +171,13 @@ instead:
 
 1.  KVM runs the VCPU
 2.  The guest programs the time to fire in T+100
-4.  At T+100 the timer fires and a physical IRQ causes the VM to exit
+3.  At T+100 the timer fires and a physical IRQ causes the VM to exit
     (note that this initially only traps to EL2 and does not run the host ISR
     until KVM has returned to the host).
-5.  With interrupts still disabled on the CPU coming back from the guest, KVM
-    stores the virtual timer state to memory and disables the virtual hw timer.
-6.  KVM looks at the timer state (in memory) and injects a forwarded physical
-    interrupt because it concludes the timer has expired.
-7.  KVM marks the timer interrupt as active on the physical distributor
-7.  KVM enables the timer, enables interrupts, and runs the VCPU
-
-Notice that again the forwarded physical interrupt is injected to the
-guest without having actually been handled on the host.  In this case it
-is because the physical interrupt is never actually seen by the host because the
-timer is disabled upon guest return, and the virtual forwarded interrupt is
-injected on the KVM guest entry path.
+4.  When KVM returns to EL1 and enables interrupts, the timer interrupt
+    fires again, and the kvm arch timer ISR runs and injects a virtual
+    interrupt to the guest.
+5.  Because the timer interrupt has the vcpu affinity set, as the ISR
+    completes, the physical interrupt stays active on the physical
+    distributor.
+6.  KVM enables the timer, enables interrupts, and runs the VCPU
-- 
2.14.2

^ permalink raw reply related	[flat|nested] 44+ messages in thread

* Re: [PATCH v8 3/9] KVM: arm/arm64: Don't cache the timer IRQ level
  2017-12-13 10:45   ` Christoffer Dall
@ 2017-12-13 19:38     ` Marc Zyngier
  -1 siblings, 0 replies; 44+ messages in thread
From: Marc Zyngier @ 2017-12-13 19:38 UTC (permalink / raw)
  To: Christoffer Dall
  Cc: kvmarm, linux-arm-kernel, kvm, Andre Przywara, Eric Auger

On Wed, 13 Dec 2017 10:45:56 +0000,
Christoffer Dall wrote:
> 
> The timer was modeled after a strict idea of modelling an interrupt line
> level in software, meaning that only transitions in the level needed to
> be reported to the VGIC.  This works well for the timer, because the
> arch timer code is in complete control of the device and can track the
> transitions of the line.
> 
> However, as we are about to support using the HW bit in the VGIC not
> just for the timer, but also for VFIO which cannot track transitions of
> the interrupt line, we have to decide on an interface for level
> triggered mapped interrupts to the GIC, which both the timer and VFIO
> can use.
> 
> VFIO only sees an asserting transition of the physical interrupt line,
> and tells the VGIC when that happens.  That means that part of the
> interrupt flow is offloaded to the hardware.
> 
> To use the same interface for VFIO devices and the timer, we therefore
> have to change the timer (we cannot change VFIO because it doesn't know
> the details of the device it is assigning to a VM).
> 
> Luckily, changing the timer is simple, we just need to stop 'caching'
> the line level, but instead let the VGIC know the state of the timer
> every time there is a potential change in the line level, and when the
> line level should be asserted from the timer ISR.  The VGIC can ignore
> extra notifications using its validate mechanism.
> 
> Reviewed-by: Andre Przywara <andre.przywara@arm.com>
> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>

Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>

	M.
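
For readers following along, the interface described in the commit
message can be modelled in a few lines of user-space C.  This is only a
sketch under stated assumptions, not the kernel code: vgic_line_model,
timer_model and the three functions are hypothetical names, and the
"validate" step is reduced to ignoring a notification that does not
change the line level.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of what the VGIC remembers about one input line. */
struct vgic_line_model {
	bool level;	/* last level the VGIC accepted */
	int updates;	/* number of notifications that changed the level */
};

/* Hypothetical model of the timer state the level is derived from. */
struct timer_model {
	bool enabled;
	bool masked;
	uint64_t cval;	/* compare value: the line asserts once now >= cval */
};

/* The level is recomputed from timer state on every call, never cached. */
static bool timer_should_fire(const struct timer_model *t, uint64_t now)
{
	return t->enabled && !t->masked && now >= t->cval;
}

/* Returns true if the notification changed the level, false if ignored. */
static bool vgic_inject_level(struct vgic_line_model *line, bool new_level)
{
	if (line->level == new_level)
		return false;	/* redundant notification */

	line->level = new_level;
	line->updates++;
	return true;
}

/* Called whenever the level may have changed; extra calls are harmless. */
static void timer_update_irq(struct vgic_line_model *line,
			     const struct timer_model *t, uint64_t now)
{
	assert(line && t);
	vgic_inject_level(line, timer_should_fire(t, now));
}
```

The point of the model is that the timer side keeps no copy of the line
level; only the receiving side decides whether a notification matters.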

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH v8 7/9] KVM: arm/arm64: Provide a get_input_level for the arch timer
  2017-12-13 10:46   ` Christoffer Dall
@ 2017-12-13 19:45     ` Marc Zyngier
  -1 siblings, 0 replies; 44+ messages in thread
From: Marc Zyngier @ 2017-12-13 19:45 UTC (permalink / raw)
  To: Christoffer Dall; +Cc: Andre Przywara, kvmarm, linux-arm-kernel, kvm

On Wed, 13 Dec 2017 10:46:00 +0000,
Christoffer Dall wrote:
> 
> The VGIC can now support the life-cycle of mapped level-triggered
> interrupts, and we no longer have to read back the timer state on every
> exit from the VM if we had an asserted timer interrupt signal, because
> the VGIC already knows if we hit the unlikely case where the guest
> disables the timer without ACKing the virtual timer interrupt.
> 
> This means we rework a bit of the code to factor out the functionality
> to snapshot the timer state from vtimer_save_state(), and we can reuse
> this functionality in the sync path when we have an irqchip in
> userspace, and also to support our implementation of the
> get_input_level() function for the timer.
> 
> This change also means that we can no longer rely on the timer's view of
> the interrupt line to set the active state, because we no longer
> maintain this state for mapped interrupts when exiting from the guest.
> Instead, we only set the active state if the virtual interrupt is
> active, and otherwise we simply let the timer fire again and raise the
> virtual interrupt from the ISR.
> 
> Reviewed-by: Eric Auger <eric.auger@redhat.com>
> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>

Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>

	M.

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH v8 8/9] KVM: arm/arm64: Avoid work when userspace iqchips are not used
  2017-12-13 10:46   ` Christoffer Dall
@ 2017-12-13 20:05     ` Marc Zyngier
  -1 siblings, 0 replies; 44+ messages in thread
From: Marc Zyngier @ 2017-12-13 20:05 UTC (permalink / raw)
  To: Christoffer Dall
  Cc: kvmarm, linux-arm-kernel, kvm, Andre Przywara, Eric Auger

On Wed, 13 Dec 2017 10:46:01 +0000,
Christoffer Dall wrote:
> 
> We currently check if the VM has a userspace irqchip on every exit from
> the VCPU, and if so, we do some work to ensure correct timer behavior.
> This is unfortunate, as we could avoid doing any work entirely, if we
> didn't have to support irqchip in userspace.
> 
> Realizing the userspace irqchip on ARM is mostly a developer or hobby
> feature, and is unlikely to be used in servers or other scenarios where
> performance is a priority, we can use a refcounted static key to only
> check the irqchip configuration when we have at least one VM that uses
> an irqchip in userspace.
> 
> Reviewed-by: Eric Auger <eric.auger@redhat.com>
> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>

On its own, this doesn't seem to be that useful. As far as I can see,
it saves us a load from the kvm structure before giving up. I think it
is more the cumulative effect of this load that could have an impact,
but you're only dealing with it at a single location.

How about making this a first class helper and redefine
irqchip_in_kernel as such:

static inline bool irqchip_in_kernel(struct kvm *kvm)
{
	if (static_branch_unlikely(&userspace_irqchip_in_use) &&
	    unlikely(!irqchip_in_kernel(kvm)))
		return true;

	return false;
}

and move that static key to a more central location?
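
For illustration only: read literally, the helper above calls itself, so
the sketch below assumes the inner check reads a per-VM flag directly,
and it names the predicate after what the body computes (true for a VM
that uses a userspace irqchip).  It is a user-space model rather than
kernel code.  The plain counter stands in for the static key, and
vm_model, userspace_irqchip_get(), userspace_irqchip_put() and
vm_uses_userspace_irqchip() are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the static key: a plain reference count. */
static int userspace_irqchip_users;

/* Hypothetical per-VM state; only the irqchip placement matters here. */
struct vm_model {
	bool vgic_in_kernel;
};

static void userspace_irqchip_get(void)
{
	userspace_irqchip_users++;
}

static void userspace_irqchip_put(void)
{
	assert(userspace_irqchip_users > 0);	/* unbalanced put is a bug */
	userspace_irqchip_users--;
}

/*
 * True only if at least one VM uses a userspace irqchip (the cheap,
 * global check) and this VM is one of them (the per-VM load).
 */
static inline bool vm_uses_userspace_irqchip(const struct vm_model *vm)
{
	if (userspace_irqchip_users > 0 && !vm->vgic_in_kernel)
		return true;

	return false;
}
```

With the count at zero the per-VM flag is never read, which is the
saving discussed above.  The predicate is therefore only meaningful once
every VM without a VGIC has taken its reference.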

> ---
>  virt/kvm/arm/arch_timer.c | 11 +++++++++--
>  1 file changed, 9 insertions(+), 2 deletions(-)
> 
> diff --git a/virt/kvm/arm/arch_timer.c b/virt/kvm/arm/arch_timer.c
> index f8d09665ddce..73d262c4712b 100644
> --- a/virt/kvm/arm/arch_timer.c
> +++ b/virt/kvm/arm/arch_timer.c
> @@ -51,6 +51,8 @@ static void kvm_timer_update_irq(struct kvm_vcpu *vcpu, bool new_level,
>  				 struct arch_timer_context *timer_ctx);
>  static bool kvm_timer_should_fire(struct arch_timer_context *timer_ctx);
>  
> +static DEFINE_STATIC_KEY_FALSE(userspace_irqchip_in_use);
> +
>  u64 kvm_phys_timer_read(void)
>  {
>  	return timecounter->cc->read(timecounter->cc);
> @@ -562,7 +564,8 @@ static void unmask_vtimer_irq_user(struct kvm_vcpu *vcpu)
>  
>  void kvm_timer_sync_hwstate(struct kvm_vcpu *vcpu)
>  {
> -	unmask_vtimer_irq_user(vcpu);
> +	if (static_branch_unlikely(&userspace_irqchip_in_use))
> +		unmask_vtimer_irq_user(vcpu);
>  }
>  
>  int kvm_timer_vcpu_reset(struct kvm_vcpu *vcpu)
> @@ -767,6 +770,8 @@ void kvm_timer_vcpu_terminate(struct kvm_vcpu *vcpu)
>  	soft_timer_cancel(&timer->bg_timer, &timer->expired);
>  	soft_timer_cancel(&timer->phys_timer, NULL);
>  	kvm_vgic_unmap_phys_irq(vcpu, vtimer->irq.irq);
> +	if (timer->enabled && !irqchip_in_kernel(vcpu->kvm))
> +		static_branch_dec(&userspace_irqchip_in_use);
>  }
>  
>  static bool timer_irqs_are_valid(struct kvm_vcpu *vcpu)
> @@ -819,8 +824,10 @@ int kvm_timer_enable(struct kvm_vcpu *vcpu)
>  		return 0;
>  
>  	/* Without a VGIC we do not map virtual IRQs to physical IRQs */
> -	if (!irqchip_in_kernel(vcpu->kvm))
> +	if (!irqchip_in_kernel(vcpu->kvm)) {
> +		static_branch_inc(&userspace_irqchip_in_use);
>  		goto no_vgic;
> +	}
>  
>  	if (!vgic_initialized(vcpu->kvm))
>  		return -ENODEV;
> -- 
> 2.14.2
> 

Thanks,

	M.

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH v8 9/9] KVM: arm/arm64: Update timer and forwarded irq documentation
  2017-12-13 10:46   ` Christoffer Dall
@ 2017-12-13 20:15     ` Marc Zyngier
  -1 siblings, 0 replies; 44+ messages in thread
From: Marc Zyngier @ 2017-12-13 20:15 UTC (permalink / raw)
  To: Christoffer Dall
  Cc: kvmarm, linux-arm-kernel, kvm, Andre Przywara, Eric Auger

On Wed, 13 Dec 2017 10:46:02 +0000,
Christoffer Dall wrote:
> 
> Now when we've reworked how mapped level-triggered interrupts are
> processed for the timer interrupts, we update the documentation
> correspondingly.

Seems like the documentation is more out of date than we thought, see
below.

> 
> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
> ---
>  Documentation/virtual/kvm/arm/vgic-mapped-irqs.txt | 50 ++++++++++------------
>  1 file changed, 23 insertions(+), 27 deletions(-)
> 
> diff --git a/Documentation/virtual/kvm/arm/vgic-mapped-irqs.txt b/Documentation/virtual/kvm/arm/vgic-mapped-irqs.txt
> index 38bca2835278..f68c7d95a341 100644
> --- a/Documentation/virtual/kvm/arm/vgic-mapped-irqs.txt
> +++ b/Documentation/virtual/kvm/arm/vgic-mapped-irqs.txt
> @@ -7,9 +7,10 @@ allowing software to inject virtual interrupts to a VM, which the guest
>  OS sees as regular interrupts.  The code is famously known as the VGIC.
>  
>  Some of these virtual interrupts, however, correspond to physical
> -interrupts from real physical devices.  One example could be the
> -architected timer, which itself supports virtualization, and therefore
> -lets a guest OS program the hardware device directly to raise an
> +interrupts from real physical devices.  One example could be the ARM
> +Generic Timers (also known as the "architected timers"), which are
> +directly assigned to a VM while it's running, and therefore
> +makes it possible for guest OSes to program the timers directly to raise an
>  interrupt at some point in time.  When such an interrupt is raised, the
>  host OS initially handles the interrupt and must somehow signal this
>  event as a virtual interrupt to the guest.  Another example could be a
> @@ -37,7 +38,7 @@ inactive.
>  
>  The LRs include an extra bit, called the HW bit.  When this bit is set,
>  KVM must also program an additional field in the LR, the physical IRQ
> -number, to link the virtual with the physical IRQ.
> +number, to link the virtual and physical IRQs together.
>  
>  When the HW bit is set, KVM must EITHER set the Pending OR the Active
>  bit, never both at the same time.
> @@ -59,21 +60,21 @@ The state of forwarded physical interrupts is managed in the following way:
>    - LR.Pending will stay set as long as the guest has not acked the interrupt.
>    - LR.Pending transitions to LR.Active on the guest read of the IAR, as
>      expected.
> -  - On guest EOI, the *physical distributor* active bit gets cleared,
> +  - On guest deactivate, the *physical distributor* active bit gets cleared,
>      but the LR.Active is left untouched (set).

Is this true? I seem to remember that we established it wasn't (back
when we redesigned the vgic). Certainly, the current code relies on
the Active bit being cleared in the LR as well as in the physical
distributor.
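
To make the behaviour in question concrete, here is a small user-space
model of the life cycle as described in this reply, where a guest
deactivate clears the LR state as well as the physical active bit.  It
is a sketch of that reading only and does not settle the question.
lr_state, fwd_irq_model and the three functions are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>

enum lr_state { LR_INVALID, LR_PENDING, LR_ACTIVE };

/* Hypothetical model of one forwarded (HW bit set) interrupt. */
struct fwd_irq_model {
	enum lr_state lr;	/* state held in the list register */
	bool hw;		/* LR.HW: virtual IRQ linked to a physical IRQ */
	bool phys_active;	/* active bit on the physical distributor */
};

/* Host leaves the physical IRQ active and queues it as pending. */
static void host_forward(struct fwd_irq_model *irq)
{
	irq->phys_active = true;
	irq->hw = true;
	irq->lr = LR_PENDING;	/* pending OR active, never both */
}

/* Guest reads the IAR: pending becomes active. */
static void guest_ack(struct fwd_irq_model *irq)
{
	assert(irq->lr == LR_PENDING);
	irq->lr = LR_ACTIVE;
}

/*
 * Guest deactivates: in this model the LR state is cleared and, because
 * HW is set, the physical distributor active bit is cleared with it.
 */
static void guest_deactivate(struct fwd_irq_model *irq)
{
	assert(irq->lr == LR_ACTIVE);
	irq->lr = LR_INVALID;
	if (irq->hw)
		irq->phys_active = false;
}
```

Under this model there is nothing left for KVM to clear in the LR on
the next exit, which is why the following bullet in the document would
also need to change.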

>    - KVM clears the LR on VM exits when the physical distributor
>      active state has been cleared.

And this isn't either, if my above assertion stands.

>  
>  (*): The host handling is slightly more complicated.  For some forwarded
> -interrupts (shared), KVM directly sets the active state on the physical
> -distributor before entering the guest, because the interrupt is never actually
> -handled on the host (see details on the timer as an example below).  For other
> -forwarded interrupts (non-shared) the host does not deactivate the interrupt
> -when the host ISR completes, but leaves the interrupt active until the guest
> -deactivates it.  Leaving the interrupt active is allowed, because Linux
> -configures the physical GIC with EOIMode=1, which causes EOI operations to
> -perform a priority drop allowing the GIC to receive other interrupts of the
> -default priority.
> +interrupts (shared), in some cases, KVM directly sets the active state
> +on the physical distributor before entering the guest, because the
> +interrupt is never actually handled on the host (see details on the
> +timer as an example below).  In other cases, the host does not

This isn't true either. We now handle the timer interrupt on the host.

> +deactivate the interrupt when the host ISR completes, but leaves the
> +interrupt active until the guest deactivates it.  Leaving the interrupt
> +active is allowed, because Linux configures the physical GIC with
> +EOIMode=1, which causes EOI operations to perform a priority drop
> +allowing the GIC to receive other interrupts of the default priority.
>  
>  
>  Forwarded Edge and Level Triggered PPIs and SPIs
> @@ -170,18 +171,13 @@ instead:
>  
>  1.  KVM runs the VCPU
>  2.  The guest programs the time to fire in T+100
> -4.  At T+100 the timer fires and a physical IRQ causes the VM to exit
> +3.  At T+100 the timer fires and a physical IRQ causes the VM to exit
>      (note that this initially only traps to EL2 and does not run the host ISR
>      until KVM has returned to the host).
> -5.  With interrupts still disabled on the CPU coming back from the guest, KVM
> -    stores the virtual timer state to memory and disables the virtual hw timer.
> -6.  KVM looks at the timer state (in memory) and injects a forwarded physical
> -    interrupt because it concludes the timer has expired.
> -7.  KVM marks the timer interrupt as active on the physical distributor
> -7.  KVM enables the timer, enables interrupts, and runs the VCPU
> -
> -Notice that again the forwarded physical interrupt is injected to the
> -guest without having actually been handled on the host.  In this case it
> -is because the physical interrupt is never actually seen by the host because the
> -timer is disabled upon guest return, and the virtual forwarded interrupt is
> -injected on the KVM guest entry path.
> +4.  When KVM returns to EL1 and enables interrupts, the timer interrupt
> +    fires again, and the kvm arch timer ISR runs and injects a virtual
> +    interrupt to the guest.
> +5.  Because the timer interrupt has the vcpu affinity set, as the ISR
> +    completes, the physical interrupt stays active on the physical
> +    distributor.
> +6.  KVM enables the timer, enables interrupts, and runs the VCPU
> -- 
> 2.14.2
> 

Thanks,

	M.

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH v8 8/9] KVM: arm/arm64: Avoid work when userspace iqchips are not used
  2017-12-13 20:05     ` Marc Zyngier
@ 2017-12-19 13:34       ` Christoffer Dall
  -1 siblings, 0 replies; 44+ messages in thread
From: Christoffer Dall @ 2017-12-19 13:34 UTC (permalink / raw)
  To: Marc Zyngier; +Cc: kvmarm, linux-arm-kernel, kvm, Andre Przywara, Eric Auger

On Wed, Dec 13, 2017 at 08:05:33PM +0000, Marc Zyngier wrote:
> On Wed, 13 Dec 2017 10:46:01 +0000,
> Christoffer Dall wrote:
> > 
> > We currently check if the VM has a userspace irqchip on every exit from
> > the VCPU, and if so, we do some work to ensure correct timer behavior.
> > This is unfortunate, as we could avoid doing any work entirely, if we
> > didn't have to support irqchip in userspace.
> > 
> > Realizing the userspace irqchip on ARM is mostly a developer or hobby
> > feature, and is unlikely to be used in servers or other scenarios where
> > performance is a priority, we can use a refcounted static key to only
> > check the irqchip configuration when we have at least one VM that uses
> > an irqchip in userspace.
> > 
> > Reviewed-by: Eric Auger <eric.auger@redhat.com>
> > Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
> 
> On its own, this doesn't seem to be that useful. As far as I can see,
> it saves us a load from the kvm structure before giving up.

A load and a conditional.  What I really also wanted to avoid was the
function call from the main run loop, which this patch does not yet do.
I think I can achieve that with a static inline wrapper in the arch
timer header file, which first evaluates the static key and only then
calls into the arch timer code.
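
Roughly, and only as a user-space model of the idea: the inline wrapper
tests the key at the call site and makes the out-of-line call only when
it is set.  The names below (userspace_irqchip_key, vcpu_model,
timer_sync_slow, timer_sync) are hypothetical, and a plain bool stands
in for the static key.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the static key. */
static bool userspace_irqchip_key;

/* Counts how often the out-of-line path was actually entered. */
static int slow_path_calls;

/* Hypothetical per-VCPU state touched by the slow path. */
struct vcpu_model {
	bool timer_irq_masked;
};

/* Out-of-line slow path: stand-in for the code in the timer .c file. */
static void timer_sync_slow(struct vcpu_model *vcpu)
{
	assert(vcpu);
	slow_path_calls++;
	vcpu->timer_irq_masked = false;
}

/*
 * Inline wrapper: stand-in for what would live in the header.  When the
 * key is off, the run loop pays for the check and nothing else.
 */
static inline void timer_sync(struct vcpu_model *vcpu)
{
	if (userspace_irqchip_key)
		timer_sync_slow(vcpu);
}
```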


> I think it
> is more the cumulative effect of this load that could have an impact,
> but you're only dealing with it at a single location.
> 
> How about making this a first class helper and redefine
> irqchip_in_kernel as such:
> 
> static inline bool irqchip_in_kernel(struct kvm *kvm)
> {
> 	if (static_branch_unlikely(&userspace_irqchip_in_use) &&
> 	    unlikely(!irqchip_in_kernel(kvm)))
> 		return true;
> 
> 	return false;
> }
> 
> and move that static key to a more central location?
> 

That's a neat idea.  The only problem is that creating a new VM would
then flip the static key, and then we'd have to flip it back when a vgic
is created on that VM, and I don't particularly like the idea of doing
this too often.

What I'd suggest then is to have two versions of the function:
irqchip_in_kernel() which is what it is today, and then
__irqchip_in_kernel() which can only be called from within the critical
path of the run loop, so that we can increment the static key on
kvm_vcpu_first_run_init() when we don't have a VGIC.

How does that sound?
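
A user-space model of what I have in mind, as a sketch only.  All names
are hypothetical, and fast_irqchip_in_kernel() stands in for the
proposed __irqchip_in_kernel().  The key is a plain counter here, taken
on a VCPU's first run rather than at VM creation.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the refcounted static key. */
static int key_count;

struct vm_model {
	bool vgic_in_kernel;
};

struct vcpu_model {
	struct vm_model *vm;
	bool ran_once;
	bool holds_key_ref;
};

/* Today's check: always reads the per-VM flag. */
static bool slow_irqchip_in_kernel(const struct vm_model *vm)
{
	return vm->vgic_in_kernel;
}

/*
 * Run-loop-only check: valid only after vcpu_first_run_init(), because
 * before that a VM without a VGIC has not taken its key reference yet.
 */
static bool fast_irqchip_in_kernel(const struct vm_model *vm)
{
	if (key_count > 0 && !vm->vgic_in_kernel)
		return false;

	return true;
}

/* The key is taken here, so creating a VM never flips it. */
static void vcpu_first_run_init(struct vcpu_model *vcpu)
{
	if (vcpu->ran_once)
		return;

	vcpu->ran_once = true;
	if (!slow_irqchip_in_kernel(vcpu->vm)) {
		key_count++;
		vcpu->holds_key_ref = true;
	}
}

/* Drop the reference taken on first run, if any. */
static void vcpu_teardown(struct vcpu_model *vcpu)
{
	if (!vcpu->holds_key_ref)
		return;

	assert(key_count > 0);
	key_count--;
	vcpu->holds_key_ref = false;
}
```

A VM that gets its VGIC between creation and first run never touches
the key in this model, which is the case I want to keep cheap.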

Thanks,
-Christoffer


> > ---
> >  virt/kvm/arm/arch_timer.c | 11 +++++++++--
> >  1 file changed, 9 insertions(+), 2 deletions(-)
> > 
> > diff --git a/virt/kvm/arm/arch_timer.c b/virt/kvm/arm/arch_timer.c
> > index f8d09665ddce..73d262c4712b 100644
> > --- a/virt/kvm/arm/arch_timer.c
> > +++ b/virt/kvm/arm/arch_timer.c
> > @@ -51,6 +51,8 @@ static void kvm_timer_update_irq(struct kvm_vcpu *vcpu, bool new_level,
> >  				 struct arch_timer_context *timer_ctx);
> >  static bool kvm_timer_should_fire(struct arch_timer_context *timer_ctx);
> >  
> > +static DEFINE_STATIC_KEY_FALSE(userspace_irqchip_in_use);
> > +
> >  u64 kvm_phys_timer_read(void)
> >  {
> >  	return timecounter->cc->read(timecounter->cc);
> > @@ -562,7 +564,8 @@ static void unmask_vtimer_irq_user(struct kvm_vcpu *vcpu)
> >  
> >  void kvm_timer_sync_hwstate(struct kvm_vcpu *vcpu)
> >  {
> > -	unmask_vtimer_irq_user(vcpu);
> > +	if (static_branch_unlikely(&userspace_irqchip_in_use))
> > +		unmask_vtimer_irq_user(vcpu);
> >  }
> >  
> >  int kvm_timer_vcpu_reset(struct kvm_vcpu *vcpu)
> > @@ -767,6 +770,8 @@ void kvm_timer_vcpu_terminate(struct kvm_vcpu *vcpu)
> >  	soft_timer_cancel(&timer->bg_timer, &timer->expired);
> >  	soft_timer_cancel(&timer->phys_timer, NULL);
> >  	kvm_vgic_unmap_phys_irq(vcpu, vtimer->irq.irq);
> > +	if (timer->enabled && !irqchip_in_kernel(vcpu->kvm))
> > +		static_branch_dec(&userspace_irqchip_in_use);
> >  }
> >  
> >  static bool timer_irqs_are_valid(struct kvm_vcpu *vcpu)
> > @@ -819,8 +824,10 @@ int kvm_timer_enable(struct kvm_vcpu *vcpu)
> >  		return 0;
> >  
> >  	/* Without a VGIC we do not map virtual IRQs to physical IRQs */
> > -	if (!irqchip_in_kernel(vcpu->kvm))
> > +	if (!irqchip_in_kernel(vcpu->kvm)) {
> > +		static_branch_inc(&userspace_irqchip_in_use);
> >  		goto no_vgic;
> > +	}
> >  
> >  	if (!vgic_initialized(vcpu->kvm))
> >  		return -ENODEV;
> > -- 
> > 2.14.2
> > 
> 
> Thanks,
> 
> 	M.

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH v8 8/9] KVM: arm/arm64: Avoid work when userspace iqchips are not used
  2017-12-19 13:34       ` Christoffer Dall
@ 2017-12-19 13:55         ` Marc Zyngier
  -1 siblings, 0 replies; 44+ messages in thread
From: Marc Zyngier @ 2017-12-19 13:55 UTC (permalink / raw)
  To: Christoffer Dall
  Cc: kvmarm, linux-arm-kernel, kvm, Andre Przywara, Eric Auger

On 19/12/17 13:34, Christoffer Dall wrote:
> On Wed, Dec 13, 2017 at 08:05:33PM +0000, Marc Zyngier wrote:
>> On Wed, 13 Dec 2017 10:46:01 +0000,
>> Christoffer Dall wrote:
>>>
>>> We currently check if the VM has a userspace irqchip on every exit from
>>> the VCPU, and if so, we do some work to ensure correct timer behavior.
>>> This is unfortunate, as we could avoid doing any work entirely, if we
>>> didn't have to support irqchip in userspace.
>>>
>>> Realizing the userspace irqchip on ARM is mostly a developer or hobby
>>> feature, and is unlikely to be used in servers or other scenarios where
>>> performance is a priority, we can use a refcounted static key to only
>>> check the irqchip configuration when we have at least one VM that uses
>>> an irqchip in userspace.
>>>
>>> Reviewed-by: Eric Auger <eric.auger@redhat.com>
>>> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
>>
>> On its own, this doesn't seem to be that useful. As far as I can see,
>> it saves us a load from the kvm structure before giving up.
> 
> A load and a conditional.  But what I really wanted to also avoid was
> the function call from the main run loop, which I neglected as well.  I
> think I can achieve that with a static inline wrapper in the arch timer
> header file which first evaluates the static key and then calls into the
> arch timer code.
> 
> 
>> I think it
>> is more the cumulative effect of this load that could have an impact,
>> but you're only dealing with it at a single location.
>>
>> How about making this a first class helper and redefine
>> irqchip_in_kernel as such:
>>
>> static inline bool irqchip_in_kernel(struct kvm *kvm)
>> {
>> 	if (static_branch_unlikely(&userspace_irqchip_in_use) &&
>> 	    unlikely(!irqchip_in_kernel(kvm)))
>> 		return true;
>>
>> 	return false;
>> }
>>
>> and move that static key to a more central location?
>>
> 
> That's a neat idea.  The only problem is that creating a new VM would
> then flip the static key, and then we'd have to flip it back when a vgic
> is created on that VM, and I don't particularly like the idea of doing
> this too often.

Fair enough.

> 
> What I'd suggest then is to have two versions of the function:
> irqchip_in_kernel() which is what it is today, and then
> __irqchip_in_kernel() which can only be called from within the critical
> path of the run loop, so that we can increment the static key on
> kvm_vcpu_first_run_init() when we don't have a VGIC.
> 
> How does that sound?

OK, you only patch once per non-VGIC VM instead of twice per VGIC VM.
But you now create a distinction between what can be used at runtime and
what can be used at config time. The distinction is a bit annoying.
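
The patch counts can be checked with a small userspace model. This is
only a sketch: struct key_model and the two setup helpers are
hypothetical, a "patch" is counted on every 0 <-> 1 transition of the
reference count, and no other non-VGIC VM is assumed to exist at the
same time:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical refcounted key that counts how often code is patched. */
struct key_model {
	int refs;
	int patches;	/* number of 0 -> 1 and 1 -> 0 transitions */
};

static void key_inc(struct key_model *k)
{
	if (k->refs++ == 0)
		k->patches++;
}

static void key_dec(struct key_model *k)
{
	assert(k->refs > 0);
	if (--k->refs == 0)
		k->patches++;
}

/* Key tracks the irqchip state from VM creation onwards. */
static void setup_vm_eager(struct key_model *k, bool creates_vgic)
{
	key_inc(k);		/* a new VM has no VGIC yet */
	if (creates_vgic)
		key_dec(k);	/* VGIC created, flip back */
}

/* Key is only touched on the first VCPU run without a VGIC. */
static void setup_vm_lazy(struct key_model *k, bool creates_vgic)
{
	if (!creates_vgic)
		key_inc(k);
}
```

In this model the eager scheme patches twice while setting up a VGIC VM,
and the lazy scheme patches once while setting up a non-VGIC VM and not
at all for a VGIC VM.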

Also, does this actually show up on the radar?

Thanks,

	M.
-- 
Jazz is not dead. It just smells funny...

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH v8 3/9] KVM: arm/arm64: Don't cache the timer IRQ level
  2017-12-13 10:45   ` Christoffer Dall
@ 2017-12-19 14:17     ` Julien Thierry
  -1 siblings, 0 replies; 44+ messages in thread
From: Julien Thierry @ 2017-12-19 14:17 UTC (permalink / raw)
  To: Christoffer Dall, kvmarm, linux-arm-kernel
  Cc: Marc Zyngier, Andre Przywara, kvm

Hi Christoffer,

A few nits in the commit message.

On 13/12/17 10:45, Christoffer Dall wrote:
> The timer was modeled after a strict idea of modelling an interrupt line

nit: modelling (also, modeled after a strict idea of modelling?)

> level in software, meaning that only transitions in the level needed to

s/needed/need/ ?

> be reported to the VGIC.  This works well for the timer, because the
> arch timer code is in complete control of the device and can track the
> transitions of the line.
> 
> However, as we are about to support using the HW bit in the VGIC not
> just for the timer, but also for VFIO which cannot track transitions of
> the interrupt line, we have to decide on an interface for level
> triggered mapped interrupts to the GIC, which both the timer and VFIO

"level triggered interrupts mapped to the GIC" ?

> can use.
> 
> VFIO only sees an asserting transition of the physical interrupt line,
> and tells the VGIC when that happens.  That means that part of the
> interrupt flow is offloaded to the hardware.
> 
> To use the same interface for VFIO devices and the timer, we therefore
> have to change the timer (we cannot change VFIO because it doesn't know
> the details of the device it is assigning to a VM).
> 
> Luckily, changing the timer is simple, we just need to stop 'caching'
> the line level, but instead let the VGIC know the state of the timer
> every time there is a potential change in the line level, and when the
> line level should be asserted from the timer ISR.  The VGIC can ignore
> extra notifications using its validate mechanism.
> 
> Reviewed-by: Andre Przywara <andre.przywara@arm.com>
> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>

Reviewed-by: Julien Thierry <julien.thierry@arm.com>

> ---
>   virt/kvm/arm/arch_timer.c | 20 +++++++++++++-------
>   1 file changed, 13 insertions(+), 7 deletions(-)
> 
> diff --git a/virt/kvm/arm/arch_timer.c b/virt/kvm/arm/arch_timer.c
> index 4151250ce8da..dd5aca05c500 100644
> --- a/virt/kvm/arm/arch_timer.c
> +++ b/virt/kvm/arm/arch_timer.c
> @@ -99,11 +99,9 @@ static irqreturn_t kvm_arch_timer_handler(int irq, void *dev_id)
>   	}
>   	vtimer = vcpu_vtimer(vcpu);
>   
> -	if (!vtimer->irq.level) {
> -		vtimer->cnt_ctl = read_sysreg_el0(cntv_ctl);
> -		if (kvm_timer_irq_can_fire(vtimer))
> -			kvm_timer_update_irq(vcpu, true, vtimer);
> -	}
> +	vtimer->cnt_ctl = read_sysreg_el0(cntv_ctl);
> +	if (kvm_timer_irq_can_fire(vtimer))
> +		kvm_timer_update_irq(vcpu, true, vtimer);
>   
>   	if (unlikely(!irqchip_in_kernel(vcpu->kvm)))
>   		kvm_vtimer_update_mask_user(vcpu);
> @@ -324,12 +322,20 @@ static void kvm_timer_update_state(struct kvm_vcpu *vcpu)
>   	struct arch_timer_cpu *timer = &vcpu->arch.timer_cpu;
>   	struct arch_timer_context *vtimer = vcpu_vtimer(vcpu);
>   	struct arch_timer_context *ptimer = vcpu_ptimer(vcpu);
> +	bool level;
>   
>   	if (unlikely(!timer->enabled))
>   		return;
>   
> -	if (kvm_timer_should_fire(vtimer) != vtimer->irq.level)
> -		kvm_timer_update_irq(vcpu, !vtimer->irq.level, vtimer);
> +	/*
> +	 * The vtimer virtual interrupt is a 'mapped' interrupt, meaning part
> +	 * of its lifecycle is offloaded to the hardware, and we therefore may
> +	 * not have lowered the irq.level value before having to signal a new
> +	 * interrupt, but have to signal an interrupt every time the level is
> +	 * asserted.
> +	 */
> +	level = kvm_timer_should_fire(vtimer);
> +	kvm_timer_update_irq(vcpu, level, vtimer);
>   
>   	if (kvm_timer_should_fire(ptimer) != ptimer->irq.level)
>   		kvm_timer_update_irq(vcpu, !ptimer->irq.level, ptimer);
> 
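
The behavioural difference can be modelled in userspace roughly as
follows. This is a simplified sketch under stated assumptions, not the
VGIC code: all struct and function names are hypothetical, the VGIC is
reduced to a single line level that drops repeated notifications, and
guest_deactivate() stands for the part of the interrupt lifecycle that
the hardware handles without the timer code seeing it:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical VGIC-side state for one mapped, level-triggered IRQ. */
struct vgic_model {
	bool line_level;
	int injections;		/* times the IRQ was made pending */
};

/* Stand-in for the VGIC input path: a repeated level is dropped. */
static void vgic_set_level(struct vgic_model *vgic, bool level)
{
	if (vgic->line_level == level)
		return;		/* redundant notification, ignored */
	vgic->line_level = level;
	if (level)
		vgic->injections++;
}

/* Hardware-assisted deactivation: the timer code never sees this. */
static void guest_deactivate(struct vgic_model *vgic)
{
	assert(vgic->line_level);	/* model only deactivates an asserted line */
	vgic->line_level = false;
}

struct timer_model {
	bool cached_level;	/* models vtimer->irq.level */
};

/* Old behaviour: only report when the cached level changes. */
static void update_cached(struct timer_model *t, struct vgic_model *vgic,
			  bool should_fire)
{
	if (should_fire != t->cached_level) {
		t->cached_level = should_fire;
		vgic_set_level(vgic, should_fire);
	}
}

/* New behaviour: always report and let the VGIC filter repeats. */
static void update_uncached(struct timer_model *t, struct vgic_model *vgic,
			    bool should_fire)
{
	t->cached_level = should_fire;
	vgic_set_level(vgic, should_fire);
}
```

With update_cached() a second assertion after guest_deactivate() is lost
because the cached level was never lowered; update_uncached() delivers
it, and back-to-back reports of the same level still inject only once.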

-- 
Julien Thierry

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH v8 8/9] KVM: arm/arm64: Avoid work when userspace iqchips are not used
  2017-12-19 13:55         ` Marc Zyngier
@ 2017-12-19 14:18           ` Christoffer Dall
  -1 siblings, 0 replies; 44+ messages in thread
From: Christoffer Dall @ 2017-12-19 14:18 UTC (permalink / raw)
  To: Marc Zyngier; +Cc: Andre Przywara, kvmarm, linux-arm-kernel, kvm

On Tue, Dec 19, 2017 at 01:55:25PM +0000, Marc Zyngier wrote:
> On 19/12/17 13:34, Christoffer Dall wrote:
> > On Wed, Dec 13, 2017 at 08:05:33PM +0000, Marc Zyngier wrote:
> >> On Wed, 13 Dec 2017 10:46:01 +0000,
> >> Christoffer Dall wrote:
> >>>
> >>> We currently check if the VM has a userspace irqchip on every exit from
> >>> the VCPU, and if so, we do some work to ensure correct timer behavior.
> >>> This is unfortunate, as we could avoid doing any work entirely, if we
> >>> didn't have to support irqchip in userspace.
> >>>
> >>> Realizing the userspace irqchip on ARM is mostly a developer or hobby
> >>> feature, and is unlikely to be used in servers or other scenarios where
> >>> performance is a priority, we can use a refcounted static key to only
> >>> check the irqchip configuration when we have at least one VM that uses
> >>> an irqchip in userspace.
> >>>
> >>> Reviewed-by: Eric Auger <eric.auger@redhat.com>
> >>> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
> >>
> >> On its own, this doesn't seem to be that useful. As far as I can see,
> >> it saves us a load from the kvm structure before giving up.
> > 
> > A load and a conditional.  But what I really wanted to also avoid was
> > the function call from the main run loop, which I neglected as well.  I
> > think I can achieve that with a static inline wrapper in the arch timer
> > header file which first evaluates the static key and then calls into the
> > arch timer code.
> > 
> > 
> >> I think it
> >> is more the cumulative effect of this load that could have an impact,
> >> but you're only dealing with it at a single location.
> >>
> >> How about making this a first class helper and redefine
> >> irqchip_in_kernel as such:
> >>
> >> static inline bool irqchip_in_kernel(struct kvm *kvm)
> >> {
> >> 	if (static_branch_unlikely(&userspace_irqchip_in_use) &&
> >> 	    unlikely(!irqchip_in_kernel(kvm)))
> >> 		return true;
> >>
> >> 	return false;
> >> }
> >>
> >> and move that static key to a more central location?
> >>
> > 
> > That's a neat idea.  The only problem is that creating a new VM would
> > then flip the static key, and then we'd have to flip it back when a vgic
> > is created on that VM, and I don't particularly like the idea of doing
> > this too often.
> 
> Fair enough.
> 
> > 
> > What I'd suggest then is to have two versions of the function:
> > irqchip_in_kernel() which is what it is today, and then
> > __irqchip_in_kernel() which can only be called from within the critical
> > path of the run loop, so that we can increment the static key on
> > kvm_vcpu_first_run_init() when we don't have a VGIC.
> > 
> > How does that sound?
> 
> OK, you only patch once per non-VGIC VM instead of twice per VGIC VM.
> But you now create a distinction between what can be used at runtime and
> what can be used at config time. The distinction is a bit annoying.
> 
> Also, does this actually show up on the radar?
> 

Honestly, I don't know for this particular version of the patch.

But when I did the VHE optimization work, which was before the userspace
irqchip support went in, getting rid of calling kvm_timer_sync_hwstate()
and the load+conditional in there (also prior to the level mapped
patches), was measurable, between 50 and 100 cycles.

Of course, that turned out to be buggy when rebooting VMs, so I never
actually included that in my measurements, but it left me wanting to get
rid of this.

It's a bit of a delicate balance.  On the one hand, it's silly to try to
over-optimize, but on the other hand it's exactly the cumulative effect
of optimizing every bit that managed to get us good results on VHE.

How about this:  I write up the patch in the complicated version as part
of the next version, and if you think it's too difficult to maintain, we
can just drop it and apply the series without it?

Thanks,
-Christoffer

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH v8 8/9] KVM: arm/arm64: Avoid work when userspace iqchips are not used
  2017-12-19 14:18           ` Christoffer Dall
@ 2017-12-19 14:32             ` Marc Zyngier
  -1 siblings, 0 replies; 44+ messages in thread
From: Marc Zyngier @ 2017-12-19 14:32 UTC (permalink / raw)
  To: Christoffer Dall
  Cc: kvmarm, linux-arm-kernel, kvm, Andre Przywara, Eric Auger

On 19/12/17 14:18, Christoffer Dall wrote:
> On Tue, Dec 19, 2017 at 01:55:25PM +0000, Marc Zyngier wrote:
>> On 19/12/17 13:34, Christoffer Dall wrote:
>>> On Wed, Dec 13, 2017 at 08:05:33PM +0000, Marc Zyngier wrote:
>>>> On Wed, 13 Dec 2017 10:46:01 +0000,
>>>> Christoffer Dall wrote:
>>>>>
>>>>> We currently check if the VM has a userspace irqchip on every exit from
>>>>> the VCPU, and if so, we do some work to ensure correct timer behavior.
>>>>> This is unfortunate, as we could avoid doing any work entirely, if we
>>>>> didn't have to support irqchip in userspace.
>>>>>
>>>>> Realizing the userspace irqchip on ARM is mostly a developer or hobby
>>>>> feature, and is unlikely to be used in servers or other scenarios where
>>>>> performance is a priority, we can use a refcounted static key to only
>>>>> check the irqchip configuration when we have at least one VM that uses
>>>>> an irqchip in userspace.
>>>>>
>>>>> Reviewed-by: Eric Auger <eric.auger@redhat.com>
>>>>> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
>>>>
>>>> On its own, this doesn't seem to be that useful. As far as I can see,
>>>> it saves us a load from the kvm structure before giving up.
>>>
>>> A load and a conditional.  But what I really wanted to also avoid was
>>> the function call from the main run loop, which I neglected as well.  I
>>> think I can achieve that with a static inline wrapper in the arch timer
>>> header file which first evaluates the static key and then calls into the
>>> arch timer code.
>>>
>>>
>>>> I think it
>>>> is more the cumulative effect of this load that could have an impact,
>>>> but you're only dealing with it at a single location.
>>>>
>>>> How about making this a first class helper and redefine
>>>> irqchip_in_kernel as such:
>>>>
>>>> static inline bool irqchip_in_kernel(struct kvm *kvm)
>>>> {
>>>> 	if (static_branch_unlikely(&userspace_irqchip_in_use) &&
>>>> 	    unlikely(!irqchip_in_kernel(kvm)))
>>>> 		return true;
>>>>
>>>> 	return false;
>>>> }
>>>>
>>>> and move that static key to a more central location?
>>>>
>>>
>>> That's a neat idea.  The only problem is that creating a new VM would
>>> then flip the static key, and then we'd have to flip it back when a vgic
>>> is created on that VM, and I don't particularly like the idea of doing
>>> this too often.
>>
>> Fair enough.
>>
>>>
>>> What I'd suggest then is to have two versions of the function:
>>> irqchip_in_kernel() which is what it is today, and then
>>> __irqchip_in_kernel() which can only be called from within the critical
>>> path of the run loop, so that we can increment the static key on
>>> kvm_vcpu_first_run_init() when we don't have a VGIC.
>>>
>>> How does that sound?
>>
>> OK, you only patch once per non-VGIC VM instead of twice per VGIC VM.
>> But you now create a distinction between what can be used at runtime and
>> what can be used at config time. The distinction is a bit annoying.
>>
>> Also, does this actually show up on the radar?
>>
> 
> Honestly, I don't know for this particular version of the patch.
> 
> But when I did the VHE optimization work, which was before the userspace
> irqchip support went in, getting rid of calling kvm_timer_sync_hwstate()
> and the load+conditional in there (also prior to the level mapped
> patches), was measurable, between 50 and 100 cycles.
> 
> Of course, that turned out to be buggy when rebooting VMs, so I never
> actually included that in my measurements, but it left me wanting to get
> rid of this.
> 
> It's a bit of a delicate balance.  On the one hand, it's silly to try to
> over-optimize, but on the other hand it's exactly the cumulative effect
> of optimizing every bit that managed to get us good results on VHE.
> 
> How about this:  I write up the patch in the complicated version as part
> of the next version, and if you think it's too difficult to maintain, we
> can just drop it and apply the series without it?

Sounds like a good plan.

Thanks,

	M.
-- 
Jazz is not dead. It just smells funny...
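
For readers following the static key discussion above, here is a minimal
userspace sketch of the idea, assuming a plain counter in place of a real
refcounted static key (in the kernel the branch would be patched in place).
The names struct vm, vm_first_run() and vm_destroy() are hypothetical and
exist only for illustration; userspace_irqchip_in_use is the key name used
in the thread.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical stand-in for a refcounted static key.  A plain counter
 * models the refcount; no code patching happens here.
 */
static int userspace_irqchip_in_use;

/* Hypothetical, heavily simplified VM descriptor. */
struct vm {
	bool has_vgic;	/* true when the irqchip is emulated in the kernel */
};

/* Take a reference the first time a VM without a VGIC is run. */
static void vm_first_run(struct vm *vm)
{
	if (!vm->has_vgic)
		userspace_irqchip_in_use++;
}

/* Drop the reference when such a VM goes away. */
static void vm_destroy(struct vm *vm)
{
	if (!vm->has_vgic) {
		assert(userspace_irqchip_in_use > 0);
		userspace_irqchip_in_use--;
	}
}

/*
 * Run-loop check: the per-VM field is only read when at least one VM
 * has taken a reference on the key.
 */
static bool userspace_irqchip(const struct vm *vm)
{
	return userspace_irqchip_in_use > 0 && !vm->has_vgic;
}
```

Note that userspace_irqchip() returns false for a VM without a VGIC until
vm_first_run() has been called for it.  That is the runtime versus
config-time distinction raised in the thread: a check of this shape is
only meaningful from the run loop.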


* Re: [PATCH v8 9/9] KVM: arm/arm64: Update timer and forwarded irq documentation
  2017-12-13 20:15     ` Marc Zyngier
@ 2017-12-19 20:29       ` Christoffer Dall
  -1 siblings, 0 replies; 44+ messages in thread
From: Christoffer Dall @ 2017-12-19 20:29 UTC (permalink / raw)
  To: Marc Zyngier; +Cc: kvmarm, linux-arm-kernel, kvm, Andre Przywara, Eric Auger

On Wed, Dec 13, 2017 at 08:15:10PM +0000, Marc Zyngier wrote:
> On Wed, 13 Dec 2017 10:46:02 +0000,
> Christoffer Dall wrote:
> > 
> > Now that we've reworked how mapped level-triggered interrupts are
> > processed for the timer interrupts, we update the documentation
> > correspondingly.
> 
> Seems like the documentation is more out of date than we thought, see
> below.
> 

Indeed.  And I wondered if we should just nuke this file.  The reason I
added it originally was that the concept of "never taking the interrupt"
was confusing to most, and we had to explain it several times over, but
perhaps it's really not needed anymore and we should let people read the
code instead?

> > 
> > Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
> > ---
> >  Documentation/virtual/kvm/arm/vgic-mapped-irqs.txt | 50 ++++++++++------------
> >  1 file changed, 23 insertions(+), 27 deletions(-)
> > 
> > diff --git a/Documentation/virtual/kvm/arm/vgic-mapped-irqs.txt b/Documentation/virtual/kvm/arm/vgic-mapped-irqs.txt
> > index 38bca2835278..f68c7d95a341 100644
> > --- a/Documentation/virtual/kvm/arm/vgic-mapped-irqs.txt
> > +++ b/Documentation/virtual/kvm/arm/vgic-mapped-irqs.txt
> > @@ -7,9 +7,10 @@ allowing software to inject virtual interrupts to a VM, which the guest
> >  OS sees as regular interrupts.  The code is famously known as the VGIC.
> >  
> >  Some of these virtual interrupts, however, correspond to physical
> > -interrupts from real physical devices.  One example could be the
> > -architected timer, which itself supports virtualization, and therefore
> > -lets a guest OS program the hardware device directly to raise an
> > +interrupts from real physical devices.  One example could be the ARM
> > +Generic Timers (also known as the "architected timers"), which are
> > +directly assigned to a VM while it's running, and therefore
> > +makes it possible for guest OSes to program the timers directly to raise an
> >  interrupt at some point in time.  When such an interrupt is raised, the
> >  host OS initially handles the interrupt and must somehow signal this
> >  event as a virtual interrupt to the guest.  Another example could be a
> > @@ -37,7 +38,7 @@ inactive.
> >  
> >  The LRs include an extra bit, called the HW bit.  When this bit is set,
> >  KVM must also program an additional field in the LR, the physical IRQ
> > -number, to link the virtual with the physical IRQ.
> > +number, to link the virtual and physical IRQs together.
> >  
> >  When the HW bit is set, KVM must EITHER set the Pending OR the Active
> >  bit, never both at the same time.
> > @@ -59,21 +60,21 @@ The state of forwarded physical interrupts is managed in the following way:
> >    - LR.Pending will stay set as long as the guest has not acked the interrupt.
> >    - LR.Pending transitions to LR.Active on the guest read of the IAR, as
> >      expected.
> > -  - On guest EOI, the *physical distributor* active bit gets cleared,
> > +  - On guest deactivate, the *physical distributor* active bit gets cleared,
> >      but the LR.Active is left untouched (set).
> 
> Is this true? I seem to remember that we established it wasn't (back
> when we redesigned the vgic). Certainly, the current code relies on
> the Active bit being cleared in the LR as well as in the physical
> distributor.
> 

No, you're right, it's crap.

> >    - KVM clears the LR on VM exits when the physical distributor
> >      active state has been cleared.
> 
> And this isn't either, if my above assertion stands.
> 

Right again.

> >  
> >  (*): The host handling is slightly more complicated.  For some forwarded
> > -interrupts (shared), KVM directly sets the active state on the physical
> > -distributor before entering the guest, because the interrupt is never actually
> > -handled on the host (see details on the timer as an example below).  For other
> > -forwarded interrupts (non-shared) the host does not deactivate the interrupt
> > -when the host ISR completes, but leaves the interrupt active until the guest
> > -deactivates it.  Leaving the interrupt active is allowed, because Linux
> > -configures the physical GIC with EOIMode=1, which causes EOI operations to
> > -perform a priority drop allowing the GIC to receive other interrupts of the
> > -default priority.
> > +interrupts (shared), in some cases, KVM directly sets the active state
> > +on the physical distributor before entering the guest, because the
> > +interrupt is never actually handled on the host (see details on the
> > +timer as an example below).  In other cases, the host does not
> 
> This isn't true either. We now handle the timer interrupt on the host.
> 

And again.  I've rewritten this.

Thanks,
-Christoffer
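
As an illustration of the one rule in the quoted text that is not disputed
above (with the HW bit set, Pending and Active must never both be set), here
is a minimal sketch in C.  struct lr_state and both helpers are hypothetical
and do not reflect the real list register layout or any KVM function.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, decoded view of a list register (not the HW layout). */
struct lr_state {
	bool hw;		/* HW bit: virtual IRQ is linked to a physical IRQ */
	bool pending;
	bool active;
	uint32_t virq;		/* virtual interrupt number */
	uint32_t phys_irq;	/* physical IRQ number, meaningful when hw is set */
};

/* With the HW bit set, Pending and Active are mutually exclusive. */
static bool lr_state_valid(const struct lr_state *lr)
{
	if (!lr->hw)
		return true;
	return !(lr->pending && lr->active);
}

/* Mark a forwarded interrupt pending without violating the rule. */
static void lr_make_pending(struct lr_state *lr)
{
	lr->pending = true;
	if (lr->hw)
		lr->active = false;
	assert(lr_state_valid(lr));
}
```

The sketch deliberately says nothing about what happens to LR.Active when
the guest deactivates the interrupt, since that part of the old text is
exactly what the replies above call out as wrong.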


* Re: [PATCH v8 9/9] KVM: arm/arm64: Update timer and forwarded irq documentation
  2017-12-19 20:29       ` Christoffer Dall
@ 2017-12-19 20:35         ` Marc Zyngier
  -1 siblings, 0 replies; 44+ messages in thread
From: Marc Zyngier @ 2017-12-19 20:35 UTC (permalink / raw)
  To: Christoffer Dall
  Cc: kvmarm, linux-arm-kernel, kvm, Andre Przywara, Eric Auger

On Tue, 19 Dec 2017 21:29:54 +0100
Christoffer Dall <christoffer.dall@linaro.org> wrote:

> On Wed, Dec 13, 2017 at 08:15:10PM +0000, Marc Zyngier wrote:
> > On Wed, 13 Dec 2017 10:46:02 +0000,
> > Christoffer Dall wrote:  
> > > 
> > > Now that we've reworked how mapped level-triggered interrupts are
> > > processed for the timer interrupts, we update the documentation
> > > correspondingly.  
> > 
> > Seems like the documentation is more out of date than we thought, see
> > below.
> >   
> 
> Indeed.  And I wondered if we should just nuke this file.  The reason I
> added it originally was that the concept of "never taking the interrupt"
> was confusing to most, and we had to explain it several times over, but
> perhaps it's really not needed anymore and we should let people read the
> code instead?

Suits me (maintaining documentation is hard). I suggest we delete it
and write the perfect explanation once KVM is completely done...

	M.
-- 
Without deviation from the norm, progress is not possible.


* Re: [PATCH v8 3/9] KVM: arm/arm64: Don't cache the timer IRQ level
  2017-12-19 14:17     ` Julien Thierry
@ 2017-12-19 20:35       ` Christoffer Dall
  -1 siblings, 0 replies; 44+ messages in thread
From: Christoffer Dall @ 2017-12-19 20:35 UTC (permalink / raw)
  To: Julien Thierry
  Cc: kvmarm, linux-arm-kernel, Marc Zyngier, Andre Przywara, kvm, Eric Auger

On Tue, Dec 19, 2017 at 02:17:38PM +0000, Julien Thierry wrote:
> Hi Christoffer,
> 
> A few nits in the commit message.
> 
> On 13/12/17 10:45, Christoffer Dall wrote:
> >The timer was modeled after a strict idea of modelling an interrupt line
> 
> nit: modelling (also, modeled after a strict idea of modelling?)
> 

Yes, I model the modelling of models of modeled timers.  Is that not
clear?  ;)

> >level in software, meaning that only transitions in the level needed to
> 
> s/needed/need/ ?
> 

ack

> >be reported to the VGIC.  This works well for the timer, because the
> >arch timer code is in complete control of the device and can track the
> >transitions of the line.
> >
> >However, as we are about to support using the HW bit in the VGIC not
> >just for the timer, but also for VFIO which cannot track transitions of
> >the interrupt line, we have to decide on an interface for level
> >triggered mapped interrupts to the GIC, which both the timer and VFIO
> 
> "level triggered interrupts mapped to the GIC" ?
> 

an interface to the GIC for level ...

My writing here is really crap.  Thanks for pointing that out.

> >can use.
> >
> >VFIO only sees an asserting transition of the physical interrupt line,
> >and tells the VGIC when that happens.  That means that part of the
> >interrupt flow is offloaded to the hardware.
> >
> >To use the same interface for VFIO devices and the timer, we therefore
> >have to change the timer (we cannot change VFIO because it doesn't know
> >the details of the device it is assigning to a VM).
> >
> >Luckily, changing the timer is simple, we just need to stop 'caching'
> >the line level, but instead let the VGIC know the state of the timer
> >every time there is a potential change in the line level, and when the
> >line level should be asserted from the timer ISR.  The VGIC can ignore
> >extra notifications using its validate mechanism.
> >
> >Reviewed-by: Andre Przywara <andre.przywara@arm.com>
> >Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
> 
> Reviewed-by: Julien Thierry <julien.thierry@arm.com>
> 

Thanks,
-Christoffer
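
The "stop caching the line level" idea from the commit message quoted above
can be sketched roughly as follows.  This is a simplified userspace model,
not the actual arch timer or VGIC code: struct vgic_irq, vgic_inject_level()
and the timer helpers are hypothetical names, and the output condition is
reduced to "enabled, not masked, and the compare value has been reached".

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical VGIC-side view of one level-triggered interrupt. */
struct vgic_irq {
	bool line_level;	/* last level accepted by the VGIC */
	unsigned int updates;	/* number of real level changes seen */
};

/*
 * The VGIC side filters out notifications that do not change the level,
 * so callers may report the level as often as they like.
 * Returns true when the level actually changed.
 */
static bool vgic_inject_level(struct vgic_irq *irq, bool level)
{
	assert(irq);
	if (irq->line_level == level)
		return false;
	irq->line_level = level;
	irq->updates++;
	return true;
}

/* Simplified timer output condition. */
static bool timer_should_fire(uint64_t now, uint64_t cval,
			      bool enabled, bool masked)
{
	return enabled && !masked && now >= cval;
}

/*
 * The timer keeps no cached copy of the line level: it recomputes the
 * level and hands it to the VGIC on every potential change.
 */
static bool timer_update_irq(struct vgic_irq *irq, uint64_t now,
			     uint64_t cval, bool enabled, bool masked)
{
	return vgic_inject_level(irq,
				 timer_should_fire(now, cval, enabled, masked));
}
```

The design point is that only the VGIC side keeps state, so a source that
sees just asserting transitions (as described for VFIO above) and a source
that reports the level repeatedly (the timer) can share one interface.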


end of thread, other threads:[~2017-12-19 20:35 UTC | newest]

Thread overview: 44+ messages
2017-12-13 10:45 [PATCH v8 0/9] Handle forwarded level-triggered interrupts Christoffer Dall
2017-12-13 10:45 ` Christoffer Dall
2017-12-13 10:45 ` [PATCH v8 1/9] KVM: arm/arm64: Remove redundant preemptible checks Christoffer Dall
2017-12-13 10:45   ` Christoffer Dall
2017-12-13 10:45 ` [PATCH v8 2/9] KVM: arm/arm64: Factor out functionality to get vgic mmio requester_vcpu Christoffer Dall
2017-12-13 10:45   ` Christoffer Dall
2017-12-13 10:45 ` [PATCH v8 3/9] KVM: arm/arm64: Don't cache the timer IRQ level Christoffer Dall
2017-12-13 10:45   ` Christoffer Dall
2017-12-13 19:38   ` Marc Zyngier
2017-12-13 19:38     ` Marc Zyngier
2017-12-19 14:17   ` Julien Thierry
2017-12-19 14:17     ` Julien Thierry
2017-12-19 20:35     ` Christoffer Dall
2017-12-19 20:35       ` Christoffer Dall
2017-12-13 10:45 ` [PATCH v8 4/9] KVM: arm/arm64: vgic: Support level-triggered mapped interrupts Christoffer Dall
2017-12-13 10:45   ` Christoffer Dall
2017-12-13 10:45 ` [PATCH v8 5/9] KVM: arm/arm64: Support a vgic interrupt line level sample function Christoffer Dall
2017-12-13 10:45   ` Christoffer Dall
2017-12-13 10:45 ` [PATCH v8 6/9] KVM: arm/arm64: Support VGIC dist pend/active changes for mapped IRQs Christoffer Dall
2017-12-13 10:45   ` Christoffer Dall
2017-12-13 10:46 ` [PATCH v8 7/9] KVM: arm/arm64: Provide a get_input_level for the arch timer Christoffer Dall
2017-12-13 10:46   ` Christoffer Dall
2017-12-13 19:45   ` Marc Zyngier
2017-12-13 19:45     ` Marc Zyngier
2017-12-13 10:46 ` [PATCH v8 8/9] KVM: arm/arm64: Avoid work when userspace iqchips are not used Christoffer Dall
2017-12-13 10:46   ` Christoffer Dall
2017-12-13 20:05   ` Marc Zyngier
2017-12-13 20:05     ` Marc Zyngier
2017-12-19 13:34     ` Christoffer Dall
2017-12-19 13:34       ` Christoffer Dall
2017-12-19 13:55       ` Marc Zyngier
2017-12-19 13:55         ` Marc Zyngier
2017-12-19 14:18         ` Christoffer Dall
2017-12-19 14:18           ` Christoffer Dall
2017-12-19 14:32           ` Marc Zyngier
2017-12-19 14:32             ` Marc Zyngier
2017-12-13 10:46 ` [PATCH v8 9/9] KVM: arm/arm64: Update timer and forwarded irq documentation Christoffer Dall
2017-12-13 10:46   ` Christoffer Dall
2017-12-13 20:15   ` Marc Zyngier
2017-12-13 20:15     ` Marc Zyngier
2017-12-19 20:29     ` Christoffer Dall
2017-12-19 20:29       ` Christoffer Dall
2017-12-19 20:35       ` Marc Zyngier
2017-12-19 20:35         ` Marc Zyngier
