All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v3 00/11] arm/arm64: KVM: Active interrupt state switching for shared devices
@ 2015-07-24 15:54 ` Marc Zyngier
  0 siblings, 0 replies; 74+ messages in thread
From: Marc Zyngier @ 2015-07-24 15:54 UTC (permalink / raw)
  To: kvm, kvmarm, linux-arm-kernel
  Cc: Christoffer Dall, Eric Auger, Alex Bennée, Andre Przywara

>From day 1, our timer code has been using a terrible hack: whenever
the guest is scheduled with a timer interrupt pending (i.e. the HW
timer has expired), we restore the timer state with the MASK bit set,
in order to avoid the physical interrupt to fire again. And again. And
again...

This is absolutely silly, for at least two reasons:

- This relies on the device (the timer) having a mask bit that we can
  play with. Not all devices are built like this.

- This expects some behaviour of the guest that only works because the
  both the kernel timer code and the KVM counterpart have been written
  by the same idiot (the idiot being me).

The One True Way is to set the GIC active bit when injecting the
interrupt, and to context-switch across the world switch. This is what
this series implements.

We introduce a relatively simple infrastructure enabling the mapping
of a virtual interrupt with its physical counterpart:

- Whenever an virtual interrupt is injected, we look it up in an
  rbtree. If we have a match, the interrupt is injected with the HW
  bit set in the LR, together with the physical interrupt.

- Across the world switch, we save/restore the active state for these
  interrupts using the irqchip_state API.

- On guest EOI, the HW interrupt is automagically deactivated by the
  GIC, allowing the interrupt to be resampled.

The timer code is slightly modified to set the active state at the
same time as the injection.

The last patch also allows non-shared devices to have their interrupt
deactivated the same way (in this case we do not context-switch the
active state). This is the first step in the long overdue direction of
the mythical IRQ forwarding thing... This last patch is not a
candidate for merging, but is included for completeness, and to allow
Eric to base his own code on this stuff.

This series is based on v4.2-rc2 + the current queue for 4.3, and has
been tested on Juno, Cubietruck (GICv2) and the FVP Base model (GICv3
host, both GICv2 and GICv3 guests). I'd appreciate any form of
testing, specially in the context of guest migration (there is
obviously some interesting stuff there...).

The code is otherwise available at
git://git.kernel.org/pub/scm/linux/kernel/git/maz/arm-platforms.git kvm-arm64/active-timer

* From v2:
  - Many comments added
  - Fixed potential race between unmap and destroy
  - Removed the requirement for a GFP_ATOMIC allocations in map
  - New function names for the exported functions (map, unmap, get, set, init)
  - New patch preventing the injection of a HW mapped interrupt, and
    offer a new injection API

* From v1:
  - Rebased on top of current mainline
  - Fixed non-shared handling of forwarded interrupts
  - Fixed memory leaks on VM exit
  - Used RCU lists instead of an RB tree

Marc Zyngier (10):
  arm/arm64: KVM: Fix ordering of timer/GIC on guest entry
  arm/arm64: KVM: Move vgic handling to a non-preemptible section
  KVM: arm/arm64: vgic: Convert struct vgic_lr to use bitfields
  KVM: arm/arm64: vgic: Allow HW irq to be encoded in LR
  KVM: arm/arm64: vgic: Relax vgic_can_sample_irq for edge IRQs
  KVM: arm/arm64: vgic: Allow dynamic mapping of physical/virtual
    interrupts
  KVM: arm/arm64: vgic: Allow HW interrupts to be queued to a guest
  KVM: arm/arm64: vgic: Add vgic_{get,set}_phys_irq_active
  KVM: arm/arm64: timer: Allow the timer to control the active state
  KVM: arm/arm64: vgic: Allow non-shared device HW interrupts

 arch/arm/kvm/arm.c                 |  21 ++-
 include/kvm/arm_arch_timer.h       |   3 +
 include/kvm/arm_vgic.h             |  38 +++++-
 include/linux/irqchip/arm-gic-v3.h |   3 +
 include/linux/irqchip/arm-gic.h    |   3 +-
 virt/kvm/arm/arch_timer.c          |  13 +-
 virt/kvm/arm/vgic-v2.c             |  16 ++-
 virt/kvm/arm/vgic-v3.c             |  21 ++-
 virt/kvm/arm/vgic.c                | 264 ++++++++++++++++++++++++++++++++++++-
 9 files changed, 363 insertions(+), 19 deletions(-)

-- 
2.1.4

*** BLURB HERE ***

Marc Zyngier (11):
  arm/arm64: KVM: Fix ordering of timer/GIC on guest entry
  arm/arm64: KVM: Move vgic handling to a non-preemptible section
  KVM: arm/arm64: vgic: Convert struct vgic_lr to use bitfields
  KVM: arm/arm64: vgic: Allow HW irq to be encoded in LR
  KVM: arm/arm64: vgic: Relax vgic_can_sample_irq for edge IRQs
  KVM: arm/arm64: vgic: Allow dynamic mapping of physical/virtual
    interrupts
  KVM: arm/arm64: vgic: Allow HW interrupts to be queued to a guest
  KVM: arm/arm64: vgic: Add vgic_{get,set}_phys_irq_active
  KVM: arm/arm64: vgic: Prevent userspace injection of a mapped
    interrupt
  KVM: arm/arm64: timer: Allow the timer to control the active state
  KVM: arm/arm64: vgic: Allow HW interrupts for non-shared devices

 arch/arm/kvm/arm.c                 |  21 +-
 arch/arm/kvm/reset.c               |   4 +-
 arch/arm64/kvm/reset.c             |   4 +-
 include/kvm/arm_arch_timer.h       |   7 +-
 include/kvm/arm_vgic.h             |  42 +++-
 include/linux/irqchip/arm-gic-v3.h |   3 +
 include/linux/irqchip/arm-gic.h    |   3 +-
 virt/kvm/arm/arch_timer.c          |  30 ++-
 virt/kvm/arm/vgic-v2.c             |  16 +-
 virt/kvm/arm/vgic-v3.c             |  21 +-
 virt/kvm/arm/vgic.c                | 446 ++++++++++++++++++++++++++++++++++---
 11 files changed, 534 insertions(+), 63 deletions(-)

-- 
2.1.4


^ permalink raw reply	[flat|nested] 74+ messages in thread

* [PATCH v3 00/11] arm/arm64: KVM: Active interrupt state switching for shared devices
@ 2015-07-24 15:54 ` Marc Zyngier
  0 siblings, 0 replies; 74+ messages in thread
From: Marc Zyngier @ 2015-07-24 15:54 UTC (permalink / raw)
  To: linux-arm-kernel

>From day 1, our timer code has been using a terrible hack: whenever
the guest is scheduled with a timer interrupt pending (i.e. the HW
timer has expired), we restore the timer state with the MASK bit set,
in order to avoid the physical interrupt to fire again. And again. And
again...

This is absolutely silly, for at least two reasons:

- This relies on the device (the timer) having a mask bit that we can
  play with. Not all devices are built like this.

- This expects some behaviour of the guest that only works because the
  both the kernel timer code and the KVM counterpart have been written
  by the same idiot (the idiot being me).

The One True Way is to set the GIC active bit when injecting the
interrupt, and to context-switch across the world switch. This is what
this series implements.

We introduce a relatively simple infrastructure enabling the mapping
of a virtual interrupt with its physical counterpart:

- Whenever an virtual interrupt is injected, we look it up in an
  rbtree. If we have a match, the interrupt is injected with the HW
  bit set in the LR, together with the physical interrupt.

- Across the world switch, we save/restore the active state for these
  interrupts using the irqchip_state API.

- On guest EOI, the HW interrupt is automagically deactivated by the
  GIC, allowing the interrupt to be resampled.

The timer code is slightly modified to set the active state at the
same time as the injection.

The last patch also allows non-shared devices to have their interrupt
deactivated the same way (in this case we do not context-switch the
active state). This is the first step in the long overdue direction of
the mythical IRQ forwarding thing... This last patch is not a
candidate for merging, but is included for completeness, and to allow
Eric to base his own code on this stuff.

This series is based on v4.2-rc2 + the current queue for 4.3, and has
been tested on Juno, Cubietruck (GICv2) and the FVP Base model (GICv3
host, both GICv2 and GICv3 guests). I'd appreciate any form of
testing, specially in the context of guest migration (there is
obviously some interesting stuff there...).

The code is otherwise available at
git://git.kernel.org/pub/scm/linux/kernel/git/maz/arm-platforms.git kvm-arm64/active-timer

* From v2:
  - Many comments added
  - Fixed potential race between unmap and destroy
  - Removed the requirement for a GFP_ATOMIC allocations in map
  - New function names for the exported functions (map, unmap, get, set, init)
  - New patch preventing the injection of a HW mapped interrupt, and
    offer a new injection API

* From v1:
  - Rebased on top of current mainline
  - Fixed non-shared handling of forwarded interrupts
  - Fixed memory leaks on VM exit
  - Used RCU lists instead of an RB tree

Marc Zyngier (10):
  arm/arm64: KVM: Fix ordering of timer/GIC on guest entry
  arm/arm64: KVM: Move vgic handling to a non-preemptible section
  KVM: arm/arm64: vgic: Convert struct vgic_lr to use bitfields
  KVM: arm/arm64: vgic: Allow HW irq to be encoded in LR
  KVM: arm/arm64: vgic: Relax vgic_can_sample_irq for edge IRQs
  KVM: arm/arm64: vgic: Allow dynamic mapping of physical/virtual
    interrupts
  KVM: arm/arm64: vgic: Allow HW interrupts to be queued to a guest
  KVM: arm/arm64: vgic: Add vgic_{get,set}_phys_irq_active
  KVM: arm/arm64: timer: Allow the timer to control the active state
  KVM: arm/arm64: vgic: Allow non-shared device HW interrupts

 arch/arm/kvm/arm.c                 |  21 ++-
 include/kvm/arm_arch_timer.h       |   3 +
 include/kvm/arm_vgic.h             |  38 +++++-
 include/linux/irqchip/arm-gic-v3.h |   3 +
 include/linux/irqchip/arm-gic.h    |   3 +-
 virt/kvm/arm/arch_timer.c          |  13 +-
 virt/kvm/arm/vgic-v2.c             |  16 ++-
 virt/kvm/arm/vgic-v3.c             |  21 ++-
 virt/kvm/arm/vgic.c                | 264 ++++++++++++++++++++++++++++++++++++-
 9 files changed, 363 insertions(+), 19 deletions(-)

-- 
2.1.4

*** BLURB HERE ***

Marc Zyngier (11):
  arm/arm64: KVM: Fix ordering of timer/GIC on guest entry
  arm/arm64: KVM: Move vgic handling to a non-preemptible section
  KVM: arm/arm64: vgic: Convert struct vgic_lr to use bitfields
  KVM: arm/arm64: vgic: Allow HW irq to be encoded in LR
  KVM: arm/arm64: vgic: Relax vgic_can_sample_irq for edge IRQs
  KVM: arm/arm64: vgic: Allow dynamic mapping of physical/virtual
    interrupts
  KVM: arm/arm64: vgic: Allow HW interrupts to be queued to a guest
  KVM: arm/arm64: vgic: Add vgic_{get,set}_phys_irq_active
  KVM: arm/arm64: vgic: Prevent userspace injection of a mapped
    interrupt
  KVM: arm/arm64: timer: Allow the timer to control the active state
  KVM: arm/arm64: vgic: Allow HW interrupts for non-shared devices

 arch/arm/kvm/arm.c                 |  21 +-
 arch/arm/kvm/reset.c               |   4 +-
 arch/arm64/kvm/reset.c             |   4 +-
 include/kvm/arm_arch_timer.h       |   7 +-
 include/kvm/arm_vgic.h             |  42 +++-
 include/linux/irqchip/arm-gic-v3.h |   3 +
 include/linux/irqchip/arm-gic.h    |   3 +-
 virt/kvm/arm/arch_timer.c          |  30 ++-
 virt/kvm/arm/vgic-v2.c             |  16 +-
 virt/kvm/arm/vgic-v3.c             |  21 +-
 virt/kvm/arm/vgic.c                | 446 ++++++++++++++++++++++++++++++++++---
 11 files changed, 534 insertions(+), 63 deletions(-)

-- 
2.1.4

^ permalink raw reply	[flat|nested] 74+ messages in thread

* [PATCH v3 01/11] arm/arm64: KVM: Fix ordering of timer/GIC on guest entry
  2015-07-24 15:54 ` Marc Zyngier
@ 2015-07-24 15:54   ` Marc Zyngier
  -1 siblings, 0 replies; 74+ messages in thread
From: Marc Zyngier @ 2015-07-24 15:54 UTC (permalink / raw)
  To: kvm, kvmarm, linux-arm-kernel; +Cc: Andre Przywara

As we now inject the timer interrupt when we're about to enter
the guest, it makes a lot more sense to make sure this happens
before the vgic code queues the pending interrupts.

Otherwise, we get the interrupt on the following exit, which is
not great for latency (and leads to all kind of bizarre issues
when using with active interrupts at the HW level).

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
---
 arch/arm/kvm/arm.c | 7 +++----
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/arch/arm/kvm/arm.c b/arch/arm/kvm/arm.c
index 9ce5cf0..1141d21 100644
--- a/arch/arm/kvm/arm.c
+++ b/arch/arm/kvm/arm.c
@@ -523,8 +523,8 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *run)
 		if (vcpu->arch.pause)
 			vcpu_pause(vcpu);
 
-		kvm_vgic_flush_hwstate(vcpu);
 		kvm_timer_flush_hwstate(vcpu);
+		kvm_vgic_flush_hwstate(vcpu);
 
 		preempt_disable();
 		local_irq_disable();
@@ -540,8 +540,8 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *run)
 		if (ret <= 0 || need_new_vmid_gen(vcpu->kvm)) {
 			local_irq_enable();
 			preempt_enable();
-			kvm_timer_sync_hwstate(vcpu);
 			kvm_vgic_sync_hwstate(vcpu);
+			kvm_timer_sync_hwstate(vcpu);
 			continue;
 		}
 
@@ -587,9 +587,8 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *run)
 		trace_kvm_exit(kvm_vcpu_trap_get_class(vcpu), *vcpu_pc(vcpu));
 		preempt_enable();
 
-
-		kvm_timer_sync_hwstate(vcpu);
 		kvm_vgic_sync_hwstate(vcpu);
+		kvm_timer_sync_hwstate(vcpu);
 
 		ret = handle_exit(vcpu, run, ret);
 	}
-- 
2.1.4

_______________________________________________
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

^ permalink raw reply related	[flat|nested] 74+ messages in thread

* [PATCH v3 01/11] arm/arm64: KVM: Fix ordering of timer/GIC on guest entry
@ 2015-07-24 15:54   ` Marc Zyngier
  0 siblings, 0 replies; 74+ messages in thread
From: Marc Zyngier @ 2015-07-24 15:54 UTC (permalink / raw)
  To: linux-arm-kernel

As we now inject the timer interrupt when we're about to enter
the guest, it makes a lot more sense to make sure this happens
before the vgic code queues the pending interrupts.

Otherwise, we get the interrupt on the following exit, which is
not great for latency (and leads to all kind of bizarre issues
when using with active interrupts at the HW level).

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Reviewed-by: Alex Benn?e <alex.bennee@linaro.org>
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
---
 arch/arm/kvm/arm.c | 7 +++----
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/arch/arm/kvm/arm.c b/arch/arm/kvm/arm.c
index 9ce5cf0..1141d21 100644
--- a/arch/arm/kvm/arm.c
+++ b/arch/arm/kvm/arm.c
@@ -523,8 +523,8 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *run)
 		if (vcpu->arch.pause)
 			vcpu_pause(vcpu);
 
-		kvm_vgic_flush_hwstate(vcpu);
 		kvm_timer_flush_hwstate(vcpu);
+		kvm_vgic_flush_hwstate(vcpu);
 
 		preempt_disable();
 		local_irq_disable();
@@ -540,8 +540,8 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *run)
 		if (ret <= 0 || need_new_vmid_gen(vcpu->kvm)) {
 			local_irq_enable();
 			preempt_enable();
-			kvm_timer_sync_hwstate(vcpu);
 			kvm_vgic_sync_hwstate(vcpu);
+			kvm_timer_sync_hwstate(vcpu);
 			continue;
 		}
 
@@ -587,9 +587,8 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *run)
 		trace_kvm_exit(kvm_vcpu_trap_get_class(vcpu), *vcpu_pc(vcpu));
 		preempt_enable();
 
-
-		kvm_timer_sync_hwstate(vcpu);
 		kvm_vgic_sync_hwstate(vcpu);
+		kvm_timer_sync_hwstate(vcpu);
 
 		ret = handle_exit(vcpu, run, ret);
 	}
-- 
2.1.4

^ permalink raw reply related	[flat|nested] 74+ messages in thread

* [PATCH v3 02/11] arm/arm64: KVM: Move vgic handling to a non-preemptible section
  2015-07-24 15:54 ` Marc Zyngier
@ 2015-07-24 15:55   ` Marc Zyngier
  -1 siblings, 0 replies; 74+ messages in thread
From: Marc Zyngier @ 2015-07-24 15:55 UTC (permalink / raw)
  To: kvm, kvmarm, linux-arm-kernel; +Cc: Andre Przywara

As we're about to introduce some serious GIC-poking to the vgic code,
it is important to make sure that we're going to poke the part of
the GIC that belongs to the CPU we're about to run on (otherwise,
we'd end up with some unexpected interrupts firing)...

Introducing a non-preemptible section in kvm_arch_vcpu_ioctl_run
prevents the problem from occuring.

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
---
 arch/arm/kvm/arm.c | 18 +++++++++++++++---
 1 file changed, 15 insertions(+), 3 deletions(-)

diff --git a/arch/arm/kvm/arm.c b/arch/arm/kvm/arm.c
index 1141d21..f1bf418 100644
--- a/arch/arm/kvm/arm.c
+++ b/arch/arm/kvm/arm.c
@@ -523,10 +523,20 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *run)
 		if (vcpu->arch.pause)
 			vcpu_pause(vcpu);
 
+		/*
+		 * Disarming the background timer must be done in a
+		 * preemptible context, as this call may sleep.
+		 */
 		kvm_timer_flush_hwstate(vcpu);
-		kvm_vgic_flush_hwstate(vcpu);
 
+		/*
+		 * Preparing the interrupts to be injected also
+		 * involves poking the GIC, which must be done in a
+		 * non-preemptible context.
+		 */
 		preempt_disable();
+		kvm_vgic_flush_hwstate(vcpu);
+
 		local_irq_disable();
 
 		/*
@@ -539,8 +549,8 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *run)
 
 		if (ret <= 0 || need_new_vmid_gen(vcpu->kvm)) {
 			local_irq_enable();
-			preempt_enable();
 			kvm_vgic_sync_hwstate(vcpu);
+			preempt_enable();
 			kvm_timer_sync_hwstate(vcpu);
 			continue;
 		}
@@ -585,9 +595,11 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *run)
 		 */
 		kvm_guest_exit();
 		trace_kvm_exit(kvm_vcpu_trap_get_class(vcpu), *vcpu_pc(vcpu));
-		preempt_enable();
 
 		kvm_vgic_sync_hwstate(vcpu);
+
+		preempt_enable();
+
 		kvm_timer_sync_hwstate(vcpu);
 
 		ret = handle_exit(vcpu, run, ret);
-- 
2.1.4

_______________________________________________
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

^ permalink raw reply related	[flat|nested] 74+ messages in thread

* [PATCH v3 02/11] arm/arm64: KVM: Move vgic handling to a non-preemptible section
@ 2015-07-24 15:55   ` Marc Zyngier
  0 siblings, 0 replies; 74+ messages in thread
From: Marc Zyngier @ 2015-07-24 15:55 UTC (permalink / raw)
  To: linux-arm-kernel

As we're about to introduce some serious GIC-poking to the vgic code,
it is important to make sure that we're going to poke the part of
the GIC that belongs to the CPU we're about to run on (otherwise,
we'd end up with some unexpected interrupts firing)...

Introducing a non-preemptible section in kvm_arch_vcpu_ioctl_run
prevents the problem from occuring.

Reviewed-by: Alex Benn?e <alex.bennee@linaro.org>
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
---
 arch/arm/kvm/arm.c | 18 +++++++++++++++---
 1 file changed, 15 insertions(+), 3 deletions(-)

diff --git a/arch/arm/kvm/arm.c b/arch/arm/kvm/arm.c
index 1141d21..f1bf418 100644
--- a/arch/arm/kvm/arm.c
+++ b/arch/arm/kvm/arm.c
@@ -523,10 +523,20 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *run)
 		if (vcpu->arch.pause)
 			vcpu_pause(vcpu);
 
+		/*
+		 * Disarming the background timer must be done in a
+		 * preemptible context, as this call may sleep.
+		 */
 		kvm_timer_flush_hwstate(vcpu);
-		kvm_vgic_flush_hwstate(vcpu);
 
+		/*
+		 * Preparing the interrupts to be injected also
+		 * involves poking the GIC, which must be done in a
+		 * non-preemptible context.
+		 */
 		preempt_disable();
+		kvm_vgic_flush_hwstate(vcpu);
+
 		local_irq_disable();
 
 		/*
@@ -539,8 +549,8 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *run)
 
 		if (ret <= 0 || need_new_vmid_gen(vcpu->kvm)) {
 			local_irq_enable();
-			preempt_enable();
 			kvm_vgic_sync_hwstate(vcpu);
+			preempt_enable();
 			kvm_timer_sync_hwstate(vcpu);
 			continue;
 		}
@@ -585,9 +595,11 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *run)
 		 */
 		kvm_guest_exit();
 		trace_kvm_exit(kvm_vcpu_trap_get_class(vcpu), *vcpu_pc(vcpu));
-		preempt_enable();
 
 		kvm_vgic_sync_hwstate(vcpu);
+
+		preempt_enable();
+
 		kvm_timer_sync_hwstate(vcpu);
 
 		ret = handle_exit(vcpu, run, ret);
-- 
2.1.4

^ permalink raw reply related	[flat|nested] 74+ messages in thread

* [PATCH v3 03/11] KVM: arm/arm64: vgic: Convert struct vgic_lr to use bitfields
  2015-07-24 15:54 ` Marc Zyngier
@ 2015-07-24 15:55   ` Marc Zyngier
  -1 siblings, 0 replies; 74+ messages in thread
From: Marc Zyngier @ 2015-07-24 15:55 UTC (permalink / raw)
  To: kvm, kvmarm, linux-arm-kernel; +Cc: Andre Przywara

As we're about to cram more information in the vgic_lr structure
(HW interrupt number and additional state information), we switch
to a layout similar to the HW's:

- use bitfields to save space (we don't need more than 10 bits
  to represent the irq numbers)
- source CPU and HW interrupt can share the same field, as
  a SGI doesn't have a physical line.

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
---
 include/kvm/arm_vgic.h | 10 +++++++---
 1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/include/kvm/arm_vgic.h b/include/kvm/arm_vgic.h
index 133ea00..a881e39 100644
--- a/include/kvm/arm_vgic.h
+++ b/include/kvm/arm_vgic.h
@@ -95,11 +95,15 @@ enum vgic_type {
 #define LR_STATE_ACTIVE		(1 << 1)
 #define LR_STATE_MASK		(3 << 0)
 #define LR_EOI_INT		(1 << 2)
+#define LR_HW			(1 << 3)
 
 struct vgic_lr {
-	u16	irq;
-	u8	source;
-	u8	state;
+	unsigned irq:10;
+	union {
+		unsigned hwirq:10;
+		unsigned source:3;
+	};
+	unsigned state:4;
 };
 
 struct vgic_vmcr {
-- 
2.1.4

_______________________________________________
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

^ permalink raw reply related	[flat|nested] 74+ messages in thread

* [PATCH v3 03/11] KVM: arm/arm64: vgic: Convert struct vgic_lr to use bitfields
@ 2015-07-24 15:55   ` Marc Zyngier
  0 siblings, 0 replies; 74+ messages in thread
From: Marc Zyngier @ 2015-07-24 15:55 UTC (permalink / raw)
  To: linux-arm-kernel

As we're about to cram more information in the vgic_lr structure
(HW interrupt number and additional state information), we switch
to a layout similar to the HW's:

- use bitfields to save space (we don't need more than 10 bits
  to represent the irq numbers)
- source CPU and HW interrupt can share the same field, as
  a SGI doesn't have a physical line.

Reviewed-by: Alex Benn?e <alex.bennee@linaro.org>
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
---
 include/kvm/arm_vgic.h | 10 +++++++---
 1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/include/kvm/arm_vgic.h b/include/kvm/arm_vgic.h
index 133ea00..a881e39 100644
--- a/include/kvm/arm_vgic.h
+++ b/include/kvm/arm_vgic.h
@@ -95,11 +95,15 @@ enum vgic_type {
 #define LR_STATE_ACTIVE		(1 << 1)
 #define LR_STATE_MASK		(3 << 0)
 #define LR_EOI_INT		(1 << 2)
+#define LR_HW			(1 << 3)
 
 struct vgic_lr {
-	u16	irq;
-	u8	source;
-	u8	state;
+	unsigned irq:10;
+	union {
+		unsigned hwirq:10;
+		unsigned source:3;
+	};
+	unsigned state:4;
 };
 
 struct vgic_vmcr {
-- 
2.1.4

^ permalink raw reply related	[flat|nested] 74+ messages in thread

* [PATCH v3 04/11] KVM: arm/arm64: vgic: Allow HW irq to be encoded in LR
  2015-07-24 15:54 ` Marc Zyngier
@ 2015-07-24 15:55   ` Marc Zyngier
  -1 siblings, 0 replies; 74+ messages in thread
From: Marc Zyngier @ 2015-07-24 15:55 UTC (permalink / raw)
  To: kvm, kvmarm, linux-arm-kernel; +Cc: Andre Przywara

Now that struct vgic_lr supports the LR_HW bit and carries a hwirq
field, we can encode that information into the list registers.

This patch provides implementations for both GICv2 and GICv3.

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
---
 include/linux/irqchip/arm-gic-v3.h |  3 +++
 include/linux/irqchip/arm-gic.h    |  3 ++-
 virt/kvm/arm/vgic-v2.c             | 16 +++++++++++++++-
 virt/kvm/arm/vgic-v3.c             | 21 ++++++++++++++++++---
 4 files changed, 38 insertions(+), 5 deletions(-)

diff --git a/include/linux/irqchip/arm-gic-v3.h b/include/linux/irqchip/arm-gic-v3.h
index ffbc034..cf637d6 100644
--- a/include/linux/irqchip/arm-gic-v3.h
+++ b/include/linux/irqchip/arm-gic-v3.h
@@ -268,9 +268,12 @@
 
 #define ICH_LR_EOI			(1UL << 41)
 #define ICH_LR_GROUP			(1UL << 60)
+#define ICH_LR_HW			(1UL << 61)
 #define ICH_LR_STATE			(3UL << 62)
 #define ICH_LR_PENDING_BIT		(1UL << 62)
 #define ICH_LR_ACTIVE_BIT		(1UL << 63)
+#define ICH_LR_PHYS_ID_SHIFT		32
+#define ICH_LR_PHYS_ID_MASK		(0x3ffUL << ICH_LR_PHYS_ID_SHIFT)
 
 #define ICH_MISR_EOI			(1 << 0)
 #define ICH_MISR_U			(1 << 1)
diff --git a/include/linux/irqchip/arm-gic.h b/include/linux/irqchip/arm-gic.h
index 9de976b..ca88dad 100644
--- a/include/linux/irqchip/arm-gic.h
+++ b/include/linux/irqchip/arm-gic.h
@@ -71,11 +71,12 @@
 
 #define GICH_LR_VIRTUALID		(0x3ff << 0)
 #define GICH_LR_PHYSID_CPUID_SHIFT	(10)
-#define GICH_LR_PHYSID_CPUID		(7 << GICH_LR_PHYSID_CPUID_SHIFT)
+#define GICH_LR_PHYSID_CPUID		(0x3ff << GICH_LR_PHYSID_CPUID_SHIFT)
 #define GICH_LR_STATE			(3 << 28)
 #define GICH_LR_PENDING_BIT		(1 << 28)
 #define GICH_LR_ACTIVE_BIT		(1 << 29)
 #define GICH_LR_EOI			(1 << 19)
+#define GICH_LR_HW			(1 << 31)
 
 #define GICH_VMCR_CTRL_SHIFT		0
 #define GICH_VMCR_CTRL_MASK		(0x21f << GICH_VMCR_CTRL_SHIFT)
diff --git a/virt/kvm/arm/vgic-v2.c b/virt/kvm/arm/vgic-v2.c
index f9b9c7c..8d7b04d 100644
--- a/virt/kvm/arm/vgic-v2.c
+++ b/virt/kvm/arm/vgic-v2.c
@@ -48,6 +48,10 @@ static struct vgic_lr vgic_v2_get_lr(const struct kvm_vcpu *vcpu, int lr)
 		lr_desc.state |= LR_STATE_ACTIVE;
 	if (val & GICH_LR_EOI)
 		lr_desc.state |= LR_EOI_INT;
+	if (val & GICH_LR_HW) {
+		lr_desc.state |= LR_HW;
+		lr_desc.hwirq = (val & GICH_LR_PHYSID_CPUID) >> GICH_LR_PHYSID_CPUID_SHIFT;
+	}
 
 	return lr_desc;
 }
@@ -55,7 +59,9 @@ static struct vgic_lr vgic_v2_get_lr(const struct kvm_vcpu *vcpu, int lr)
 static void vgic_v2_set_lr(struct kvm_vcpu *vcpu, int lr,
 			   struct vgic_lr lr_desc)
 {
-	u32 lr_val = (lr_desc.source << GICH_LR_PHYSID_CPUID_SHIFT) | lr_desc.irq;
+	u32 lr_val;
+
+	lr_val = lr_desc.irq;
 
 	if (lr_desc.state & LR_STATE_PENDING)
 		lr_val |= GICH_LR_PENDING_BIT;
@@ -64,6 +70,14 @@ static void vgic_v2_set_lr(struct kvm_vcpu *vcpu, int lr,
 	if (lr_desc.state & LR_EOI_INT)
 		lr_val |= GICH_LR_EOI;
 
+	if (lr_desc.state & LR_HW) {
+		lr_val |= GICH_LR_HW;
+		lr_val |= (u32)lr_desc.hwirq << GICH_LR_PHYSID_CPUID_SHIFT;
+	}
+
+	if (lr_desc.irq < VGIC_NR_SGIS)
+		lr_val |= (lr_desc.source << GICH_LR_PHYSID_CPUID_SHIFT);
+
 	vcpu->arch.vgic_cpu.vgic_v2.vgic_lr[lr] = lr_val;
 }
 
diff --git a/virt/kvm/arm/vgic-v3.c b/virt/kvm/arm/vgic-v3.c
index dff0602..afbf925 100644
--- a/virt/kvm/arm/vgic-v3.c
+++ b/virt/kvm/arm/vgic-v3.c
@@ -67,6 +67,10 @@ static struct vgic_lr vgic_v3_get_lr(const struct kvm_vcpu *vcpu, int lr)
 		lr_desc.state |= LR_STATE_ACTIVE;
 	if (val & ICH_LR_EOI)
 		lr_desc.state |= LR_EOI_INT;
+	if (val & ICH_LR_HW) {
+		lr_desc.state |= LR_HW;
+		lr_desc.hwirq = (val >> ICH_LR_PHYS_ID_SHIFT) & GENMASK(9, 0);
+	}
 
 	return lr_desc;
 }
@@ -84,10 +88,17 @@ static void vgic_v3_set_lr(struct kvm_vcpu *vcpu, int lr,
 	 * Eventually we want to make this configurable, so we may revisit
 	 * this in the future.
 	 */
-	if (vcpu->kvm->arch.vgic.vgic_model == KVM_DEV_TYPE_ARM_VGIC_V3)
+	switch (vcpu->kvm->arch.vgic.vgic_model) {
+	case KVM_DEV_TYPE_ARM_VGIC_V3:
 		lr_val |= ICH_LR_GROUP;
-	else
-		lr_val |= (u32)lr_desc.source << GICH_LR_PHYSID_CPUID_SHIFT;
+		break;
+	case  KVM_DEV_TYPE_ARM_VGIC_V2:
+		if (lr_desc.irq < VGIC_NR_SGIS)
+			lr_val |= (u32)lr_desc.source << GICH_LR_PHYSID_CPUID_SHIFT;
+		break;
+	default:
+		BUG();
+	}
 
 	if (lr_desc.state & LR_STATE_PENDING)
 		lr_val |= ICH_LR_PENDING_BIT;
@@ -95,6 +106,10 @@ static void vgic_v3_set_lr(struct kvm_vcpu *vcpu, int lr,
 		lr_val |= ICH_LR_ACTIVE_BIT;
 	if (lr_desc.state & LR_EOI_INT)
 		lr_val |= ICH_LR_EOI;
+	if (lr_desc.state & LR_HW) {
+		lr_val |= ICH_LR_HW;
+		lr_val |= ((u64)lr_desc.hwirq) << ICH_LR_PHYS_ID_SHIFT;
+	}
 
 	vcpu->arch.vgic_cpu.vgic_v3.vgic_lr[LR_INDEX(lr)] = lr_val;
 }
-- 
2.1.4

^ permalink raw reply related	[flat|nested] 74+ messages in thread

* [PATCH v3 04/11] KVM: arm/arm64: vgic: Allow HW irq to be encoded in LR
@ 2015-07-24 15:55   ` Marc Zyngier
  0 siblings, 0 replies; 74+ messages in thread
From: Marc Zyngier @ 2015-07-24 15:55 UTC (permalink / raw)
  To: linux-arm-kernel

Now that struct vgic_lr supports the LR_HW bit and carries a hwirq
field, we can encode that information into the list registers.

This patch provides implementations for both GICv2 and GICv3.

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
---
 include/linux/irqchip/arm-gic-v3.h |  3 +++
 include/linux/irqchip/arm-gic.h    |  3 ++-
 virt/kvm/arm/vgic-v2.c             | 16 +++++++++++++++-
 virt/kvm/arm/vgic-v3.c             | 21 ++++++++++++++++++---
 4 files changed, 38 insertions(+), 5 deletions(-)

diff --git a/include/linux/irqchip/arm-gic-v3.h b/include/linux/irqchip/arm-gic-v3.h
index ffbc034..cf637d6 100644
--- a/include/linux/irqchip/arm-gic-v3.h
+++ b/include/linux/irqchip/arm-gic-v3.h
@@ -268,9 +268,12 @@
 
 #define ICH_LR_EOI			(1UL << 41)
 #define ICH_LR_GROUP			(1UL << 60)
+#define ICH_LR_HW			(1UL << 61)
 #define ICH_LR_STATE			(3UL << 62)
 #define ICH_LR_PENDING_BIT		(1UL << 62)
 #define ICH_LR_ACTIVE_BIT		(1UL << 63)
+#define ICH_LR_PHYS_ID_SHIFT		32
+#define ICH_LR_PHYS_ID_MASK		(0x3ffUL << ICH_LR_PHYS_ID_SHIFT)
 
 #define ICH_MISR_EOI			(1 << 0)
 #define ICH_MISR_U			(1 << 1)
diff --git a/include/linux/irqchip/arm-gic.h b/include/linux/irqchip/arm-gic.h
index 9de976b..ca88dad 100644
--- a/include/linux/irqchip/arm-gic.h
+++ b/include/linux/irqchip/arm-gic.h
@@ -71,11 +71,12 @@
 
 #define GICH_LR_VIRTUALID		(0x3ff << 0)
 #define GICH_LR_PHYSID_CPUID_SHIFT	(10)
-#define GICH_LR_PHYSID_CPUID		(7 << GICH_LR_PHYSID_CPUID_SHIFT)
+#define GICH_LR_PHYSID_CPUID		(0x3ff << GICH_LR_PHYSID_CPUID_SHIFT)
 #define GICH_LR_STATE			(3 << 28)
 #define GICH_LR_PENDING_BIT		(1 << 28)
 #define GICH_LR_ACTIVE_BIT		(1 << 29)
 #define GICH_LR_EOI			(1 << 19)
+#define GICH_LR_HW			(1 << 31)
 
 #define GICH_VMCR_CTRL_SHIFT		0
 #define GICH_VMCR_CTRL_MASK		(0x21f << GICH_VMCR_CTRL_SHIFT)
diff --git a/virt/kvm/arm/vgic-v2.c b/virt/kvm/arm/vgic-v2.c
index f9b9c7c..8d7b04d 100644
--- a/virt/kvm/arm/vgic-v2.c
+++ b/virt/kvm/arm/vgic-v2.c
@@ -48,6 +48,10 @@ static struct vgic_lr vgic_v2_get_lr(const struct kvm_vcpu *vcpu, int lr)
 		lr_desc.state |= LR_STATE_ACTIVE;
 	if (val & GICH_LR_EOI)
 		lr_desc.state |= LR_EOI_INT;
+	if (val & GICH_LR_HW) {
+		lr_desc.state |= LR_HW;
+		lr_desc.hwirq = (val & GICH_LR_PHYSID_CPUID) >> GICH_LR_PHYSID_CPUID_SHIFT;
+	}
 
 	return lr_desc;
 }
@@ -55,7 +59,9 @@ static struct vgic_lr vgic_v2_get_lr(const struct kvm_vcpu *vcpu, int lr)
 static void vgic_v2_set_lr(struct kvm_vcpu *vcpu, int lr,
 			   struct vgic_lr lr_desc)
 {
-	u32 lr_val = (lr_desc.source << GICH_LR_PHYSID_CPUID_SHIFT) | lr_desc.irq;
+	u32 lr_val;
+
+	lr_val = lr_desc.irq;
 
 	if (lr_desc.state & LR_STATE_PENDING)
 		lr_val |= GICH_LR_PENDING_BIT;
@@ -64,6 +70,14 @@ static void vgic_v2_set_lr(struct kvm_vcpu *vcpu, int lr,
 	if (lr_desc.state & LR_EOI_INT)
 		lr_val |= GICH_LR_EOI;
 
+	if (lr_desc.state & LR_HW) {
+		lr_val |= GICH_LR_HW;
+		lr_val |= (u32)lr_desc.hwirq << GICH_LR_PHYSID_CPUID_SHIFT;
+	}
+
+	if (lr_desc.irq < VGIC_NR_SGIS)
+		lr_val |= (lr_desc.source << GICH_LR_PHYSID_CPUID_SHIFT);
+
 	vcpu->arch.vgic_cpu.vgic_v2.vgic_lr[lr] = lr_val;
 }
 
diff --git a/virt/kvm/arm/vgic-v3.c b/virt/kvm/arm/vgic-v3.c
index dff0602..afbf925 100644
--- a/virt/kvm/arm/vgic-v3.c
+++ b/virt/kvm/arm/vgic-v3.c
@@ -67,6 +67,10 @@ static struct vgic_lr vgic_v3_get_lr(const struct kvm_vcpu *vcpu, int lr)
 		lr_desc.state |= LR_STATE_ACTIVE;
 	if (val & ICH_LR_EOI)
 		lr_desc.state |= LR_EOI_INT;
+	if (val & ICH_LR_HW) {
+		lr_desc.state |= LR_HW;
+		lr_desc.hwirq = (val >> ICH_LR_PHYS_ID_SHIFT) & GENMASK(9, 0);
+	}
 
 	return lr_desc;
 }
@@ -84,10 +88,17 @@ static void vgic_v3_set_lr(struct kvm_vcpu *vcpu, int lr,
 	 * Eventually we want to make this configurable, so we may revisit
 	 * this in the future.
 	 */
-	if (vcpu->kvm->arch.vgic.vgic_model == KVM_DEV_TYPE_ARM_VGIC_V3)
+	switch (vcpu->kvm->arch.vgic.vgic_model) {
+	case KVM_DEV_TYPE_ARM_VGIC_V3:
 		lr_val |= ICH_LR_GROUP;
-	else
-		lr_val |= (u32)lr_desc.source << GICH_LR_PHYSID_CPUID_SHIFT;
+		break;
+	case  KVM_DEV_TYPE_ARM_VGIC_V2:
+		if (lr_desc.irq < VGIC_NR_SGIS)
+			lr_val |= (u32)lr_desc.source << GICH_LR_PHYSID_CPUID_SHIFT;
+		break;
+	default:
+		BUG();
+	}
 
 	if (lr_desc.state & LR_STATE_PENDING)
 		lr_val |= ICH_LR_PENDING_BIT;
@@ -95,6 +106,10 @@ static void vgic_v3_set_lr(struct kvm_vcpu *vcpu, int lr,
 		lr_val |= ICH_LR_ACTIVE_BIT;
 	if (lr_desc.state & LR_EOI_INT)
 		lr_val |= ICH_LR_EOI;
+	if (lr_desc.state & LR_HW) {
+		lr_val |= ICH_LR_HW;
+		lr_val |= ((u64)lr_desc.hwirq) << ICH_LR_PHYS_ID_SHIFT;
+	}
 
 	vcpu->arch.vgic_cpu.vgic_v3.vgic_lr[LR_INDEX(lr)] = lr_val;
 }
-- 
2.1.4

^ permalink raw reply related	[flat|nested] 74+ messages in thread

* [PATCH v3 05/11] KVM: arm/arm64: vgic: Relax vgic_can_sample_irq for edge IRQs
  2015-07-24 15:54 ` Marc Zyngier
@ 2015-07-24 15:55   ` Marc Zyngier
  -1 siblings, 0 replies; 74+ messages in thread
From: Marc Zyngier @ 2015-07-24 15:55 UTC (permalink / raw)
  To: kvm, kvmarm, linux-arm-kernel
  Cc: Christoffer Dall, Eric Auger, Alex Bennée, Andre Przywara

We only set the irq_queued flag for level interrupts, meaning
that "!vgic_irq_is_queued(vcpu, irq)" is a good enough predicate
for all interrupts.

This will allow us to inject edge HW interrupts, for which the
state ACTIVE+PENDING is not allowed.

Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
---
 virt/kvm/arm/vgic.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/virt/kvm/arm/vgic.c b/virt/kvm/arm/vgic.c
index bc40137..5bd1695 100644
--- a/virt/kvm/arm/vgic.c
+++ b/virt/kvm/arm/vgic.c
@@ -375,7 +375,7 @@ void vgic_cpu_irq_clear(struct kvm_vcpu *vcpu, int irq)
 
 static bool vgic_can_sample_irq(struct kvm_vcpu *vcpu, int irq)
 {
-	return vgic_irq_is_edge(vcpu, irq) || !vgic_irq_is_queued(vcpu, irq);
+	return !vgic_irq_is_queued(vcpu, irq);
 }
 
 /**
-- 
2.1.4


^ permalink raw reply related	[flat|nested] 74+ messages in thread

* [PATCH v3 05/11] KVM: arm/arm64: vgic: Relax vgic_can_sample_irq for edge IRQs
@ 2015-07-24 15:55   ` Marc Zyngier
  0 siblings, 0 replies; 74+ messages in thread
From: Marc Zyngier @ 2015-07-24 15:55 UTC (permalink / raw)
  To: linux-arm-kernel

We only set the irq_queued flag for level interrupts, meaning
that "!vgic_irq_is_queued(vcpu, irq)" is a good enough predicate
for all interrupts.

This will allow us to inject edge HW interrupts, for which the
state ACTIVE+PENDING is not allowed.

Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
---
 virt/kvm/arm/vgic.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/virt/kvm/arm/vgic.c b/virt/kvm/arm/vgic.c
index bc40137..5bd1695 100644
--- a/virt/kvm/arm/vgic.c
+++ b/virt/kvm/arm/vgic.c
@@ -375,7 +375,7 @@ void vgic_cpu_irq_clear(struct kvm_vcpu *vcpu, int irq)
 
 static bool vgic_can_sample_irq(struct kvm_vcpu *vcpu, int irq)
 {
-	return vgic_irq_is_edge(vcpu, irq) || !vgic_irq_is_queued(vcpu, irq);
+	return !vgic_irq_is_queued(vcpu, irq);
 }
 
 /**
-- 
2.1.4

^ permalink raw reply related	[flat|nested] 74+ messages in thread

* [PATCH v3 06/11] KVM: arm/arm64: vgic: Allow dynamic mapping of physical/virtual interrupts
  2015-07-24 15:54 ` Marc Zyngier
@ 2015-07-24 15:55   ` Marc Zyngier
  -1 siblings, 0 replies; 74+ messages in thread
From: Marc Zyngier @ 2015-07-24 15:55 UTC (permalink / raw)
  To: kvm, kvmarm, linux-arm-kernel; +Cc: Andre Przywara

In order to be able to feed physical interrupts to a guest, we need
to be able to establish the virtual-physical mapping between the two
worlds.

The mappings are kept in a set of RCU lists, indexed by virtual interrupts.

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
---
 arch/arm/kvm/arm.c     |   2 +
 include/kvm/arm_vgic.h |  26 +++++++
 virt/kvm/arm/vgic.c    | 189 ++++++++++++++++++++++++++++++++++++++++++++++++-
 3 files changed, 216 insertions(+), 1 deletion(-)

diff --git a/arch/arm/kvm/arm.c b/arch/arm/kvm/arm.c
index f1bf418..ce404a5 100644
--- a/arch/arm/kvm/arm.c
+++ b/arch/arm/kvm/arm.c
@@ -125,6 +125,7 @@ int kvm_arch_init_vm(struct kvm *kvm, unsigned long type)
 	if (ret)
 		goto out_free_stage2_pgd;
 
+	kvm_vgic_early_init(kvm);
 	kvm_timer_init(kvm);
 
 	/* Mark the initial VMID generation invalid */
@@ -249,6 +250,7 @@ out:
 
 void kvm_arch_vcpu_postcreate(struct kvm_vcpu *vcpu)
 {
+	kvm_vgic_vcpu_early_init(vcpu);
 }
 
 void kvm_arch_vcpu_free(struct kvm_vcpu *vcpu)
diff --git a/include/kvm/arm_vgic.h b/include/kvm/arm_vgic.h
index a881e39..68212a1 100644
--- a/include/kvm/arm_vgic.h
+++ b/include/kvm/arm_vgic.h
@@ -159,6 +159,20 @@ struct vgic_io_device {
 	struct kvm_io_device dev;
 };
 
+struct irq_phys_map {
+	u32			virt_irq;
+	u32			phys_irq;
+	u32			irq;
+	bool			deleted;
+	bool			active;
+};
+
+struct irq_phys_map_entry {
+	struct list_head	entry;
+	struct rcu_head		rcu;
+	struct irq_phys_map	map;
+};
+
 struct vgic_dist {
 	spinlock_t		lock;
 	bool			in_kernel;
@@ -256,6 +270,10 @@ struct vgic_dist {
 	struct vgic_vm_ops	vm_ops;
 	struct vgic_io_device	dist_iodev;
 	struct vgic_io_device	*redist_iodevs;
+
+	/* Virtual irq to hwirq mapping */
+	spinlock_t		irq_phys_map_lock;
+	struct list_head	irq_phys_map_list;
 };
 
 struct vgic_v2_cpu_if {
@@ -307,6 +325,9 @@ struct vgic_cpu {
 		struct vgic_v2_cpu_if	vgic_v2;
 		struct vgic_v3_cpu_if	vgic_v3;
 	};
+
+	/* Protected by the distributor's irq_phys_map_lock */
+	struct list_head	irq_phys_map_list;
 };
 
 #define LR_EMPTY	0xff
@@ -321,8 +342,10 @@ int kvm_vgic_addr(struct kvm *kvm, unsigned long type, u64 *addr, bool write);
 int kvm_vgic_hyp_init(void);
 int kvm_vgic_map_resources(struct kvm *kvm);
 int kvm_vgic_get_max_vcpus(void);
+void kvm_vgic_early_init(struct kvm *kvm);
 int kvm_vgic_create(struct kvm *kvm, u32 type);
 void kvm_vgic_destroy(struct kvm *kvm);
+void kvm_vgic_vcpu_early_init(struct kvm_vcpu *vcpu);
 void kvm_vgic_vcpu_destroy(struct kvm_vcpu *vcpu);
 void kvm_vgic_flush_hwstate(struct kvm_vcpu *vcpu);
 void kvm_vgic_sync_hwstate(struct kvm_vcpu *vcpu);
@@ -331,6 +354,9 @@ int kvm_vgic_inject_irq(struct kvm *kvm, int cpuid, unsigned int irq_num,
 void vgic_v3_dispatch_sgi(struct kvm_vcpu *vcpu, u64 reg);
 int kvm_vgic_vcpu_pending_irq(struct kvm_vcpu *vcpu);
 int kvm_vgic_vcpu_active_irq(struct kvm_vcpu *vcpu);
+struct irq_phys_map *kvm_vgic_map_phys_irq(struct kvm_vcpu *vcpu,
+					   int virt_irq, int irq);
+int kvm_vgic_unmap_phys_irq(struct kvm_vcpu *vcpu, struct irq_phys_map *map);
 
 #define irqchip_in_kernel(k)	(!!((k)->arch.vgic.in_kernel))
 #define vgic_initialized(k)	(!!((k)->arch.vgic.nr_cpus))
diff --git a/virt/kvm/arm/vgic.c b/virt/kvm/arm/vgic.c
index 5bd1695..c74d929 100644
--- a/virt/kvm/arm/vgic.c
+++ b/virt/kvm/arm/vgic.c
@@ -24,6 +24,7 @@
 #include <linux/of.h>
 #include <linux/of_address.h>
 #include <linux/of_irq.h>
+#include <linux/rculist.h>
 #include <linux/uaccess.h>
 
 #include <asm/kvm_emulate.h>
@@ -82,6 +83,8 @@ static void vgic_retire_disabled_irqs(struct kvm_vcpu *vcpu);
 static void vgic_retire_lr(int lr_nr, int irq, struct kvm_vcpu *vcpu);
 static struct vgic_lr vgic_get_lr(const struct kvm_vcpu *vcpu, int lr);
 static void vgic_set_lr(struct kvm_vcpu *vcpu, int lr, struct vgic_lr lr_desc);
+static struct irq_phys_map *vgic_irq_map_search(struct kvm_vcpu *vcpu,
+						int virt_irq);
 
 static const struct vgic_ops *vgic_ops;
 static const struct vgic_params *vgic;
@@ -1583,6 +1586,166 @@ static irqreturn_t vgic_maintenance_handler(int irq, void *data)
 	return IRQ_HANDLED;
 }
 
+static struct list_head *vgic_get_irq_phys_map_list(struct kvm_vcpu *vcpu,
+						    int virt_irq)
+{
+	if (virt_irq < VGIC_NR_PRIVATE_IRQS)
+		return &vcpu->arch.vgic_cpu.irq_phys_map_list;
+	else
+		return &vcpu->kvm->arch.vgic.irq_phys_map_list;
+}
+
+/**
+ * kvm_vgic_map_phys_irq - map a virtual IRQ to a physical IRQ
+ * @vcpu: The VCPU pointer
+ * @virt_irq: The virtual irq number
+ * @irq: The Linux IRQ number
+ *
+ * Establish a mapping between a guest visible irq (@virt_irq) and a
+ * Linux irq (@irq). On injection, @virt_irq will be associated with
+ * the physical interrupt represented by @irq.
+ *
+ * Returns a valid pointer on success, and an error pointer otherwise
+ */
+struct irq_phys_map *kvm_vgic_map_phys_irq(struct kvm_vcpu *vcpu,
+					   int virt_irq, int irq)
+{
+	struct vgic_dist *dist = &vcpu->kvm->arch.vgic;
+	struct list_head *root = vgic_get_irq_phys_map_list(vcpu, virt_irq);
+	struct irq_phys_map *map;
+	struct irq_phys_map_entry *entry;
+	struct irq_desc *desc;
+	struct irq_data *data;
+	int phys_irq;
+
+	desc = irq_to_desc(irq);
+	if (!desc) {
+		kvm_err("kvm_vgic_map_phys_irq: no interrupt descriptor\n");
+		return ERR_PTR(-EINVAL);
+	}
+
+	data = irq_desc_get_irq_data(desc);
+	while (data->parent_data)
+		data = data->parent_data;
+
+	phys_irq = data->hwirq;
+
+	/* Create a new mapping */
+	entry = kzalloc(sizeof(*entry), GFP_KERNEL);
+	if (!entry)
+		return ERR_PTR(-ENOMEM);
+
+	spin_lock(&dist->irq_phys_map_lock);
+
+	/* Try to match an existing mapping */
+	map = vgic_irq_map_search(vcpu, virt_irq);
+	if (map) {
+		/* Make sure this mapping matches */
+		if (map->phys_irq != phys_irq	||
+		    map->irq      != irq)
+			map = ERR_PTR(-EINVAL);
+
+		goto out;
+	}
+
+	map           = &entry->map;
+	map->virt_irq = virt_irq;
+	map->phys_irq = phys_irq;
+	map->irq      = irq;
+
+	list_add_tail_rcu(&entry->entry, root);
+
+out:
+	spin_unlock(&dist->irq_phys_map_lock);
+	/* If we've found a hit in the existing list, free the useless
+	 * entry */
+	if (IS_ERR(map) || map != &entry->map)
+		kfree(entry);
+	return map;
+}
+
+static struct irq_phys_map *vgic_irq_map_search(struct kvm_vcpu *vcpu,
+						int virt_irq)
+{
+	struct list_head *root = vgic_get_irq_phys_map_list(vcpu, virt_irq);
+	struct irq_phys_map_entry *entry;
+	struct irq_phys_map *map;
+
+	rcu_read_lock();
+
+	list_for_each_entry_rcu(entry, root, entry) {
+		map = &entry->map;
+		if (map->virt_irq == virt_irq && !map->deleted) {
+			rcu_read_unlock();
+			return map;
+		}
+	}
+
+	rcu_read_unlock();
+
+	return NULL;
+}
+
+static void vgic_free_phys_irq_map_rcu(struct rcu_head *rcu)
+{
+	struct irq_phys_map_entry *entry;
+
+	entry = container_of(rcu, struct irq_phys_map_entry, rcu);
+	kfree(entry);
+}
+
+/**
+ * kvm_vgic_unmap_phys_irq - Remove a virtual to physical IRQ mapping
+ * @vcpu: The VCPU pointer
+ * @map: The pointer to a mapping obtained through kvm_vgic_map_phys_irq
+ *
+ * Remove an existing mapping between virtual and physical interrupts.
+ */
+int kvm_vgic_unmap_phys_irq(struct kvm_vcpu *vcpu, struct irq_phys_map *map)
+{
+	struct vgic_dist *dist = &vcpu->kvm->arch.vgic;
+	struct irq_phys_map_entry *entry;
+	struct list_head *root;
+
+	if (!map)
+		return -EINVAL;
+
+	root = vgic_get_irq_phys_map_list(vcpu, map->virt_irq);
+
+	spin_lock(&dist->irq_phys_map_lock);
+
+	list_for_each_entry(entry, root, entry) {
+		if (&entry->map == map && !map->deleted) {
+			map->deleted = true;
+			list_del_rcu(&entry->entry);
+			call_rcu(&entry->rcu, vgic_free_phys_irq_map_rcu);
+			break;
+		}
+	}
+
+	spin_unlock(&dist->irq_phys_map_lock);
+
+	return 0;
+}
+
+static void vgic_destroy_irq_phys_map(struct kvm *kvm, struct list_head *root)
+{
+	struct vgic_dist *dist = &kvm->arch.vgic;
+	struct irq_phys_map_entry *entry;
+
+	spin_lock(&dist->irq_phys_map_lock);
+
+	list_for_each_entry(entry, root, entry) {
+		if (!entry->map.deleted) {
+			entry->map.deleted = true;
+			list_del_rcu(&entry->entry);
+			call_rcu(&entry->rcu, vgic_free_phys_irq_map_rcu);
+		}
+	}
+
+	spin_unlock(&dist->irq_phys_map_lock);
+}
+
 void kvm_vgic_vcpu_destroy(struct kvm_vcpu *vcpu)
 {
 	struct vgic_cpu *vgic_cpu = &vcpu->arch.vgic_cpu;
@@ -1591,6 +1754,7 @@ void kvm_vgic_vcpu_destroy(struct kvm_vcpu *vcpu)
 	kfree(vgic_cpu->active_shared);
 	kfree(vgic_cpu->pend_act_shared);
 	kfree(vgic_cpu->vgic_irq_lr_map);
+	vgic_destroy_irq_phys_map(vcpu->kvm, &vgic_cpu->irq_phys_map_list);
 	vgic_cpu->pending_shared = NULL;
 	vgic_cpu->active_shared = NULL;
 	vgic_cpu->pend_act_shared = NULL;
@@ -1628,6 +1792,17 @@ static int vgic_vcpu_init_maps(struct kvm_vcpu *vcpu, int nr_irqs)
 }
 
 /**
+ * kvm_vgic_vcpu_early_init - Earliest possible per-vcpu vgic init stage
+ *
+ * No memory allocation should be performed here, only static init.
+ */
+void kvm_vgic_vcpu_early_init(struct kvm_vcpu *vcpu)
+{
+	struct vgic_cpu *vgic_cpu = &vcpu->arch.vgic_cpu;
+	INIT_LIST_HEAD(&vgic_cpu->irq_phys_map_list);
+}
+
+/**
  * kvm_vgic_get_max_vcpus - Get the maximum number of VCPUs allowed by HW
  *
  * The host's GIC naturally limits the maximum amount of VCPUs a guest
@@ -1664,6 +1839,7 @@ void kvm_vgic_destroy(struct kvm *kvm)
 	kfree(dist->irq_spi_target);
 	kfree(dist->irq_pending_on_cpu);
 	kfree(dist->irq_active_on_cpu);
+	vgic_destroy_irq_phys_map(kvm, &dist->irq_phys_map_list);
 	dist->irq_sgi_sources = NULL;
 	dist->irq_spi_cpu = NULL;
 	dist->irq_spi_target = NULL;
@@ -1787,6 +1963,18 @@ static int init_vgic_model(struct kvm *kvm, int type)
 	return 0;
 }
 
+/**
+ * kvm_vgic_early_init - Earliest possible vgic initialization stage
+ *
+ * No memory allocation should be performed here, only static init.
+ */
+void kvm_vgic_early_init(struct kvm *kvm)
+{
+	spin_lock_init(&kvm->arch.vgic.lock);
+	spin_lock_init(&kvm->arch.vgic.irq_phys_map_lock);
+	INIT_LIST_HEAD(&kvm->arch.vgic.irq_phys_map_list);
+}
+
 int kvm_vgic_create(struct kvm *kvm, u32 type)
 {
 	int i, vcpu_lock_idx = -1, ret;
@@ -1832,7 +2020,6 @@ int kvm_vgic_create(struct kvm *kvm, u32 type)
 	if (ret)
 		goto out_unlock;
 
-	spin_lock_init(&kvm->arch.vgic.lock);
 	kvm->arch.vgic.in_kernel = true;
 	kvm->arch.vgic.vgic_model = type;
 	kvm->arch.vgic.vctrl_base = vgic->vctrl_base;
-- 
2.1.4

^ permalink raw reply related	[flat|nested] 74+ messages in thread

* [PATCH v3 06/11] KVM: arm/arm64: vgic: Allow dynamic mapping of physical/virtual interrupts
@ 2015-07-24 15:55   ` Marc Zyngier
  0 siblings, 0 replies; 74+ messages in thread
From: Marc Zyngier @ 2015-07-24 15:55 UTC (permalink / raw)
  To: linux-arm-kernel

In order to be able to feed physical interrupts to a guest, we need
to be able to establish the virtual-physical mapping between the two
worlds.

The mappings are kept in a set of RCU lists, indexed by virtual interrupts.

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
---
 arch/arm/kvm/arm.c     |   2 +
 include/kvm/arm_vgic.h |  26 +++++++
 virt/kvm/arm/vgic.c    | 189 ++++++++++++++++++++++++++++++++++++++++++++++++-
 3 files changed, 216 insertions(+), 1 deletion(-)

diff --git a/arch/arm/kvm/arm.c b/arch/arm/kvm/arm.c
index f1bf418..ce404a5 100644
--- a/arch/arm/kvm/arm.c
+++ b/arch/arm/kvm/arm.c
@@ -125,6 +125,7 @@ int kvm_arch_init_vm(struct kvm *kvm, unsigned long type)
 	if (ret)
 		goto out_free_stage2_pgd;
 
+	kvm_vgic_early_init(kvm);
 	kvm_timer_init(kvm);
 
 	/* Mark the initial VMID generation invalid */
@@ -249,6 +250,7 @@ out:
 
 void kvm_arch_vcpu_postcreate(struct kvm_vcpu *vcpu)
 {
+	kvm_vgic_vcpu_early_init(vcpu);
 }
 
 void kvm_arch_vcpu_free(struct kvm_vcpu *vcpu)
diff --git a/include/kvm/arm_vgic.h b/include/kvm/arm_vgic.h
index a881e39..68212a1 100644
--- a/include/kvm/arm_vgic.h
+++ b/include/kvm/arm_vgic.h
@@ -159,6 +159,20 @@ struct vgic_io_device {
 	struct kvm_io_device dev;
 };
 
+struct irq_phys_map {
+	u32			virt_irq;
+	u32			phys_irq;
+	u32			irq;
+	bool			deleted;
+	bool			active;
+};
+
+struct irq_phys_map_entry {
+	struct list_head	entry;
+	struct rcu_head		rcu;
+	struct irq_phys_map	map;
+};
+
 struct vgic_dist {
 	spinlock_t		lock;
 	bool			in_kernel;
@@ -256,6 +270,10 @@ struct vgic_dist {
 	struct vgic_vm_ops	vm_ops;
 	struct vgic_io_device	dist_iodev;
 	struct vgic_io_device	*redist_iodevs;
+
+	/* Virtual irq to hwirq mapping */
+	spinlock_t		irq_phys_map_lock;
+	struct list_head	irq_phys_map_list;
 };
 
 struct vgic_v2_cpu_if {
@@ -307,6 +325,9 @@ struct vgic_cpu {
 		struct vgic_v2_cpu_if	vgic_v2;
 		struct vgic_v3_cpu_if	vgic_v3;
 	};
+
+	/* Protected by the distributor's irq_phys_map_lock */
+	struct list_head	irq_phys_map_list;
 };
 
 #define LR_EMPTY	0xff
@@ -321,8 +342,10 @@ int kvm_vgic_addr(struct kvm *kvm, unsigned long type, u64 *addr, bool write);
 int kvm_vgic_hyp_init(void);
 int kvm_vgic_map_resources(struct kvm *kvm);
 int kvm_vgic_get_max_vcpus(void);
+void kvm_vgic_early_init(struct kvm *kvm);
 int kvm_vgic_create(struct kvm *kvm, u32 type);
 void kvm_vgic_destroy(struct kvm *kvm);
+void kvm_vgic_vcpu_early_init(struct kvm_vcpu *vcpu);
 void kvm_vgic_vcpu_destroy(struct kvm_vcpu *vcpu);
 void kvm_vgic_flush_hwstate(struct kvm_vcpu *vcpu);
 void kvm_vgic_sync_hwstate(struct kvm_vcpu *vcpu);
@@ -331,6 +354,9 @@ int kvm_vgic_inject_irq(struct kvm *kvm, int cpuid, unsigned int irq_num,
 void vgic_v3_dispatch_sgi(struct kvm_vcpu *vcpu, u64 reg);
 int kvm_vgic_vcpu_pending_irq(struct kvm_vcpu *vcpu);
 int kvm_vgic_vcpu_active_irq(struct kvm_vcpu *vcpu);
+struct irq_phys_map *kvm_vgic_map_phys_irq(struct kvm_vcpu *vcpu,
+					   int virt_irq, int irq);
+int kvm_vgic_unmap_phys_irq(struct kvm_vcpu *vcpu, struct irq_phys_map *map);
 
 #define irqchip_in_kernel(k)	(!!((k)->arch.vgic.in_kernel))
 #define vgic_initialized(k)	(!!((k)->arch.vgic.nr_cpus))
diff --git a/virt/kvm/arm/vgic.c b/virt/kvm/arm/vgic.c
index 5bd1695..c74d929 100644
--- a/virt/kvm/arm/vgic.c
+++ b/virt/kvm/arm/vgic.c
@@ -24,6 +24,7 @@
 #include <linux/of.h>
 #include <linux/of_address.h>
 #include <linux/of_irq.h>
+#include <linux/rculist.h>
 #include <linux/uaccess.h>
 
 #include <asm/kvm_emulate.h>
@@ -82,6 +83,8 @@ static void vgic_retire_disabled_irqs(struct kvm_vcpu *vcpu);
 static void vgic_retire_lr(int lr_nr, int irq, struct kvm_vcpu *vcpu);
 static struct vgic_lr vgic_get_lr(const struct kvm_vcpu *vcpu, int lr);
 static void vgic_set_lr(struct kvm_vcpu *vcpu, int lr, struct vgic_lr lr_desc);
+static struct irq_phys_map *vgic_irq_map_search(struct kvm_vcpu *vcpu,
+						int virt_irq);
 
 static const struct vgic_ops *vgic_ops;
 static const struct vgic_params *vgic;
@@ -1583,6 +1586,166 @@ static irqreturn_t vgic_maintenance_handler(int irq, void *data)
 	return IRQ_HANDLED;
 }
 
+static struct list_head *vgic_get_irq_phys_map_list(struct kvm_vcpu *vcpu,
+						    int virt_irq)
+{
+	if (virt_irq < VGIC_NR_PRIVATE_IRQS)
+		return &vcpu->arch.vgic_cpu.irq_phys_map_list;
+	else
+		return &vcpu->kvm->arch.vgic.irq_phys_map_list;
+}
+
+/**
+ * kvm_vgic_map_phys_irq - map a virtual IRQ to a physical IRQ
+ * @vcpu: The VCPU pointer
+ * @virt_irq: The virtual irq number
+ * @irq: The Linux IRQ number
+ *
+ * Establish a mapping between a guest visible irq (@virt_irq) and a
+ * Linux irq (@irq). On injection, @virt_irq will be associated with
+ * the physical interrupt represented by @irq.
+ *
+ * Returns a valid pointer on success, and an error pointer otherwise
+ */
+struct irq_phys_map *kvm_vgic_map_phys_irq(struct kvm_vcpu *vcpu,
+					   int virt_irq, int irq)
+{
+	struct vgic_dist *dist = &vcpu->kvm->arch.vgic;
+	struct list_head *root = vgic_get_irq_phys_map_list(vcpu, virt_irq);
+	struct irq_phys_map *map;
+	struct irq_phys_map_entry *entry;
+	struct irq_desc *desc;
+	struct irq_data *data;
+	int phys_irq;
+
+	desc = irq_to_desc(irq);
+	if (!desc) {
+		kvm_err("kvm_vgic_map_phys_irq: no interrupt descriptor\n");
+		return ERR_PTR(-EINVAL);
+	}
+
+	data = irq_desc_get_irq_data(desc);
+	while (data->parent_data)
+		data = data->parent_data;
+
+	phys_irq = data->hwirq;
+
+	/* Create a new mapping */
+	entry = kzalloc(sizeof(*entry), GFP_KERNEL);
+	if (!entry)
+		return ERR_PTR(-ENOMEM);
+
+	spin_lock(&dist->irq_phys_map_lock);
+
+	/* Try to match an existing mapping */
+	map = vgic_irq_map_search(vcpu, virt_irq);
+	if (map) {
+		/* Make sure this mapping matches */
+		if (map->phys_irq != phys_irq	||
+		    map->irq      != irq)
+			map = ERR_PTR(-EINVAL);
+
+		goto out;
+	}
+
+	map           = &entry->map;
+	map->virt_irq = virt_irq;
+	map->phys_irq = phys_irq;
+	map->irq      = irq;
+
+	list_add_tail_rcu(&entry->entry, root);
+
+out:
+	spin_unlock(&dist->irq_phys_map_lock);
+	/* If we've found a hit in the existing list, free the useless
+	 * entry */
+	if (IS_ERR(map) || map != &entry->map)
+		kfree(entry);
+	return map;
+}
+
+static struct irq_phys_map *vgic_irq_map_search(struct kvm_vcpu *vcpu,
+						int virt_irq)
+{
+	struct list_head *root = vgic_get_irq_phys_map_list(vcpu, virt_irq);
+	struct irq_phys_map_entry *entry;
+	struct irq_phys_map *map;
+
+	rcu_read_lock();
+
+	list_for_each_entry_rcu(entry, root, entry) {
+		map = &entry->map;
+		if (map->virt_irq == virt_irq && !map->deleted) {
+			rcu_read_unlock();
+			return map;
+		}
+	}
+
+	rcu_read_unlock();
+
+	return NULL;
+}
+
+static void vgic_free_phys_irq_map_rcu(struct rcu_head *rcu)
+{
+	struct irq_phys_map_entry *entry;
+
+	entry = container_of(rcu, struct irq_phys_map_entry, rcu);
+	kfree(entry);
+}
+
+/**
+ * kvm_vgic_unmap_phys_irq - Remove a virtual to physical IRQ mapping
+ * @vcpu: The VCPU pointer
+ * @map: The pointer to a mapping obtained through kvm_vgic_map_phys_irq
+ *
+ * Remove an existing mapping between virtual and physical interrupts.
+ */
+int kvm_vgic_unmap_phys_irq(struct kvm_vcpu *vcpu, struct irq_phys_map *map)
+{
+	struct vgic_dist *dist = &vcpu->kvm->arch.vgic;
+	struct irq_phys_map_entry *entry;
+	struct list_head *root;
+
+	if (!map)
+		return -EINVAL;
+
+	root = vgic_get_irq_phys_map_list(vcpu, map->virt_irq);
+
+	spin_lock(&dist->irq_phys_map_lock);
+
+	list_for_each_entry(entry, root, entry) {
+		if (&entry->map == map && !map->deleted) {
+			map->deleted = true;
+			list_del_rcu(&entry->entry);
+			call_rcu(&entry->rcu, vgic_free_phys_irq_map_rcu);
+			break;
+		}
+	}
+
+	spin_unlock(&dist->irq_phys_map_lock);
+
+	return 0;
+}
+
+static void vgic_destroy_irq_phys_map(struct kvm *kvm, struct list_head *root)
+{
+	struct vgic_dist *dist = &kvm->arch.vgic;
+	struct irq_phys_map_entry *entry;
+
+	spin_lock(&dist->irq_phys_map_lock);
+
+	list_for_each_entry(entry, root, entry) {
+		if (!entry->map.deleted) {
+			entry->map.deleted = true;
+			list_del_rcu(&entry->entry);
+			call_rcu(&entry->rcu, vgic_free_phys_irq_map_rcu);
+		}
+	}
+
+	spin_unlock(&dist->irq_phys_map_lock);
+}
+
 void kvm_vgic_vcpu_destroy(struct kvm_vcpu *vcpu)
 {
 	struct vgic_cpu *vgic_cpu = &vcpu->arch.vgic_cpu;
@@ -1591,6 +1754,7 @@ void kvm_vgic_vcpu_destroy(struct kvm_vcpu *vcpu)
 	kfree(vgic_cpu->active_shared);
 	kfree(vgic_cpu->pend_act_shared);
 	kfree(vgic_cpu->vgic_irq_lr_map);
+	vgic_destroy_irq_phys_map(vcpu->kvm, &vgic_cpu->irq_phys_map_list);
 	vgic_cpu->pending_shared = NULL;
 	vgic_cpu->active_shared = NULL;
 	vgic_cpu->pend_act_shared = NULL;
@@ -1628,6 +1792,17 @@ static int vgic_vcpu_init_maps(struct kvm_vcpu *vcpu, int nr_irqs)
 }
 
 /**
+ * kvm_vgic_vcpu_early_init - Earliest possible per-vcpu vgic init stage
+ *
+ * No memory allocation should be performed here, only static init.
+ */
+void kvm_vgic_vcpu_early_init(struct kvm_vcpu *vcpu)
+{
+	struct vgic_cpu *vgic_cpu = &vcpu->arch.vgic_cpu;
+	INIT_LIST_HEAD(&vgic_cpu->irq_phys_map_list);
+}
+
+/**
  * kvm_vgic_get_max_vcpus - Get the maximum number of VCPUs allowed by HW
  *
  * The host's GIC naturally limits the maximum amount of VCPUs a guest
@@ -1664,6 +1839,7 @@ void kvm_vgic_destroy(struct kvm *kvm)
 	kfree(dist->irq_spi_target);
 	kfree(dist->irq_pending_on_cpu);
 	kfree(dist->irq_active_on_cpu);
+	vgic_destroy_irq_phys_map(kvm, &dist->irq_phys_map_list);
 	dist->irq_sgi_sources = NULL;
 	dist->irq_spi_cpu = NULL;
 	dist->irq_spi_target = NULL;
@@ -1787,6 +1963,18 @@ static int init_vgic_model(struct kvm *kvm, int type)
 	return 0;
 }
 
+/**
+ * kvm_vgic_early_init - Earliest possible vgic initialization stage
+ *
+ * No memory allocation should be performed here, only static init.
+ */
+void kvm_vgic_early_init(struct kvm *kvm)
+{
+	spin_lock_init(&kvm->arch.vgic.lock);
+	spin_lock_init(&kvm->arch.vgic.irq_phys_map_lock);
+	INIT_LIST_HEAD(&kvm->arch.vgic.irq_phys_map_list);
+}
+
 int kvm_vgic_create(struct kvm *kvm, u32 type)
 {
 	int i, vcpu_lock_idx = -1, ret;
@@ -1832,7 +2020,6 @@ int kvm_vgic_create(struct kvm *kvm, u32 type)
 	if (ret)
 		goto out_unlock;
 
-	spin_lock_init(&kvm->arch.vgic.lock);
 	kvm->arch.vgic.in_kernel = true;
 	kvm->arch.vgic.vgic_model = type;
 	kvm->arch.vgic.vctrl_base = vgic->vctrl_base;
-- 
2.1.4

^ permalink raw reply related	[flat|nested] 74+ messages in thread

* [PATCH v3 07/11] KVM: arm/arm64: vgic: Allow HW interrupts to be queued to a guest
  2015-07-24 15:54 ` Marc Zyngier
@ 2015-07-24 15:55   ` Marc Zyngier
  -1 siblings, 0 replies; 74+ messages in thread
From: Marc Zyngier @ 2015-07-24 15:55 UTC (permalink / raw)
  To: kvm, kvmarm, linux-arm-kernel; +Cc: Andre Przywara

To allow a HW interrupt to be injected into a guest, we lookup the
guest virtual interrupt in the irq_phys_map list, and if we have
a match, encode both interrupts in the LR.

We also mark the interrupt as "active" at the host distributor level.

On guest EOI on the virtual interrupt, the host interrupt will be
deactivated.

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
---
 virt/kvm/arm/vgic.c | 89 +++++++++++++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 86 insertions(+), 3 deletions(-)

diff --git a/virt/kvm/arm/vgic.c b/virt/kvm/arm/vgic.c
index c74d929..6422b0c 100644
--- a/virt/kvm/arm/vgic.c
+++ b/virt/kvm/arm/vgic.c
@@ -1118,6 +1118,39 @@ static void vgic_queue_irq_to_lr(struct kvm_vcpu *vcpu, int irq,
 	if (!vgic_irq_is_edge(vcpu, irq))
 		vlr.state |= LR_EOI_INT;
 
+	if (vlr.irq >= VGIC_NR_SGIS) {
+		struct irq_phys_map *map;
+		map = vgic_irq_map_search(vcpu, irq);
+
+		/*
+		 * If we have a mapping, and the virtual interrupt is
+		 * being injected, then we must set the state to
+		 * active in the physical world. Otherwise the
+		 * physical interrupt will fire and the guest will
+		 * exit before processing the virtual interrupt.
+		 */
+		if (map) {
+			int ret;
+
+			BUG_ON(!map->active);
+			vlr.hwirq = map->phys_irq;
+			vlr.state |= LR_HW;
+			vlr.state &= ~LR_EOI_INT;
+
+			ret = irq_set_irqchip_state(map->irq,
+						    IRQCHIP_STATE_ACTIVE,
+						    true);
+			WARN_ON(ret);
+
+			/*
+			 * Make sure we're not going to sample this
+			 * again, as a HW-backed interrupt cannot be
+			 * in the PENDING_ACTIVE stage.
+			 */
+			vgic_irq_set_queued(vcpu, irq);
+		}
+	}
+
 	vgic_set_lr(vcpu, lr_nr, vlr);
 	vgic_sync_lr_elrsr(vcpu, lr_nr, vlr);
 }
@@ -1342,6 +1375,39 @@ static bool vgic_process_maintenance(struct kvm_vcpu *vcpu)
 	return level_pending;
 }
 
+/*
+ * Save the physical active state, and reset it to inactive.
+ *
+ * Return 1 if HW interrupt went from active to inactive, and 0 otherwise.
+ */
+static int vgic_sync_hwirq(struct kvm_vcpu *vcpu, struct vgic_lr vlr)
+{
+	struct irq_phys_map *map;
+	int ret;
+
+	if (!(vlr.state & LR_HW))
+		return 0;
+
+	map = vgic_irq_map_search(vcpu, vlr.irq);
+	BUG_ON(!map || !map->active);
+
+	ret = irq_get_irqchip_state(map->irq,
+				    IRQCHIP_STATE_ACTIVE,
+				    &map->active);
+
+	WARN_ON(ret);
+
+	if (map->active) {
+		ret = irq_set_irqchip_state(map->irq,
+					    IRQCHIP_STATE_ACTIVE,
+					    false);
+		WARN_ON(ret);
+		return 0;
+	}
+
+	return 1;
+}
+
 /* Sync back the VGIC state after a guest run */
 static void __kvm_vgic_sync_hwstate(struct kvm_vcpu *vcpu)
 {
@@ -1356,14 +1422,31 @@ static void __kvm_vgic_sync_hwstate(struct kvm_vcpu *vcpu)
 	elrsr = vgic_get_elrsr(vcpu);
 	elrsr_ptr = u64_to_bitmask(&elrsr);
 
-	/* Clear mappings for empty LRs */
-	for_each_set_bit(lr, elrsr_ptr, vgic->nr_lr) {
+	/* Deal with HW interrupts, and clear mappings for empty LRs */
+	for (lr = 0; lr < vgic->nr_lr; lr++) {
 		struct vgic_lr vlr;
 
-		if (!test_and_clear_bit(lr, vgic_cpu->lr_used))
+		if (!test_bit(lr, vgic_cpu->lr_used))
 			continue;
 
 		vlr = vgic_get_lr(vcpu, lr);
+		if (vgic_sync_hwirq(vcpu, vlr)) {
+			/*
+			 * So this is a HW interrupt that the guest
+			 * EOI-ed. Clean the LR state and allow the
+			 * interrupt to be sampled again.
+			 */
+			vlr.state = 0;
+			vlr.hwirq = 0;
+			vgic_set_lr(vcpu, lr, vlr);
+			vgic_irq_clear_queued(vcpu, vlr.irq);
+			set_bit(lr, elrsr_ptr);
+		}
+
+		if (!test_bit(lr, elrsr_ptr))
+			continue;
+
+		clear_bit(lr, vgic_cpu->lr_used);
 
 		BUG_ON(vlr.irq >= dist->nr_irqs);
 		vgic_cpu->vgic_irq_lr_map[vlr.irq] = LR_EMPTY;
-- 
2.1.4

^ permalink raw reply related	[flat|nested] 74+ messages in thread

* [PATCH v3 07/11] KVM: arm/arm64: vgic: Allow HW interrupts to be queued to a guest
@ 2015-07-24 15:55   ` Marc Zyngier
  0 siblings, 0 replies; 74+ messages in thread
From: Marc Zyngier @ 2015-07-24 15:55 UTC (permalink / raw)
  To: linux-arm-kernel

To allow a HW interrupt to be injected into a guest, we lookup the
guest virtual interrupt in the irq_phys_map list, and if we have
a match, encode both interrupts in the LR.

We also mark the interrupt as "active" at the host distributor level.

On guest EOI on the virtual interrupt, the host interrupt will be
deactivated.

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
---
 virt/kvm/arm/vgic.c | 89 +++++++++++++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 86 insertions(+), 3 deletions(-)

diff --git a/virt/kvm/arm/vgic.c b/virt/kvm/arm/vgic.c
index c74d929..6422b0c 100644
--- a/virt/kvm/arm/vgic.c
+++ b/virt/kvm/arm/vgic.c
@@ -1118,6 +1118,39 @@ static void vgic_queue_irq_to_lr(struct kvm_vcpu *vcpu, int irq,
 	if (!vgic_irq_is_edge(vcpu, irq))
 		vlr.state |= LR_EOI_INT;
 
+	if (vlr.irq >= VGIC_NR_SGIS) {
+		struct irq_phys_map *map;
+		map = vgic_irq_map_search(vcpu, irq);
+
+		/*
+		 * If we have a mapping, and the virtual interrupt is
+		 * being injected, then we must set the state to
+		 * active in the physical world. Otherwise the
+		 * physical interrupt will fire and the guest will
+		 * exit before processing the virtual interrupt.
+		 */
+		if (map) {
+			int ret;
+
+			BUG_ON(!map->active);
+			vlr.hwirq = map->phys_irq;
+			vlr.state |= LR_HW;
+			vlr.state &= ~LR_EOI_INT;
+
+			ret = irq_set_irqchip_state(map->irq,
+						    IRQCHIP_STATE_ACTIVE,
+						    true);
+			WARN_ON(ret);
+
+			/*
+			 * Make sure we're not going to sample this
+			 * again, as a HW-backed interrupt cannot be
+			 * in the PENDING_ACTIVE stage.
+			 */
+			vgic_irq_set_queued(vcpu, irq);
+		}
+	}
+
 	vgic_set_lr(vcpu, lr_nr, vlr);
 	vgic_sync_lr_elrsr(vcpu, lr_nr, vlr);
 }
@@ -1342,6 +1375,39 @@ static bool vgic_process_maintenance(struct kvm_vcpu *vcpu)
 	return level_pending;
 }
 
+/*
+ * Save the physical active state, and reset it to inactive.
+ *
+ * Return 1 if HW interrupt went from active to inactive, and 0 otherwise.
+ */
+static int vgic_sync_hwirq(struct kvm_vcpu *vcpu, struct vgic_lr vlr)
+{
+	struct irq_phys_map *map;
+	int ret;
+
+	if (!(vlr.state & LR_HW))
+		return 0;
+
+	map = vgic_irq_map_search(vcpu, vlr.irq);
+	BUG_ON(!map || !map->active);
+
+	ret = irq_get_irqchip_state(map->irq,
+				    IRQCHIP_STATE_ACTIVE,
+				    &map->active);
+
+	WARN_ON(ret);
+
+	if (map->active) {
+		ret = irq_set_irqchip_state(map->irq,
+					    IRQCHIP_STATE_ACTIVE,
+					    false);
+		WARN_ON(ret);
+		return 0;
+	}
+
+	return 1;
+}
+
 /* Sync back the VGIC state after a guest run */
 static void __kvm_vgic_sync_hwstate(struct kvm_vcpu *vcpu)
 {
@@ -1356,14 +1422,31 @@ static void __kvm_vgic_sync_hwstate(struct kvm_vcpu *vcpu)
 	elrsr = vgic_get_elrsr(vcpu);
 	elrsr_ptr = u64_to_bitmask(&elrsr);
 
-	/* Clear mappings for empty LRs */
-	for_each_set_bit(lr, elrsr_ptr, vgic->nr_lr) {
+	/* Deal with HW interrupts, and clear mappings for empty LRs */
+	for (lr = 0; lr < vgic->nr_lr; lr++) {
 		struct vgic_lr vlr;
 
-		if (!test_and_clear_bit(lr, vgic_cpu->lr_used))
+		if (!test_bit(lr, vgic_cpu->lr_used))
 			continue;
 
 		vlr = vgic_get_lr(vcpu, lr);
+		if (vgic_sync_hwirq(vcpu, vlr)) {
+			/*
+			 * So this is a HW interrupt that the guest
+			 * EOI-ed. Clean the LR state and allow the
+			 * interrupt to be sampled again.
+			 */
+			vlr.state = 0;
+			vlr.hwirq = 0;
+			vgic_set_lr(vcpu, lr, vlr);
+			vgic_irq_clear_queued(vcpu, vlr.irq);
+			set_bit(lr, elrsr_ptr);
+		}
+
+		if (!test_bit(lr, elrsr_ptr))
+			continue;
+
+		clear_bit(lr, vgic_cpu->lr_used);
 
 		BUG_ON(vlr.irq >= dist->nr_irqs);
 		vgic_cpu->vgic_irq_lr_map[vlr.irq] = LR_EMPTY;
-- 
2.1.4

^ permalink raw reply related	[flat|nested] 74+ messages in thread

* [PATCH v3 08/11] KVM: arm/arm64: vgic: Add vgic_{get,set}_phys_irq_active
  2015-07-24 15:54 ` Marc Zyngier
@ 2015-07-24 15:55   ` Marc Zyngier
  -1 siblings, 0 replies; 74+ messages in thread
From: Marc Zyngier @ 2015-07-24 15:55 UTC (permalink / raw)
  To: kvm, kvmarm, linux-arm-kernel
  Cc: Christoffer Dall, Eric Auger, Alex Bennée, Andre Przywara

In order to control the active state of an interrupt, introduce
a pair of accessors allowing the state to be set/queried.

This only affects the logical state, and the HW state will only be
applied at world-switch time.

Acked-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
---
 include/kvm/arm_vgic.h |  2 ++
 virt/kvm/arm/vgic.c    | 24 ++++++++++++++++++++++++
 2 files changed, 26 insertions(+)

diff --git a/include/kvm/arm_vgic.h b/include/kvm/arm_vgic.h
index 68212a1..7306b4b 100644
--- a/include/kvm/arm_vgic.h
+++ b/include/kvm/arm_vgic.h
@@ -357,6 +357,8 @@ int kvm_vgic_vcpu_active_irq(struct kvm_vcpu *vcpu);
 struct irq_phys_map *kvm_vgic_map_phys_irq(struct kvm_vcpu *vcpu,
 					   int virt_irq, int irq);
 int kvm_vgic_unmap_phys_irq(struct kvm_vcpu *vcpu, struct irq_phys_map *map);
+bool kvm_vgic_get_phys_irq_active(struct irq_phys_map *map);
+void kvm_vgic_set_phys_irq_active(struct irq_phys_map *map, bool active);
 
 #define irqchip_in_kernel(k)	(!!((k)->arch.vgic.in_kernel))
 #define vgic_initialized(k)	(!!((k)->arch.vgic.nr_cpus))
diff --git a/virt/kvm/arm/vgic.c b/virt/kvm/arm/vgic.c
index 6422b0c..3f7b690 100644
--- a/virt/kvm/arm/vgic.c
+++ b/virt/kvm/arm/vgic.c
@@ -1778,6 +1778,30 @@ static void vgic_free_phys_irq_map_rcu(struct rcu_head *rcu)
 }
 
 /**
+ * kvm_vgic_get_phys_irq_active - Return the active state of a mapped IRQ
+ *
+ * Return the logical active state of a mapped interrupt. This doesn't
+ * necessarily reflects the current HW state.
+ */
+bool kvm_vgic_get_phys_irq_active(struct irq_phys_map *map)
+{
+	BUG_ON(!map);
+	return map->active;
+}
+
+/**
+ * kvm_vgic_set_phys_irq_active - Set the active state of a mapped IRQ
+ *
+ * Set the logical active state of a mapped interrupt. This doesn't
+ * immediately affects the HW state.
+ */
+void kvm_vgic_set_phys_irq_active(struct irq_phys_map *map, bool active)
+{
+	BUG_ON(!map);
+	map->active = active;
+}
+
+/**
  * kvm_vgic_unmap_phys_irq - Remove a virtual to physical IRQ mapping
  * @vcpu: The VCPU pointer
  * @map: The pointer to a mapping obtained through kvm_vgic_map_phys_irq
-- 
2.1.4


^ permalink raw reply related	[flat|nested] 74+ messages in thread

* [PATCH v3 08/11] KVM: arm/arm64: vgic: Add vgic_{get, set}_phys_irq_active
@ 2015-07-24 15:55   ` Marc Zyngier
  0 siblings, 0 replies; 74+ messages in thread
From: Marc Zyngier @ 2015-07-24 15:55 UTC (permalink / raw)
  To: linux-arm-kernel

In order to control the active state of an interrupt, introduce
a pair of accessors allowing the state to be set/queried.

This only affects the logical state, and the HW state will only be
applied at world-switch time.

Acked-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
---
 include/kvm/arm_vgic.h |  2 ++
 virt/kvm/arm/vgic.c    | 24 ++++++++++++++++++++++++
 2 files changed, 26 insertions(+)

diff --git a/include/kvm/arm_vgic.h b/include/kvm/arm_vgic.h
index 68212a1..7306b4b 100644
--- a/include/kvm/arm_vgic.h
+++ b/include/kvm/arm_vgic.h
@@ -357,6 +357,8 @@ int kvm_vgic_vcpu_active_irq(struct kvm_vcpu *vcpu);
 struct irq_phys_map *kvm_vgic_map_phys_irq(struct kvm_vcpu *vcpu,
 					   int virt_irq, int irq);
 int kvm_vgic_unmap_phys_irq(struct kvm_vcpu *vcpu, struct irq_phys_map *map);
+bool kvm_vgic_get_phys_irq_active(struct irq_phys_map *map);
+void kvm_vgic_set_phys_irq_active(struct irq_phys_map *map, bool active);
 
 #define irqchip_in_kernel(k)	(!!((k)->arch.vgic.in_kernel))
 #define vgic_initialized(k)	(!!((k)->arch.vgic.nr_cpus))
diff --git a/virt/kvm/arm/vgic.c b/virt/kvm/arm/vgic.c
index 6422b0c..3f7b690 100644
--- a/virt/kvm/arm/vgic.c
+++ b/virt/kvm/arm/vgic.c
@@ -1778,6 +1778,30 @@ static void vgic_free_phys_irq_map_rcu(struct rcu_head *rcu)
 }
 
 /**
+ * kvm_vgic_get_phys_irq_active - Return the active state of a mapped IRQ
+ *
+ * Return the logical active state of a mapped interrupt. This doesn't
+ * necessarily reflects the current HW state.
+ */
+bool kvm_vgic_get_phys_irq_active(struct irq_phys_map *map)
+{
+	BUG_ON(!map);
+	return map->active;
+}
+
+/**
+ * kvm_vgic_set_phys_irq_active - Set the active state of a mapped IRQ
+ *
+ * Set the logical active state of a mapped interrupt. This doesn't
+ * immediately affects the HW state.
+ */
+void kvm_vgic_set_phys_irq_active(struct irq_phys_map *map, bool active)
+{
+	BUG_ON(!map);
+	map->active = active;
+}
+
+/**
  * kvm_vgic_unmap_phys_irq - Remove a virtual to physical IRQ mapping
  * @vcpu: The VCPU pointer
  * @map: The pointer to a mapping obtained through kvm_vgic_map_phys_irq
-- 
2.1.4

^ permalink raw reply related	[flat|nested] 74+ messages in thread

* [PATCH v3 09/11] KVM: arm/arm64: vgic: Prevent userspace injection of a mapped interrupt
  2015-07-24 15:54 ` Marc Zyngier
@ 2015-07-24 15:55   ` Marc Zyngier
  -1 siblings, 0 replies; 74+ messages in thread
From: Marc Zyngier @ 2015-07-24 15:55 UTC (permalink / raw)
  To: kvm, kvmarm, linux-arm-kernel; +Cc: Andre Przywara

Virtual interrupts mapped to a HW interrupt should only be triggered
from inside the kernel. Otherwise, you could end up confusing the
kernel (and the GIC's) state machine.

Rearrange the injection path so that kvm_vgic_inject_irq is
used for non-mapped interrupts, and kvm_vgic_inject_mapped_irq is
used for mapped interrupts. The latter should only be called from
inside the kernel (timer, VFIO).

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
---
 include/kvm/arm_vgic.h |  2 +
 virt/kvm/arm/vgic.c    | 99 ++++++++++++++++++++++++++++++++++----------------
 2 files changed, 70 insertions(+), 31 deletions(-)

diff --git a/include/kvm/arm_vgic.h b/include/kvm/arm_vgic.h
index 7306b4b..f6bfd79 100644
--- a/include/kvm/arm_vgic.h
+++ b/include/kvm/arm_vgic.h
@@ -351,6 +351,8 @@ void kvm_vgic_flush_hwstate(struct kvm_vcpu *vcpu);
 void kvm_vgic_sync_hwstate(struct kvm_vcpu *vcpu);
 int kvm_vgic_inject_irq(struct kvm *kvm, int cpuid, unsigned int irq_num,
 			bool level);
+int kvm_vgic_inject_mapped_irq(struct kvm *kvm, int cpuid,
+			       struct irq_phys_map *map, bool level);
 void vgic_v3_dispatch_sgi(struct kvm_vcpu *vcpu, u64 reg);
 int kvm_vgic_vcpu_pending_irq(struct kvm_vcpu *vcpu);
 int kvm_vgic_vcpu_active_irq(struct kvm_vcpu *vcpu);
diff --git a/virt/kvm/arm/vgic.c b/virt/kvm/arm/vgic.c
index 3f7b690..e40ef70 100644
--- a/virt/kvm/arm/vgic.c
+++ b/virt/kvm/arm/vgic.c
@@ -1533,7 +1533,8 @@ static int vgic_validate_injection(struct kvm_vcpu *vcpu, int irq, int level)
 }
 
 static int vgic_update_irq_pending(struct kvm *kvm, int cpuid,
-				  unsigned int irq_num, bool level)
+				   struct irq_phys_map *map,
+				   unsigned int irq_num, bool level)
 {
 	struct vgic_dist *dist = &kvm->arch.vgic;
 	struct kvm_vcpu *vcpu;
@@ -1541,6 +1542,9 @@ static int vgic_update_irq_pending(struct kvm *kvm, int cpuid,
 	int enabled;
 	bool ret = true, can_inject = true;
 
+	if (irq_num >= min(kvm->arch.vgic.nr_irqs, 1020))
+		return -EINVAL;
+
 	spin_lock(&dist->lock);
 
 	vcpu = kvm_get_vcpu(kvm, cpuid);
@@ -1603,14 +1607,42 @@ static int vgic_update_irq_pending(struct kvm *kvm, int cpuid,
 out:
 	spin_unlock(&dist->lock);
 
-	return ret ? cpuid : -EINVAL;
+	if (!ret) {
+		/* kick the specified vcpu */
+		kvm_vcpu_kick(kvm_get_vcpu(kvm, cpuid));
+	}
+
+	return 0;
+}
+
+static int vgic_lazy_init(struct kvm *kvm)
+{
+	int ret = 0;
+
+	if (unlikely(!vgic_initialized(kvm))) {
+		/*
+		 * We only provide the automatic initialization of the VGIC
+		 * for the legacy case of a GICv2. Any other type must
+		 * be explicitly initialized once setup with the respective
+		 * KVM device call.
+		 */
+		if (kvm->arch.vgic.vgic_model != KVM_DEV_TYPE_ARM_VGIC_V2)
+			return -EBUSY;
+
+		mutex_lock(&kvm->lock);
+		ret = vgic_init(kvm);
+		mutex_unlock(&kvm->lock);
+	}
+
+	return ret;
 }
 
 /**
  * kvm_vgic_inject_irq - Inject an IRQ from a device to the vgic
  * @kvm:     The VM structure pointer
  * @cpuid:   The CPU for PPIs
- * @irq_num: The IRQ number that is assigned to the device
+ * @irq_num: The IRQ number that is assigned to the device. This IRQ
+ *           must not be mapped to a HW interrupt.
  * @level:   Edge-triggered:  true:  to trigger the interrupt
  *			      false: to ignore the call
  *	     Level-sensitive  true:  activates an interrupt
@@ -1623,39 +1655,44 @@ out:
 int kvm_vgic_inject_irq(struct kvm *kvm, int cpuid, unsigned int irq_num,
 			bool level)
 {
-	int ret = 0;
-	int vcpu_id;
-
-	if (unlikely(!vgic_initialized(kvm))) {
-		/*
-		 * We only provide the automatic initialization of the VGIC
-		 * for the legacy case of a GICv2. Any other type must
-		 * be explicitly initialized once setup with the respective
-		 * KVM device call.
-		 */
-		if (kvm->arch.vgic.vgic_model != KVM_DEV_TYPE_ARM_VGIC_V2) {
-			ret = -EBUSY;
-			goto out;
-		}
-		mutex_lock(&kvm->lock);
-		ret = vgic_init(kvm);
-		mutex_unlock(&kvm->lock);
+	struct irq_phys_map *map;
+	int ret;
 
-		if (ret)
-			goto out;
-	}
+	ret = vgic_lazy_init(kvm);
+	if (ret)
+		return ret;
 
-	if (irq_num >= min(kvm->arch.vgic.nr_irqs, 1020))
+	map = vgic_irq_map_search(kvm_get_vcpu(kvm, cpuid), irq_num);
+	if (map)
 		return -EINVAL;
 
-	vcpu_id = vgic_update_irq_pending(kvm, cpuid, irq_num, level);
-	if (vcpu_id >= 0) {
-		/* kick the specified vcpu */
-		kvm_vcpu_kick(kvm_get_vcpu(kvm, vcpu_id));
-	}
+	return vgic_update_irq_pending(kvm, cpuid, NULL, irq_num, level);
+}
 
-out:
-	return ret;
+/**
+ * kvm_vgic_inject_mapped_irq - Inject a physically mapped IRQ to the vgic
+ * @kvm:     The VM structure pointer
+ * @cpuid:   The CPU for PPIs
+ * @map:     Pointer to a irq_phys_map structure describing the mapping
+ * @level:   Edge-triggered:  true:  to trigger the interrupt
+ *			      false: to ignore the call
+ *	     Level-sensitive  true:  activates an interrupt
+ *			      false: deactivates an interrupt
+ *
+ * The GIC is not concerned with devices being active-LOW or active-HIGH for
+ * level-sensitive interrupts.  You can think of the level parameter as 1
+ * being HIGH and 0 being LOW and all devices being active-HIGH.
+ */
+int kvm_vgic_inject_mapped_irq(struct kvm *kvm, int cpuid,
+			       struct irq_phys_map *map, bool level)
+{
+	int ret;
+
+	ret = vgic_lazy_init(kvm);
+	if (ret)
+		return ret;
+
+	return vgic_update_irq_pending(kvm, cpuid, map, map->virt_irq, level);
 }
 
 static irqreturn_t vgic_maintenance_handler(int irq, void *data)
-- 
2.1.4

^ permalink raw reply related	[flat|nested] 74+ messages in thread

* [PATCH v3 09/11] KVM: arm/arm64: vgic: Prevent userspace injection of a mapped interrupt
@ 2015-07-24 15:55   ` Marc Zyngier
  0 siblings, 0 replies; 74+ messages in thread
From: Marc Zyngier @ 2015-07-24 15:55 UTC (permalink / raw)
  To: linux-arm-kernel

Virtual interrupts mapped to a HW interrupt should only be triggered
from inside the kernel. Otherwise, you could end up confusing the
kernel (and the GIC's) state machine.

Rearrange the injection path so that kvm_vgic_inject_irq is
used for non-mapped interrupts, and kvm_vgic_inject_mapped_irq is
used for mapped interrupts. The latter should only be called from
inside the kernel (timer, VFIO).

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
---
 include/kvm/arm_vgic.h |  2 +
 virt/kvm/arm/vgic.c    | 99 ++++++++++++++++++++++++++++++++++----------------
 2 files changed, 70 insertions(+), 31 deletions(-)

diff --git a/include/kvm/arm_vgic.h b/include/kvm/arm_vgic.h
index 7306b4b..f6bfd79 100644
--- a/include/kvm/arm_vgic.h
+++ b/include/kvm/arm_vgic.h
@@ -351,6 +351,8 @@ void kvm_vgic_flush_hwstate(struct kvm_vcpu *vcpu);
 void kvm_vgic_sync_hwstate(struct kvm_vcpu *vcpu);
 int kvm_vgic_inject_irq(struct kvm *kvm, int cpuid, unsigned int irq_num,
 			bool level);
+int kvm_vgic_inject_mapped_irq(struct kvm *kvm, int cpuid,
+			       struct irq_phys_map *map, bool level);
 void vgic_v3_dispatch_sgi(struct kvm_vcpu *vcpu, u64 reg);
 int kvm_vgic_vcpu_pending_irq(struct kvm_vcpu *vcpu);
 int kvm_vgic_vcpu_active_irq(struct kvm_vcpu *vcpu);
diff --git a/virt/kvm/arm/vgic.c b/virt/kvm/arm/vgic.c
index 3f7b690..e40ef70 100644
--- a/virt/kvm/arm/vgic.c
+++ b/virt/kvm/arm/vgic.c
@@ -1533,7 +1533,8 @@ static int vgic_validate_injection(struct kvm_vcpu *vcpu, int irq, int level)
 }
 
 static int vgic_update_irq_pending(struct kvm *kvm, int cpuid,
-				  unsigned int irq_num, bool level)
+				   struct irq_phys_map *map,
+				   unsigned int irq_num, bool level)
 {
 	struct vgic_dist *dist = &kvm->arch.vgic;
 	struct kvm_vcpu *vcpu;
@@ -1541,6 +1542,9 @@ static int vgic_update_irq_pending(struct kvm *kvm, int cpuid,
 	int enabled;
 	bool ret = true, can_inject = true;
 
+	if (irq_num >= min(kvm->arch.vgic.nr_irqs, 1020))
+		return -EINVAL;
+
 	spin_lock(&dist->lock);
 
 	vcpu = kvm_get_vcpu(kvm, cpuid);
@@ -1603,14 +1607,42 @@ static int vgic_update_irq_pending(struct kvm *kvm, int cpuid,
 out:
 	spin_unlock(&dist->lock);
 
-	return ret ? cpuid : -EINVAL;
+	if (!ret) {
+		/* kick the specified vcpu */
+		kvm_vcpu_kick(kvm_get_vcpu(kvm, cpuid));
+	}
+
+	return 0;
+}
+
+static int vgic_lazy_init(struct kvm *kvm)
+{
+	int ret = 0;
+
+	if (unlikely(!vgic_initialized(kvm))) {
+		/*
+		 * We only provide the automatic initialization of the VGIC
+		 * for the legacy case of a GICv2. Any other type must
+		 * be explicitly initialized once setup with the respective
+		 * KVM device call.
+		 */
+		if (kvm->arch.vgic.vgic_model != KVM_DEV_TYPE_ARM_VGIC_V2)
+			return -EBUSY;
+
+		mutex_lock(&kvm->lock);
+		ret = vgic_init(kvm);
+		mutex_unlock(&kvm->lock);
+	}
+
+	return ret;
 }
 
 /**
  * kvm_vgic_inject_irq - Inject an IRQ from a device to the vgic
  * @kvm:     The VM structure pointer
  * @cpuid:   The CPU for PPIs
- * @irq_num: The IRQ number that is assigned to the device
+ * @irq_num: The IRQ number that is assigned to the device. This IRQ
+ *           must not be mapped to a HW interrupt.
  * @level:   Edge-triggered:  true:  to trigger the interrupt
  *			      false: to ignore the call
  *	     Level-sensitive  true:  activates an interrupt
@@ -1623,39 +1655,44 @@ out:
 int kvm_vgic_inject_irq(struct kvm *kvm, int cpuid, unsigned int irq_num,
 			bool level)
 {
-	int ret = 0;
-	int vcpu_id;
-
-	if (unlikely(!vgic_initialized(kvm))) {
-		/*
-		 * We only provide the automatic initialization of the VGIC
-		 * for the legacy case of a GICv2. Any other type must
-		 * be explicitly initialized once setup with the respective
-		 * KVM device call.
-		 */
-		if (kvm->arch.vgic.vgic_model != KVM_DEV_TYPE_ARM_VGIC_V2) {
-			ret = -EBUSY;
-			goto out;
-		}
-		mutex_lock(&kvm->lock);
-		ret = vgic_init(kvm);
-		mutex_unlock(&kvm->lock);
+	struct irq_phys_map *map;
+	int ret;
 
-		if (ret)
-			goto out;
-	}
+	ret = vgic_lazy_init(kvm);
+	if (ret)
+		return ret;
 
-	if (irq_num >= min(kvm->arch.vgic.nr_irqs, 1020))
+	map = vgic_irq_map_search(kvm_get_vcpu(kvm, cpuid), irq_num);
+	if (map)
 		return -EINVAL;
 
-	vcpu_id = vgic_update_irq_pending(kvm, cpuid, irq_num, level);
-	if (vcpu_id >= 0) {
-		/* kick the specified vcpu */
-		kvm_vcpu_kick(kvm_get_vcpu(kvm, vcpu_id));
-	}
+	return vgic_update_irq_pending(kvm, cpuid, NULL, irq_num, level);
+}
 
-out:
-	return ret;
+/**
+ * kvm_vgic_inject_mapped_irq - Inject a physically mapped IRQ to the vgic
+ * @kvm:     The VM structure pointer
+ * @cpuid:   The CPU for PPIs
+ * @map:     Pointer to a irq_phys_map structure describing the mapping
+ * @level:   Edge-triggered:  true:  to trigger the interrupt
+ *			      false: to ignore the call
+ *	     Level-sensitive  true:  activates an interrupt
+ *			      false: deactivates an interrupt
+ *
+ * The GIC is not concerned with devices being active-LOW or active-HIGH for
+ * level-sensitive interrupts.  You can think of the level parameter as 1
+ * being HIGH and 0 being LOW and all devices being active-HIGH.
+ */
+int kvm_vgic_inject_mapped_irq(struct kvm *kvm, int cpuid,
+			       struct irq_phys_map *map, bool level)
+{
+	int ret;
+
+	ret = vgic_lazy_init(kvm);
+	if (ret)
+		return ret;
+
+	return vgic_update_irq_pending(kvm, cpuid, map, map->virt_irq, level);
 }
 
 static irqreturn_t vgic_maintenance_handler(int irq, void *data)
-- 
2.1.4

^ permalink raw reply related	[flat|nested] 74+ messages in thread

* [PATCH v3 10/11] KVM: arm/arm64: timer: Allow the timer to control the active state
  2015-07-24 15:54 ` Marc Zyngier
@ 2015-07-24 15:55   ` Marc Zyngier
  -1 siblings, 0 replies; 74+ messages in thread
From: Marc Zyngier @ 2015-07-24 15:55 UTC (permalink / raw)
  To: kvm, kvmarm, linux-arm-kernel
  Cc: Christoffer Dall, Eric Auger, Alex Bennée, Andre Przywara

In order to remove the crude hack where we sneak the masked bit
into the timer's control register, make use of the phys_irq_map
API control the active state of the interrupt.

This causes some limited changes to allow for potential error
propagation.

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
---
 arch/arm/kvm/reset.c         |  4 +---
 arch/arm64/kvm/reset.c       |  4 +---
 include/kvm/arm_arch_timer.h |  7 +++++--
 virt/kvm/arm/arch_timer.c    | 29 ++++++++++++++++++++++-------
 4 files changed, 29 insertions(+), 15 deletions(-)

diff --git a/arch/arm/kvm/reset.c b/arch/arm/kvm/reset.c
index f558c07..eeb85858 100644
--- a/arch/arm/kvm/reset.c
+++ b/arch/arm/kvm/reset.c
@@ -77,7 +77,5 @@ int kvm_reset_vcpu(struct kvm_vcpu *vcpu)
 	kvm_reset_coprocs(vcpu);
 
 	/* Reset arch_timer context */
-	kvm_timer_vcpu_reset(vcpu, cpu_vtimer_irq);
-
-	return 0;
+	return kvm_timer_vcpu_reset(vcpu, cpu_vtimer_irq);
 }
diff --git a/arch/arm64/kvm/reset.c b/arch/arm64/kvm/reset.c
index b4af618..91cf535 100644
--- a/arch/arm64/kvm/reset.c
+++ b/arch/arm64/kvm/reset.c
@@ -121,7 +121,5 @@ int kvm_reset_vcpu(struct kvm_vcpu *vcpu)
 	kvm_reset_sys_regs(vcpu);
 
 	/* Reset timer */
-	kvm_timer_vcpu_reset(vcpu, cpu_vtimer_irq);
-
-	return 0;
+	return kvm_timer_vcpu_reset(vcpu, cpu_vtimer_irq);
 }
diff --git a/include/kvm/arm_arch_timer.h b/include/kvm/arm_arch_timer.h
index e596675..e1e4d7c 100644
--- a/include/kvm/arm_arch_timer.h
+++ b/include/kvm/arm_arch_timer.h
@@ -52,13 +52,16 @@ struct arch_timer_cpu {
 
 	/* Timer IRQ */
 	const struct kvm_irq_level	*irq;
+
+	/* VGIC mapping */
+	struct irq_phys_map		*map;
 };
 
 int kvm_timer_hyp_init(void);
 void kvm_timer_enable(struct kvm *kvm);
 void kvm_timer_init(struct kvm *kvm);
-void kvm_timer_vcpu_reset(struct kvm_vcpu *vcpu,
-			  const struct kvm_irq_level *irq);
+int kvm_timer_vcpu_reset(struct kvm_vcpu *vcpu,
+			 const struct kvm_irq_level *irq);
 void kvm_timer_vcpu_init(struct kvm_vcpu *vcpu);
 void kvm_timer_flush_hwstate(struct kvm_vcpu *vcpu);
 void kvm_timer_sync_hwstate(struct kvm_vcpu *vcpu);
diff --git a/virt/kvm/arm/arch_timer.c b/virt/kvm/arm/arch_timer.c
index 98c95f2..76e38d2 100644
--- a/virt/kvm/arm/arch_timer.c
+++ b/virt/kvm/arm/arch_timer.c
@@ -64,10 +64,10 @@ static void kvm_timer_inject_irq(struct kvm_vcpu *vcpu)
 	int ret;
 	struct arch_timer_cpu *timer = &vcpu->arch.timer_cpu;
 
-	timer->cntv_ctl |= ARCH_TIMER_CTRL_IT_MASK;
-	ret = kvm_vgic_inject_irq(vcpu->kvm, vcpu->vcpu_id,
-				  timer->irq->irq,
-				  timer->irq->level);
+	kvm_vgic_set_phys_irq_active(timer->map, true);
+	ret = kvm_vgic_inject_mapped_irq(vcpu->kvm, vcpu->vcpu_id,
+					 timer->map,
+					 timer->irq->level);
 	WARN_ON(ret);
 }
 
@@ -117,7 +117,8 @@ bool kvm_timer_should_fire(struct kvm_vcpu *vcpu)
 	cycle_t cval, now;
 
 	if ((timer->cntv_ctl & ARCH_TIMER_CTRL_IT_MASK) ||
-		!(timer->cntv_ctl & ARCH_TIMER_CTRL_ENABLE))
+	    !(timer->cntv_ctl & ARCH_TIMER_CTRL_ENABLE) ||
+	    kvm_vgic_get_phys_irq_active(timer->map))
 		return false;
 
 	cval = timer->cntv_cval;
@@ -184,10 +185,11 @@ void kvm_timer_sync_hwstate(struct kvm_vcpu *vcpu)
 	timer_arm(timer, ns);
 }
 
-void kvm_timer_vcpu_reset(struct kvm_vcpu *vcpu,
-			  const struct kvm_irq_level *irq)
+int kvm_timer_vcpu_reset(struct kvm_vcpu *vcpu,
+			 const struct kvm_irq_level *irq)
 {
 	struct arch_timer_cpu *timer = &vcpu->arch.timer_cpu;
+	struct irq_phys_map *map;
 
 	/*
 	 * The vcpu timer irq number cannot be determined in
@@ -196,6 +198,17 @@ void kvm_timer_vcpu_reset(struct kvm_vcpu *vcpu,
 	 * vcpu timer irq number when the vcpu is reset.
 	 */
 	timer->irq = irq;
+
+	/*
+	 * Tell the VGIC that the virtual interrupt is tied to a
+	 * physical interrupt. We do that once per VCPU.
+	 */
+	map = kvm_vgic_map_phys_irq(vcpu, irq->irq, host_vtimer_irq);
+	if (WARN_ON(IS_ERR(map)))
+		return PTR_ERR(map);
+
+	timer->map = map;
+	return 0;
 }
 
 void kvm_timer_vcpu_init(struct kvm_vcpu *vcpu)
@@ -335,6 +348,8 @@ void kvm_timer_vcpu_terminate(struct kvm_vcpu *vcpu)
 	struct arch_timer_cpu *timer = &vcpu->arch.timer_cpu;
 
 	timer_disarm(timer);
+	if (timer->map)
+		kvm_vgic_unmap_phys_irq(vcpu, timer->map);
 }
 
 void kvm_timer_enable(struct kvm *kvm)
-- 
2.1.4


^ permalink raw reply related	[flat|nested] 74+ messages in thread

* [PATCH v3 10/11] KVM: arm/arm64: timer: Allow the timer to control the active state
@ 2015-07-24 15:55   ` Marc Zyngier
  0 siblings, 0 replies; 74+ messages in thread
From: Marc Zyngier @ 2015-07-24 15:55 UTC (permalink / raw)
  To: linux-arm-kernel

In order to remove the crude hack where we sneak the masked bit
into the timer's control register, make use of the phys_irq_map
API control the active state of the interrupt.

This causes some limited changes to allow for potential error
propagation.

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
---
 arch/arm/kvm/reset.c         |  4 +---
 arch/arm64/kvm/reset.c       |  4 +---
 include/kvm/arm_arch_timer.h |  7 +++++--
 virt/kvm/arm/arch_timer.c    | 29 ++++++++++++++++++++++-------
 4 files changed, 29 insertions(+), 15 deletions(-)

diff --git a/arch/arm/kvm/reset.c b/arch/arm/kvm/reset.c
index f558c07..eeb85858 100644
--- a/arch/arm/kvm/reset.c
+++ b/arch/arm/kvm/reset.c
@@ -77,7 +77,5 @@ int kvm_reset_vcpu(struct kvm_vcpu *vcpu)
 	kvm_reset_coprocs(vcpu);
 
 	/* Reset arch_timer context */
-	kvm_timer_vcpu_reset(vcpu, cpu_vtimer_irq);
-
-	return 0;
+	return kvm_timer_vcpu_reset(vcpu, cpu_vtimer_irq);
 }
diff --git a/arch/arm64/kvm/reset.c b/arch/arm64/kvm/reset.c
index b4af618..91cf535 100644
--- a/arch/arm64/kvm/reset.c
+++ b/arch/arm64/kvm/reset.c
@@ -121,7 +121,5 @@ int kvm_reset_vcpu(struct kvm_vcpu *vcpu)
 	kvm_reset_sys_regs(vcpu);
 
 	/* Reset timer */
-	kvm_timer_vcpu_reset(vcpu, cpu_vtimer_irq);
-
-	return 0;
+	return kvm_timer_vcpu_reset(vcpu, cpu_vtimer_irq);
 }
diff --git a/include/kvm/arm_arch_timer.h b/include/kvm/arm_arch_timer.h
index e596675..e1e4d7c 100644
--- a/include/kvm/arm_arch_timer.h
+++ b/include/kvm/arm_arch_timer.h
@@ -52,13 +52,16 @@ struct arch_timer_cpu {
 
 	/* Timer IRQ */
 	const struct kvm_irq_level	*irq;
+
+	/* VGIC mapping */
+	struct irq_phys_map		*map;
 };
 
 int kvm_timer_hyp_init(void);
 void kvm_timer_enable(struct kvm *kvm);
 void kvm_timer_init(struct kvm *kvm);
-void kvm_timer_vcpu_reset(struct kvm_vcpu *vcpu,
-			  const struct kvm_irq_level *irq);
+int kvm_timer_vcpu_reset(struct kvm_vcpu *vcpu,
+			 const struct kvm_irq_level *irq);
 void kvm_timer_vcpu_init(struct kvm_vcpu *vcpu);
 void kvm_timer_flush_hwstate(struct kvm_vcpu *vcpu);
 void kvm_timer_sync_hwstate(struct kvm_vcpu *vcpu);
diff --git a/virt/kvm/arm/arch_timer.c b/virt/kvm/arm/arch_timer.c
index 98c95f2..76e38d2 100644
--- a/virt/kvm/arm/arch_timer.c
+++ b/virt/kvm/arm/arch_timer.c
@@ -64,10 +64,10 @@ static void kvm_timer_inject_irq(struct kvm_vcpu *vcpu)
 	int ret;
 	struct arch_timer_cpu *timer = &vcpu->arch.timer_cpu;
 
-	timer->cntv_ctl |= ARCH_TIMER_CTRL_IT_MASK;
-	ret = kvm_vgic_inject_irq(vcpu->kvm, vcpu->vcpu_id,
-				  timer->irq->irq,
-				  timer->irq->level);
+	kvm_vgic_set_phys_irq_active(timer->map, true);
+	ret = kvm_vgic_inject_mapped_irq(vcpu->kvm, vcpu->vcpu_id,
+					 timer->map,
+					 timer->irq->level);
 	WARN_ON(ret);
 }
 
@@ -117,7 +117,8 @@ bool kvm_timer_should_fire(struct kvm_vcpu *vcpu)
 	cycle_t cval, now;
 
 	if ((timer->cntv_ctl & ARCH_TIMER_CTRL_IT_MASK) ||
-		!(timer->cntv_ctl & ARCH_TIMER_CTRL_ENABLE))
+	    !(timer->cntv_ctl & ARCH_TIMER_CTRL_ENABLE) ||
+	    kvm_vgic_get_phys_irq_active(timer->map))
 		return false;
 
 	cval = timer->cntv_cval;
@@ -184,10 +185,11 @@ void kvm_timer_sync_hwstate(struct kvm_vcpu *vcpu)
 	timer_arm(timer, ns);
 }
 
-void kvm_timer_vcpu_reset(struct kvm_vcpu *vcpu,
-			  const struct kvm_irq_level *irq)
+int kvm_timer_vcpu_reset(struct kvm_vcpu *vcpu,
+			 const struct kvm_irq_level *irq)
 {
 	struct arch_timer_cpu *timer = &vcpu->arch.timer_cpu;
+	struct irq_phys_map *map;
 
 	/*
 	 * The vcpu timer irq number cannot be determined in
@@ -196,6 +198,17 @@ void kvm_timer_vcpu_reset(struct kvm_vcpu *vcpu,
 	 * vcpu timer irq number when the vcpu is reset.
 	 */
 	timer->irq = irq;
+
+	/*
+	 * Tell the VGIC that the virtual interrupt is tied to a
+	 * physical interrupt. We do that once per VCPU.
+	 */
+	map = kvm_vgic_map_phys_irq(vcpu, irq->irq, host_vtimer_irq);
+	if (WARN_ON(IS_ERR(map)))
+		return PTR_ERR(map);
+
+	timer->map = map;
+	return 0;
 }
 
 void kvm_timer_vcpu_init(struct kvm_vcpu *vcpu)
@@ -335,6 +348,8 @@ void kvm_timer_vcpu_terminate(struct kvm_vcpu *vcpu)
 	struct arch_timer_cpu *timer = &vcpu->arch.timer_cpu;
 
 	timer_disarm(timer);
+	if (timer->map)
+		kvm_vgic_unmap_phys_irq(vcpu, timer->map);
 }
 
 void kvm_timer_enable(struct kvm *kvm)
-- 
2.1.4

^ permalink raw reply related	[flat|nested] 74+ messages in thread

* [PATCH v3 11/11] KVM: arm/arm64: vgic: Allow HW interrupts for non-shared devices
  2015-07-24 15:54 ` Marc Zyngier
@ 2015-07-24 15:55   ` Marc Zyngier
  -1 siblings, 0 replies; 74+ messages in thread
From: Marc Zyngier @ 2015-07-24 15:55 UTC (permalink / raw)
  To: kvm, kvmarm, linux-arm-kernel
  Cc: Christoffer Dall, Eric Auger, Alex Bennée, Andre Przywara

So far, the only use of the HW interrupt facility is the timer,
implying that the active state is context-switched for each vcpu,
as the device is is shared across all vcpus.

This does not work for a device that has been assigned to a VM,
as the guest is entierely in control of that device (the HW is
not shared). In that case, it makes sense to bypass the whole
active state switching, and only track the deactivation of the
interrupt.

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
---
 include/kvm/arm_vgic.h    |  6 ++--
 virt/kvm/arm/arch_timer.c |  3 +-
 virt/kvm/arm/vgic.c       | 73 +++++++++++++++++++++++++++++++++++++----------
 3 files changed, 64 insertions(+), 18 deletions(-)

diff --git a/include/kvm/arm_vgic.h b/include/kvm/arm_vgic.h
index f6bfd79..6f0a4e1 100644
--- a/include/kvm/arm_vgic.h
+++ b/include/kvm/arm_vgic.h
@@ -164,7 +164,8 @@ struct irq_phys_map {
 	u32			phys_irq;
 	u32			irq;
 	bool			deleted;
-	bool			active;
+	bool			shared;
+	bool			active; /* Only valid if shared */
 };
 
 struct irq_phys_map_entry {
@@ -357,7 +358,8 @@ void vgic_v3_dispatch_sgi(struct kvm_vcpu *vcpu, u64 reg);
 int kvm_vgic_vcpu_pending_irq(struct kvm_vcpu *vcpu);
 int kvm_vgic_vcpu_active_irq(struct kvm_vcpu *vcpu);
 struct irq_phys_map *kvm_vgic_map_phys_irq(struct kvm_vcpu *vcpu,
-					   int virt_irq, int irq);
+					   int virt_irq, int irq,
+					   bool shared);
 int kvm_vgic_unmap_phys_irq(struct kvm_vcpu *vcpu, struct irq_phys_map *map);
 bool kvm_vgic_get_phys_irq_active(struct irq_phys_map *map);
 void kvm_vgic_set_phys_irq_active(struct irq_phys_map *map, bool active);
diff --git a/virt/kvm/arm/arch_timer.c b/virt/kvm/arm/arch_timer.c
index 76e38d2..db21d8f 100644
--- a/virt/kvm/arm/arch_timer.c
+++ b/virt/kvm/arm/arch_timer.c
@@ -203,7 +203,8 @@ int kvm_timer_vcpu_reset(struct kvm_vcpu *vcpu,
 	 * Tell the VGIC that the virtual interrupt is tied to a
 	 * physical interrupt. We do that once per VCPU.
 	 */
-	map = kvm_vgic_map_phys_irq(vcpu, irq->irq, host_vtimer_irq);
+	map = kvm_vgic_map_phys_irq(vcpu, irq->irq,
+				    host_vtimer_irq, true);
 	if (WARN_ON(IS_ERR(map)))
 		return PTR_ERR(map);
 
diff --git a/virt/kvm/arm/vgic.c b/virt/kvm/arm/vgic.c
index e40ef70..5e6b816 100644
--- a/virt/kvm/arm/vgic.c
+++ b/virt/kvm/arm/vgic.c
@@ -1128,19 +1128,25 @@ static void vgic_queue_irq_to_lr(struct kvm_vcpu *vcpu, int irq,
 		 * active in the physical world. Otherwise the
 		 * physical interrupt will fire and the guest will
 		 * exit before processing the virtual interrupt.
+		 *
+		 * This is of course only valid for a shared
+		 * interrupt. A non shared interrupt should already be
+		 * active.
 		 */
 		if (map) {
-			int ret;
-
-			BUG_ON(!map->active);
 			vlr.hwirq = map->phys_irq;
 			vlr.state |= LR_HW;
 			vlr.state &= ~LR_EOI_INT;
 
-			ret = irq_set_irqchip_state(map->irq,
-						    IRQCHIP_STATE_ACTIVE,
-						    true);
-			WARN_ON(ret);
+			if (map->shared) {
+				int ret;
+
+				BUG_ON(!map->active);
+				ret = irq_set_irqchip_state(map->irq,
+							    IRQCHIP_STATE_ACTIVE,
+							    true);
+				WARN_ON(ret);
+			}
 
 			/*
 			 * Make sure we're not going to sample this
@@ -1383,21 +1389,41 @@ static bool vgic_process_maintenance(struct kvm_vcpu *vcpu)
 static int vgic_sync_hwirq(struct kvm_vcpu *vcpu, struct vgic_lr vlr)
 {
 	struct irq_phys_map *map;
+	bool active;
 	int ret;
 
 	if (!(vlr.state & LR_HW))
 		return 0;
 
 	map = vgic_irq_map_search(vcpu, vlr.irq);
-	BUG_ON(!map || !map->active);
+	BUG_ON(!map);
+	BUG_ON(map->shared && !map->active);
 
 	ret = irq_get_irqchip_state(map->irq,
 				    IRQCHIP_STATE_ACTIVE,
-				    &map->active);
+				    &active);
 
 	WARN_ON(ret);
 
-	if (map->active) {
+	/*
+	 * For a non-shared interrupt, we have to cater for two
+	 * possible deactivation conditions:
+	 *
+	 * - the physical interrupt is now inactive (EOIed from the
+         *   guest)
+	 * - the physical interrupt is still active, but its virtual
+	 *   counterpart is flagged as "not queued", indicating another
+	 *   interrupt has fired between the EOI and the guest exit.
+	 *
+	 * Also, we are not reactivating a non-shared interrupt
+	 * ourselves. This is always left to the guest.
+	 */
+	if (!map->shared)
+		return !active || !vgic_irq_is_queued(vcpu, vlr.irq);
+
+	map->active = active;
+
+	if (active) {
 		ret = irq_set_irqchip_state(map->irq,
 					    IRQCHIP_STATE_ACTIVE,
 					    false);
@@ -1590,6 +1616,17 @@ static int vgic_update_irq_pending(struct kvm *kvm, int cpuid,
 		goto out;
 	}
 
+	if (map && !map->shared) {
+		/*
+		 * We are told to inject a HW irq, so we have to trust
+		 * the caller that the previous one has been EOIed,
+		 * and that a new one is now active. Clearing the
+		 * queued state will have the effect of making it
+		 * sample-able again.
+		 */
+		vgic_irq_clear_queued(vcpu, irq_num);
+	}
+
 	if (!vgic_can_sample_irq(vcpu, irq_num)) {
 		/*
 		 * Level interrupt in progress, will be picked up
@@ -1720,15 +1757,19 @@ static struct list_head *vgic_get_irq_phys_map_list(struct kvm_vcpu *vcpu,
  * @vcpu: The VCPU pointer
  * @virt_irq: The virtual irq number
  * @irq: The Linux IRQ number
+ * @shared: Indicates if the interrupt has to be context-switched or
+ *          if it is private to a VM
  *
  * Establish a mapping between a guest visible irq (@virt_irq) and a
  * Linux irq (@irq). On injection, @virt_irq will be associated with
- * the physical interrupt represented by @irq.
+ * the physical interrupt represented by @irq. If @shared is true,
+ * the active state of the interrupt will be context-switched.
  *
  * Returns a valid pointer on success, and an error pointer otherwise
  */
 struct irq_phys_map *kvm_vgic_map_phys_irq(struct kvm_vcpu *vcpu,
-					   int virt_irq, int irq)
+					   int virt_irq, int irq,
+					   bool shared)
 {
 	struct vgic_dist *dist = &vcpu->kvm->arch.vgic;
 	struct list_head *root = vgic_get_irq_phys_map_list(vcpu, virt_irq);
@@ -1762,7 +1803,8 @@ struct irq_phys_map *kvm_vgic_map_phys_irq(struct kvm_vcpu *vcpu,
 	if (map) {
 		/* Make sure this mapping matches */
 		if (map->phys_irq != phys_irq	||
-		    map->irq      != irq)
+		    map->irq      != irq	||
+		    map->shared   != shared)
 			map = ERR_PTR(-EINVAL);
 
 		goto out;
@@ -1772,6 +1814,7 @@ struct irq_phys_map *kvm_vgic_map_phys_irq(struct kvm_vcpu *vcpu,
 	map->virt_irq = virt_irq;
 	map->phys_irq = phys_irq;
 	map->irq      = irq;
+	map->shared   = shared;
 
 	list_add_tail_rcu(&entry->entry, root);
 
@@ -1822,7 +1865,7 @@ static void vgic_free_phys_irq_map_rcu(struct rcu_head *rcu)
  */
 bool kvm_vgic_get_phys_irq_active(struct irq_phys_map *map)
 {
-	BUG_ON(!map);
+	BUG_ON(!map || !map->shared);
 	return map->active;
 }
 
@@ -1834,7 +1877,7 @@ bool kvm_vgic_get_phys_irq_active(struct irq_phys_map *map)
  */
 void kvm_vgic_set_phys_irq_active(struct irq_phys_map *map, bool active)
 {
-	BUG_ON(!map);
+	BUG_ON(!map || !map->shared);
 	map->active = active;
 }
 
-- 
2.1.4


^ permalink raw reply related	[flat|nested] 74+ messages in thread

* [PATCH v3 11/11] KVM: arm/arm64: vgic: Allow HW interrupts for non-shared devices
@ 2015-07-24 15:55   ` Marc Zyngier
  0 siblings, 0 replies; 74+ messages in thread
From: Marc Zyngier @ 2015-07-24 15:55 UTC (permalink / raw)
  To: linux-arm-kernel

So far, the only use of the HW interrupt facility is the timer,
implying that the active state is context-switched for each vcpu,
as the device is is shared across all vcpus.

This does not work for a device that has been assigned to a VM,
as the guest is entierely in control of that device (the HW is
not shared). In that case, it makes sense to bypass the whole
active state switching, and only track the deactivation of the
interrupt.

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
---
 include/kvm/arm_vgic.h    |  6 ++--
 virt/kvm/arm/arch_timer.c |  3 +-
 virt/kvm/arm/vgic.c       | 73 +++++++++++++++++++++++++++++++++++++----------
 3 files changed, 64 insertions(+), 18 deletions(-)

diff --git a/include/kvm/arm_vgic.h b/include/kvm/arm_vgic.h
index f6bfd79..6f0a4e1 100644
--- a/include/kvm/arm_vgic.h
+++ b/include/kvm/arm_vgic.h
@@ -164,7 +164,8 @@ struct irq_phys_map {
 	u32			phys_irq;
 	u32			irq;
 	bool			deleted;
-	bool			active;
+	bool			shared;
+	bool			active; /* Only valid if shared */
 };
 
 struct irq_phys_map_entry {
@@ -357,7 +358,8 @@ void vgic_v3_dispatch_sgi(struct kvm_vcpu *vcpu, u64 reg);
 int kvm_vgic_vcpu_pending_irq(struct kvm_vcpu *vcpu);
 int kvm_vgic_vcpu_active_irq(struct kvm_vcpu *vcpu);
 struct irq_phys_map *kvm_vgic_map_phys_irq(struct kvm_vcpu *vcpu,
-					   int virt_irq, int irq);
+					   int virt_irq, int irq,
+					   bool shared);
 int kvm_vgic_unmap_phys_irq(struct kvm_vcpu *vcpu, struct irq_phys_map *map);
 bool kvm_vgic_get_phys_irq_active(struct irq_phys_map *map);
 void kvm_vgic_set_phys_irq_active(struct irq_phys_map *map, bool active);
diff --git a/virt/kvm/arm/arch_timer.c b/virt/kvm/arm/arch_timer.c
index 76e38d2..db21d8f 100644
--- a/virt/kvm/arm/arch_timer.c
+++ b/virt/kvm/arm/arch_timer.c
@@ -203,7 +203,8 @@ int kvm_timer_vcpu_reset(struct kvm_vcpu *vcpu,
 	 * Tell the VGIC that the virtual interrupt is tied to a
 	 * physical interrupt. We do that once per VCPU.
 	 */
-	map = kvm_vgic_map_phys_irq(vcpu, irq->irq, host_vtimer_irq);
+	map = kvm_vgic_map_phys_irq(vcpu, irq->irq,
+				    host_vtimer_irq, true);
 	if (WARN_ON(IS_ERR(map)))
 		return PTR_ERR(map);
 
diff --git a/virt/kvm/arm/vgic.c b/virt/kvm/arm/vgic.c
index e40ef70..5e6b816 100644
--- a/virt/kvm/arm/vgic.c
+++ b/virt/kvm/arm/vgic.c
@@ -1128,19 +1128,25 @@ static void vgic_queue_irq_to_lr(struct kvm_vcpu *vcpu, int irq,
 		 * active in the physical world. Otherwise the
 		 * physical interrupt will fire and the guest will
 		 * exit before processing the virtual interrupt.
+		 *
+		 * This is of course only valid for a shared
+		 * interrupt. A non shared interrupt should already be
+		 * active.
 		 */
 		if (map) {
-			int ret;
-
-			BUG_ON(!map->active);
 			vlr.hwirq = map->phys_irq;
 			vlr.state |= LR_HW;
 			vlr.state &= ~LR_EOI_INT;
 
-			ret = irq_set_irqchip_state(map->irq,
-						    IRQCHIP_STATE_ACTIVE,
-						    true);
-			WARN_ON(ret);
+			if (map->shared) {
+				int ret;
+
+				BUG_ON(!map->active);
+				ret = irq_set_irqchip_state(map->irq,
+							    IRQCHIP_STATE_ACTIVE,
+							    true);
+				WARN_ON(ret);
+			}
 
 			/*
 			 * Make sure we're not going to sample this
@@ -1383,21 +1389,41 @@ static bool vgic_process_maintenance(struct kvm_vcpu *vcpu)
 static int vgic_sync_hwirq(struct kvm_vcpu *vcpu, struct vgic_lr vlr)
 {
 	struct irq_phys_map *map;
+	bool active;
 	int ret;
 
 	if (!(vlr.state & LR_HW))
 		return 0;
 
 	map = vgic_irq_map_search(vcpu, vlr.irq);
-	BUG_ON(!map || !map->active);
+	BUG_ON(!map);
+	BUG_ON(map->shared && !map->active);
 
 	ret = irq_get_irqchip_state(map->irq,
 				    IRQCHIP_STATE_ACTIVE,
-				    &map->active);
+				    &active);
 
 	WARN_ON(ret);
 
-	if (map->active) {
+	/*
+	 * For a non-shared interrupt, we have to cater for two
+	 * possible deactivation conditions:
+	 *
+	 * - the physical interrupt is now inactive (EOIed from the
+         *   guest)
+	 * - the physical interrupt is still active, but its virtual
+	 *   counterpart is flagged as "not queued", indicating another
+	 *   interrupt has fired between the EOI and the guest exit.
+	 *
+	 * Also, we are not reactivating a non-shared interrupt
+	 * ourselves. This is always left to the guest.
+	 */
+	if (!map->shared)
+		return !active || !vgic_irq_is_queued(vcpu, vlr.irq);
+
+	map->active = active;
+
+	if (active) {
 		ret = irq_set_irqchip_state(map->irq,
 					    IRQCHIP_STATE_ACTIVE,
 					    false);
@@ -1590,6 +1616,17 @@ static int vgic_update_irq_pending(struct kvm *kvm, int cpuid,
 		goto out;
 	}
 
+	if (map && !map->shared) {
+		/*
+		 * We are told to inject a HW irq, so we have to trust
+		 * the caller that the previous one has been EOIed,
+		 * and that a new one is now active. Clearing the
+		 * queued state will have the effect of making it
+		 * sample-able again.
+		 */
+		vgic_irq_clear_queued(vcpu, irq_num);
+	}
+
 	if (!vgic_can_sample_irq(vcpu, irq_num)) {
 		/*
 		 * Level interrupt in progress, will be picked up
@@ -1720,15 +1757,19 @@ static struct list_head *vgic_get_irq_phys_map_list(struct kvm_vcpu *vcpu,
  * @vcpu: The VCPU pointer
  * @virt_irq: The virtual irq number
  * @irq: The Linux IRQ number
+ * @shared: Indicates if the interrupt has to be context-switched or
+ *          if it is private to a VM
  *
  * Establish a mapping between a guest visible irq (@virt_irq) and a
  * Linux irq (@irq). On injection, @virt_irq will be associated with
- * the physical interrupt represented by @irq.
+ * the physical interrupt represented by @irq. If @shared is true,
+ * the active state of the interrupt will be context-switched.
  *
  * Returns a valid pointer on success, and an error pointer otherwise
  */
 struct irq_phys_map *kvm_vgic_map_phys_irq(struct kvm_vcpu *vcpu,
-					   int virt_irq, int irq)
+					   int virt_irq, int irq,
+					   bool shared)
 {
 	struct vgic_dist *dist = &vcpu->kvm->arch.vgic;
 	struct list_head *root = vgic_get_irq_phys_map_list(vcpu, virt_irq);
@@ -1762,7 +1803,8 @@ struct irq_phys_map *kvm_vgic_map_phys_irq(struct kvm_vcpu *vcpu,
 	if (map) {
 		/* Make sure this mapping matches */
 		if (map->phys_irq != phys_irq	||
-		    map->irq      != irq)
+		    map->irq      != irq	||
+		    map->shared   != shared)
 			map = ERR_PTR(-EINVAL);
 
 		goto out;
@@ -1772,6 +1814,7 @@ struct irq_phys_map *kvm_vgic_map_phys_irq(struct kvm_vcpu *vcpu,
 	map->virt_irq = virt_irq;
 	map->phys_irq = phys_irq;
 	map->irq      = irq;
+	map->shared   = shared;
 
 	list_add_tail_rcu(&entry->entry, root);
 
@@ -1822,7 +1865,7 @@ static void vgic_free_phys_irq_map_rcu(struct rcu_head *rcu)
  */
 bool kvm_vgic_get_phys_irq_active(struct irq_phys_map *map)
 {
-	BUG_ON(!map);
+	BUG_ON(!map || !map->shared);
 	return map->active;
 }
 
@@ -1834,7 +1877,7 @@ bool kvm_vgic_get_phys_irq_active(struct irq_phys_map *map)
  */
 void kvm_vgic_set_phys_irq_active(struct irq_phys_map *map, bool active)
 {
-	BUG_ON(!map);
+	BUG_ON(!map || !map->shared);
 	map->active = active;
 }
 
-- 
2.1.4

^ permalink raw reply related	[flat|nested] 74+ messages in thread

* Re: [PATCH v3 06/11] KVM: arm/arm64: vgic: Allow dynamic mapping of physical/virtual interrupts
  2015-07-24 15:55   ` Marc Zyngier
@ 2015-08-04 13:04     ` Christoffer Dall
  -1 siblings, 0 replies; 74+ messages in thread
From: Christoffer Dall @ 2015-08-04 13:04 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: kvm, kvmarm, linux-arm-kernel, Eric Auger, Alex Bennée,
	Andre Przywara

On Fri, Jul 24, 2015 at 04:55:04PM +0100, Marc Zyngier wrote:
> In order to be able to feed physical interrupts to a guest, we need
> to be able to establish the virtual-physical mapping between the two
> worlds.
> 
> The mappings are kept in a set of RCU lists, indexed by virtual interrupts.
> 
> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
> ---
>  arch/arm/kvm/arm.c     |   2 +
>  include/kvm/arm_vgic.h |  26 +++++++
>  virt/kvm/arm/vgic.c    | 189 ++++++++++++++++++++++++++++++++++++++++++++++++-
>  3 files changed, 216 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/arm/kvm/arm.c b/arch/arm/kvm/arm.c
> index f1bf418..ce404a5 100644
> --- a/arch/arm/kvm/arm.c
> +++ b/arch/arm/kvm/arm.c
> @@ -125,6 +125,7 @@ int kvm_arch_init_vm(struct kvm *kvm, unsigned long type)
>  	if (ret)
>  		goto out_free_stage2_pgd;
>  
> +	kvm_vgic_early_init(kvm);
>  	kvm_timer_init(kvm);
>  
>  	/* Mark the initial VMID generation invalid */
> @@ -249,6 +250,7 @@ out:
>  
>  void kvm_arch_vcpu_postcreate(struct kvm_vcpu *vcpu)
>  {
> +	kvm_vgic_vcpu_early_init(vcpu);
>  }
>  
>  void kvm_arch_vcpu_free(struct kvm_vcpu *vcpu)
> diff --git a/include/kvm/arm_vgic.h b/include/kvm/arm_vgic.h
> index a881e39..68212a1 100644
> --- a/include/kvm/arm_vgic.h
> +++ b/include/kvm/arm_vgic.h
> @@ -159,6 +159,20 @@ struct vgic_io_device {
>  	struct kvm_io_device dev;
>  };
>  
> +struct irq_phys_map {
> +	u32			virt_irq;
> +	u32			phys_irq;
> +	u32			irq;
> +	bool			deleted;
> +	bool			active;
> +};
> +
> +struct irq_phys_map_entry {
> +	struct list_head	entry;
> +	struct rcu_head		rcu;
> +	struct irq_phys_map	map;
> +};
> +
>  struct vgic_dist {
>  	spinlock_t		lock;
>  	bool			in_kernel;
> @@ -256,6 +270,10 @@ struct vgic_dist {
>  	struct vgic_vm_ops	vm_ops;
>  	struct vgic_io_device	dist_iodev;
>  	struct vgic_io_device	*redist_iodevs;
> +
> +	/* Virtual irq to hwirq mapping */
> +	spinlock_t		irq_phys_map_lock;
> +	struct list_head	irq_phys_map_list;
>  };
>  
>  struct vgic_v2_cpu_if {
> @@ -307,6 +325,9 @@ struct vgic_cpu {
>  		struct vgic_v2_cpu_if	vgic_v2;
>  		struct vgic_v3_cpu_if	vgic_v3;
>  	};
> +
> +	/* Protected by the distributor's irq_phys_map_lock */
> +	struct list_head	irq_phys_map_list;
>  };
>  
>  #define LR_EMPTY	0xff
> @@ -321,8 +342,10 @@ int kvm_vgic_addr(struct kvm *kvm, unsigned long type, u64 *addr, bool write);
>  int kvm_vgic_hyp_init(void);
>  int kvm_vgic_map_resources(struct kvm *kvm);
>  int kvm_vgic_get_max_vcpus(void);
> +void kvm_vgic_early_init(struct kvm *kvm);
>  int kvm_vgic_create(struct kvm *kvm, u32 type);
>  void kvm_vgic_destroy(struct kvm *kvm);
> +void kvm_vgic_vcpu_early_init(struct kvm_vcpu *vcpu);
>  void kvm_vgic_vcpu_destroy(struct kvm_vcpu *vcpu);
>  void kvm_vgic_flush_hwstate(struct kvm_vcpu *vcpu);
>  void kvm_vgic_sync_hwstate(struct kvm_vcpu *vcpu);
> @@ -331,6 +354,9 @@ int kvm_vgic_inject_irq(struct kvm *kvm, int cpuid, unsigned int irq_num,
>  void vgic_v3_dispatch_sgi(struct kvm_vcpu *vcpu, u64 reg);
>  int kvm_vgic_vcpu_pending_irq(struct kvm_vcpu *vcpu);
>  int kvm_vgic_vcpu_active_irq(struct kvm_vcpu *vcpu);
> +struct irq_phys_map *kvm_vgic_map_phys_irq(struct kvm_vcpu *vcpu,
> +					   int virt_irq, int irq);
> +int kvm_vgic_unmap_phys_irq(struct kvm_vcpu *vcpu, struct irq_phys_map *map);
>  
>  #define irqchip_in_kernel(k)	(!!((k)->arch.vgic.in_kernel))
>  #define vgic_initialized(k)	(!!((k)->arch.vgic.nr_cpus))
> diff --git a/virt/kvm/arm/vgic.c b/virt/kvm/arm/vgic.c
> index 5bd1695..c74d929 100644
> --- a/virt/kvm/arm/vgic.c
> +++ b/virt/kvm/arm/vgic.c
> @@ -24,6 +24,7 @@
>  #include <linux/of.h>
>  #include <linux/of_address.h>
>  #include <linux/of_irq.h>
> +#include <linux/rculist.h>
>  #include <linux/uaccess.h>
>  
>  #include <asm/kvm_emulate.h>
> @@ -82,6 +83,8 @@ static void vgic_retire_disabled_irqs(struct kvm_vcpu *vcpu);
>  static void vgic_retire_lr(int lr_nr, int irq, struct kvm_vcpu *vcpu);
>  static struct vgic_lr vgic_get_lr(const struct kvm_vcpu *vcpu, int lr);
>  static void vgic_set_lr(struct kvm_vcpu *vcpu, int lr, struct vgic_lr lr_desc);
> +static struct irq_phys_map *vgic_irq_map_search(struct kvm_vcpu *vcpu,
> +						int virt_irq);
>  
>  static const struct vgic_ops *vgic_ops;
>  static const struct vgic_params *vgic;
> @@ -1583,6 +1586,166 @@ static irqreturn_t vgic_maintenance_handler(int irq, void *data)
>  	return IRQ_HANDLED;
>  }
>  
> +static struct list_head *vgic_get_irq_phys_map_list(struct kvm_vcpu *vcpu,
> +						    int virt_irq)
> +{
> +	if (virt_irq < VGIC_NR_PRIVATE_IRQS)
> +		return &vcpu->arch.vgic_cpu.irq_phys_map_list;
> +	else
> +		return &vcpu->kvm->arch.vgic.irq_phys_map_list;
> +}
> +
> +/**
> + * kvm_vgic_map_phys_irq - map a virtual IRQ to a physical IRQ
> + * @vcpu: The VCPU pointer
> + * @virt_irq: The virtual irq number
> + * @irq: The Linux IRQ number
> + *
> + * Establish a mapping between a guest visible irq (@virt_irq) and a
> + * Linux irq (@irq). On injection, @virt_irq will be associated with
> + * the physical interrupt represented by @irq.
> + *
> + * Returns a valid pointer on success, and an error pointer otherwise
> + */
> +struct irq_phys_map *kvm_vgic_map_phys_irq(struct kvm_vcpu *vcpu,
> +					   int virt_irq, int irq)
> +{
> +	struct vgic_dist *dist = &vcpu->kvm->arch.vgic;
> +	struct list_head *root = vgic_get_irq_phys_map_list(vcpu, virt_irq);
> +	struct irq_phys_map *map;
> +	struct irq_phys_map_entry *entry;
> +	struct irq_desc *desc;
> +	struct irq_data *data;
> +	int phys_irq;
> +
> +	desc = irq_to_desc(irq);
> +	if (!desc) {
> +		kvm_err("kvm_vgic_map_phys_irq: no interrupt descriptor\n");

super over-ridiculous nit:
if you respin you could consider "%s: no...", __FUNC__

> +		return ERR_PTR(-EINVAL);
> +	}
> +
> +	data = irq_desc_get_irq_data(desc);
> +	while (data->parent_data)
> +		data = data->parent_data;
> +
> +	phys_irq = data->hwirq;
> +
> +	/* Create a new mapping */
> +	entry = kzalloc(sizeof(*entry), GFP_KERNEL);
> +	if (!entry)
> +		return ERR_PTR(-ENOMEM);
> +
> +	spin_lock(&dist->irq_phys_map_lock);
> +
> +	/* Try to match an existing mapping */
> +	map = vgic_irq_map_search(vcpu, virt_irq);
> +	if (map) {
> +		/* Make sure this mapping matches */

why do we need to check that it matches?  Is it ever allowed to have
multiple mappings of a virtual interrupt?

> +		if (map->phys_irq != phys_irq	||
> +		    map->irq      != irq)
> +			map = ERR_PTR(-EINVAL);
> +
> +		goto out;

why is this a valid situation?  Shouldn't you return -EEXIST or
something instead?  (that would also simplify the error-checking free
routine below).

If you want it to be valid to map the same virq to irq multiple times
(to free the caller from some bookkeeping, I presume?) then that should
be part of the function documentation, and I think we should add a
"found existing mapping" comment above the goto out statement.

> +	}
> +
> +	map           = &entry->map;
> +	map->virt_irq = virt_irq;
> +	map->phys_irq = phys_irq;
> +	map->irq      = irq;
> +
> +	list_add_tail_rcu(&entry->entry, root);
> +
> +out:
> +	spin_unlock(&dist->irq_phys_map_lock);
> +	/* If we've found a hit in the existing list, free the useless
> +	 * entry */
> +	if (IS_ERR(map) || map != &entry->map)
> +		kfree(entry);
> +	return map;
> +}
> +
> +static struct irq_phys_map *vgic_irq_map_search(struct kvm_vcpu *vcpu,
> +						int virt_irq)
> +{
> +	struct list_head *root = vgic_get_irq_phys_map_list(vcpu, virt_irq);
> +	struct irq_phys_map_entry *entry;
> +	struct irq_phys_map *map;
> +
> +	rcu_read_lock();
> +
> +	list_for_each_entry_rcu(entry, root, entry) {
> +		map = &entry->map;
> +		if (map->virt_irq == virt_irq && !map->deleted) {
> +			rcu_read_unlock();
> +			return map;
> +		}
> +	}
> +
> +	rcu_read_unlock();
> +
> +	return NULL;
> +}
> +
> +static void vgic_free_phys_irq_map_rcu(struct rcu_head *rcu)
> +{
> +	struct irq_phys_map_entry *entry;
> +
> +	entry = container_of(rcu, struct irq_phys_map_entry, rcu);
> +	kfree(entry);
> +}
> +
> +/**
> + * kvm_vgic_unmap_phys_irq - Remove a virtual to physical IRQ mapping
> + * @vcpu: The VCPU pointer
> + * @map: The pointer to a mapping obtained through kvm_vgic_map_phys_irq
> + *
> + * Remove an existing mapping between virtual and physical interrupts.
> + */
> +int kvm_vgic_unmap_phys_irq(struct kvm_vcpu *vcpu, struct irq_phys_map *map)
> +{
> +	struct vgic_dist *dist = &vcpu->kvm->arch.vgic;
> +	struct irq_phys_map_entry *entry;
> +	struct list_head *root;
> +
> +	if (!map)
> +		return -EINVAL;
> +
> +	root = vgic_get_irq_phys_map_list(vcpu, map->virt_irq);
> +
> +	spin_lock(&dist->irq_phys_map_lock);
> +
> +	list_for_each_entry(entry, root, entry) {
> +		if (&entry->map == map && !map->deleted) {
> +			map->deleted = true;

why do you need the 'deleted' flag?  You now search the list only after
holding the lock, so the race I pointed out before doesn't exist
anymore.  Is there an additional issue that the deleted flag is taking
care of?

> +			list_del_rcu(&entry->entry);
> +			call_rcu(&entry->rcu, vgic_free_phys_irq_map_rcu);
> +			break;
> +		}
> +	}
> +
> +	spin_unlock(&dist->irq_phys_map_lock);
> +
> +	return 0;
> +}
> +
> +static void vgic_destroy_irq_phys_map(struct kvm *kvm, struct list_head *root)
> +{
> +	struct vgic_dist *dist = &kvm->arch.vgic;
> +	struct irq_phys_map_entry *entry;
> +
> +	spin_lock(&dist->irq_phys_map_lock);
> +
> +	list_for_each_entry(entry, root, entry) {
> +		if (!entry->map.deleted) {
> +			entry->map.deleted = true;
> +			list_del_rcu(&entry->entry);
> +			call_rcu(&entry->rcu, vgic_free_phys_irq_map_rcu);
> +		}
> +	}
> +
> +	spin_unlock(&dist->irq_phys_map_lock);
> +}
> +
>  void kvm_vgic_vcpu_destroy(struct kvm_vcpu *vcpu)
>  {
>  	struct vgic_cpu *vgic_cpu = &vcpu->arch.vgic_cpu;
> @@ -1591,6 +1754,7 @@ void kvm_vgic_vcpu_destroy(struct kvm_vcpu *vcpu)
>  	kfree(vgic_cpu->active_shared);
>  	kfree(vgic_cpu->pend_act_shared);
>  	kfree(vgic_cpu->vgic_irq_lr_map);
> +	vgic_destroy_irq_phys_map(vcpu->kvm, &vgic_cpu->irq_phys_map_list);
>  	vgic_cpu->pending_shared = NULL;
>  	vgic_cpu->active_shared = NULL;
>  	vgic_cpu->pend_act_shared = NULL;
> @@ -1628,6 +1792,17 @@ static int vgic_vcpu_init_maps(struct kvm_vcpu *vcpu, int nr_irqs)
>  }
>  
>  /**
> + * kvm_vgic_vcpu_early_init - Earliest possible per-vcpu vgic init stage

earliest possible... oh how I love the vgic code.

If you have the energy and context to write a comment in the beginning
of this file describing the init flow, I think that would be useful...

> + *
> + * No memory allocation should be performed here, only static init.
> + */
> +void kvm_vgic_vcpu_early_init(struct kvm_vcpu *vcpu)
> +{
> +	struct vgic_cpu *vgic_cpu = &vcpu->arch.vgic_cpu;
> +	INIT_LIST_HEAD(&vgic_cpu->irq_phys_map_list);
> +}
> +
> +/**
>   * kvm_vgic_get_max_vcpus - Get the maximum number of VCPUs allowed by HW
>   *
>   * The host's GIC naturally limits the maximum amount of VCPUs a guest
> @@ -1664,6 +1839,7 @@ void kvm_vgic_destroy(struct kvm *kvm)
>  	kfree(dist->irq_spi_target);
>  	kfree(dist->irq_pending_on_cpu);
>  	kfree(dist->irq_active_on_cpu);
> +	vgic_destroy_irq_phys_map(kvm, &dist->irq_phys_map_list);
>  	dist->irq_sgi_sources = NULL;
>  	dist->irq_spi_cpu = NULL;
>  	dist->irq_spi_target = NULL;
> @@ -1787,6 +1963,18 @@ static int init_vgic_model(struct kvm *kvm, int type)
>  	return 0;
>  }
>  
> +/**
> + * kvm_vgic_early_init - Earliest possible vgic initialization stage
> + *
> + * No memory allocation should be performed here, only static init.
> + */
> +void kvm_vgic_early_init(struct kvm *kvm)
> +{
> +	spin_lock_init(&kvm->arch.vgic.lock);

so here we're initializing the data structures even before the vgic is
created, right?  So therefore it's safe to move the vgic.lock init up
here?

> +	spin_lock_init(&kvm->arch.vgic.irq_phys_map_lock);
> +	INIT_LIST_HEAD(&kvm->arch.vgic.irq_phys_map_list);
> +}
> +
>  int kvm_vgic_create(struct kvm *kvm, u32 type)
>  {
>  	int i, vcpu_lock_idx = -1, ret;
> @@ -1832,7 +2020,6 @@ int kvm_vgic_create(struct kvm *kvm, u32 type)
>  	if (ret)
>  		goto out_unlock;
>  
> -	spin_lock_init(&kvm->arch.vgic.lock);
>  	kvm->arch.vgic.in_kernel = true;
>  	kvm->arch.vgic.vgic_model = type;
>  	kvm->arch.vgic.vctrl_base = vgic->vctrl_base;
> -- 
> 2.1.4
> 

Thanks,
-Christoffer

^ permalink raw reply	[flat|nested] 74+ messages in thread

* [PATCH v3 06/11] KVM: arm/arm64: vgic: Allow dynamic mapping of physical/virtual interrupts
@ 2015-08-04 13:04     ` Christoffer Dall
  0 siblings, 0 replies; 74+ messages in thread
From: Christoffer Dall @ 2015-08-04 13:04 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, Jul 24, 2015 at 04:55:04PM +0100, Marc Zyngier wrote:
> In order to be able to feed physical interrupts to a guest, we need
> to be able to establish the virtual-physical mapping between the two
> worlds.
> 
> The mappings are kept in a set of RCU lists, indexed by virtual interrupts.
> 
> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
> ---
>  arch/arm/kvm/arm.c     |   2 +
>  include/kvm/arm_vgic.h |  26 +++++++
>  virt/kvm/arm/vgic.c    | 189 ++++++++++++++++++++++++++++++++++++++++++++++++-
>  3 files changed, 216 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/arm/kvm/arm.c b/arch/arm/kvm/arm.c
> index f1bf418..ce404a5 100644
> --- a/arch/arm/kvm/arm.c
> +++ b/arch/arm/kvm/arm.c
> @@ -125,6 +125,7 @@ int kvm_arch_init_vm(struct kvm *kvm, unsigned long type)
>  	if (ret)
>  		goto out_free_stage2_pgd;
>  
> +	kvm_vgic_early_init(kvm);
>  	kvm_timer_init(kvm);
>  
>  	/* Mark the initial VMID generation invalid */
> @@ -249,6 +250,7 @@ out:
>  
>  void kvm_arch_vcpu_postcreate(struct kvm_vcpu *vcpu)
>  {
> +	kvm_vgic_vcpu_early_init(vcpu);
>  }
>  
>  void kvm_arch_vcpu_free(struct kvm_vcpu *vcpu)
> diff --git a/include/kvm/arm_vgic.h b/include/kvm/arm_vgic.h
> index a881e39..68212a1 100644
> --- a/include/kvm/arm_vgic.h
> +++ b/include/kvm/arm_vgic.h
> @@ -159,6 +159,20 @@ struct vgic_io_device {
>  	struct kvm_io_device dev;
>  };
>  
> +struct irq_phys_map {
> +	u32			virt_irq;
> +	u32			phys_irq;
> +	u32			irq;
> +	bool			deleted;
> +	bool			active;
> +};
> +
> +struct irq_phys_map_entry {
> +	struct list_head	entry;
> +	struct rcu_head		rcu;
> +	struct irq_phys_map	map;
> +};
> +
>  struct vgic_dist {
>  	spinlock_t		lock;
>  	bool			in_kernel;
> @@ -256,6 +270,10 @@ struct vgic_dist {
>  	struct vgic_vm_ops	vm_ops;
>  	struct vgic_io_device	dist_iodev;
>  	struct vgic_io_device	*redist_iodevs;
> +
> +	/* Virtual irq to hwirq mapping */
> +	spinlock_t		irq_phys_map_lock;
> +	struct list_head	irq_phys_map_list;
>  };
>  
>  struct vgic_v2_cpu_if {
> @@ -307,6 +325,9 @@ struct vgic_cpu {
>  		struct vgic_v2_cpu_if	vgic_v2;
>  		struct vgic_v3_cpu_if	vgic_v3;
>  	};
> +
> +	/* Protected by the distributor's irq_phys_map_lock */
> +	struct list_head	irq_phys_map_list;
>  };
>  
>  #define LR_EMPTY	0xff
> @@ -321,8 +342,10 @@ int kvm_vgic_addr(struct kvm *kvm, unsigned long type, u64 *addr, bool write);
>  int kvm_vgic_hyp_init(void);
>  int kvm_vgic_map_resources(struct kvm *kvm);
>  int kvm_vgic_get_max_vcpus(void);
> +void kvm_vgic_early_init(struct kvm *kvm);
>  int kvm_vgic_create(struct kvm *kvm, u32 type);
>  void kvm_vgic_destroy(struct kvm *kvm);
> +void kvm_vgic_vcpu_early_init(struct kvm_vcpu *vcpu);
>  void kvm_vgic_vcpu_destroy(struct kvm_vcpu *vcpu);
>  void kvm_vgic_flush_hwstate(struct kvm_vcpu *vcpu);
>  void kvm_vgic_sync_hwstate(struct kvm_vcpu *vcpu);
> @@ -331,6 +354,9 @@ int kvm_vgic_inject_irq(struct kvm *kvm, int cpuid, unsigned int irq_num,
>  void vgic_v3_dispatch_sgi(struct kvm_vcpu *vcpu, u64 reg);
>  int kvm_vgic_vcpu_pending_irq(struct kvm_vcpu *vcpu);
>  int kvm_vgic_vcpu_active_irq(struct kvm_vcpu *vcpu);
> +struct irq_phys_map *kvm_vgic_map_phys_irq(struct kvm_vcpu *vcpu,
> +					   int virt_irq, int irq);
> +int kvm_vgic_unmap_phys_irq(struct kvm_vcpu *vcpu, struct irq_phys_map *map);
>  
>  #define irqchip_in_kernel(k)	(!!((k)->arch.vgic.in_kernel))
>  #define vgic_initialized(k)	(!!((k)->arch.vgic.nr_cpus))
> diff --git a/virt/kvm/arm/vgic.c b/virt/kvm/arm/vgic.c
> index 5bd1695..c74d929 100644
> --- a/virt/kvm/arm/vgic.c
> +++ b/virt/kvm/arm/vgic.c
> @@ -24,6 +24,7 @@
>  #include <linux/of.h>
>  #include <linux/of_address.h>
>  #include <linux/of_irq.h>
> +#include <linux/rculist.h>
>  #include <linux/uaccess.h>
>  
>  #include <asm/kvm_emulate.h>
> @@ -82,6 +83,8 @@ static void vgic_retire_disabled_irqs(struct kvm_vcpu *vcpu);
>  static void vgic_retire_lr(int lr_nr, int irq, struct kvm_vcpu *vcpu);
>  static struct vgic_lr vgic_get_lr(const struct kvm_vcpu *vcpu, int lr);
>  static void vgic_set_lr(struct kvm_vcpu *vcpu, int lr, struct vgic_lr lr_desc);
> +static struct irq_phys_map *vgic_irq_map_search(struct kvm_vcpu *vcpu,
> +						int virt_irq);
>  
>  static const struct vgic_ops *vgic_ops;
>  static const struct vgic_params *vgic;
> @@ -1583,6 +1586,166 @@ static irqreturn_t vgic_maintenance_handler(int irq, void *data)
>  	return IRQ_HANDLED;
>  }
>  
> +static struct list_head *vgic_get_irq_phys_map_list(struct kvm_vcpu *vcpu,
> +						    int virt_irq)
> +{
> +	if (virt_irq < VGIC_NR_PRIVATE_IRQS)
> +		return &vcpu->arch.vgic_cpu.irq_phys_map_list;
> +	else
> +		return &vcpu->kvm->arch.vgic.irq_phys_map_list;
> +}
> +
> +/**
> + * kvm_vgic_map_phys_irq - map a virtual IRQ to a physical IRQ
> + * @vcpu: The VCPU pointer
> + * @virt_irq: The virtual irq number
> + * @irq: The Linux IRQ number
> + *
> + * Establish a mapping between a guest visible irq (@virt_irq) and a
> + * Linux irq (@irq). On injection, @virt_irq will be associated with
> + * the physical interrupt represented by @irq.
> + *
> + * Returns a valid pointer on success, and an error pointer otherwise
> + */
> +struct irq_phys_map *kvm_vgic_map_phys_irq(struct kvm_vcpu *vcpu,
> +					   int virt_irq, int irq)
> +{
> +	struct vgic_dist *dist = &vcpu->kvm->arch.vgic;
> +	struct list_head *root = vgic_get_irq_phys_map_list(vcpu, virt_irq);
> +	struct irq_phys_map *map;
> +	struct irq_phys_map_entry *entry;
> +	struct irq_desc *desc;
> +	struct irq_data *data;
> +	int phys_irq;
> +
> +	desc = irq_to_desc(irq);
> +	if (!desc) {
> +		kvm_err("kvm_vgic_map_phys_irq: no interrupt descriptor\n");

super over-ridiculous nit:
if you respin you could consider "%s: no...", __FUNC__

> +		return ERR_PTR(-EINVAL);
> +	}
> +
> +	data = irq_desc_get_irq_data(desc);
> +	while (data->parent_data)
> +		data = data->parent_data;
> +
> +	phys_irq = data->hwirq;
> +
> +	/* Create a new mapping */
> +	entry = kzalloc(sizeof(*entry), GFP_KERNEL);
> +	if (!entry)
> +		return ERR_PTR(-ENOMEM);
> +
> +	spin_lock(&dist->irq_phys_map_lock);
> +
> +	/* Try to match an existing mapping */
> +	map = vgic_irq_map_search(vcpu, virt_irq);
> +	if (map) {
> +		/* Make sure this mapping matches */

why do we need to check that it matches?  Is it ever allowed to have
multiple mappings of a virtual interrupt?

> +		if (map->phys_irq != phys_irq	||
> +		    map->irq      != irq)
> +			map = ERR_PTR(-EINVAL);
> +
> +		goto out;

why is this a valid situation?  Shouldn't you return -EEXIST or
something instead?  (that would also simplify the error-checking free
routine below).

If you want it to be valid to map the same virq to irq multiple times
(to free the caller from some bookkeeping, I presume?) then that should
be part of the function documentation, and I think we should add a
"found existing mapping" comment above the goto out statement.

> +	}
> +
> +	map           = &entry->map;
> +	map->virt_irq = virt_irq;
> +	map->phys_irq = phys_irq;
> +	map->irq      = irq;
> +
> +	list_add_tail_rcu(&entry->entry, root);
> +
> +out:
> +	spin_unlock(&dist->irq_phys_map_lock);
> +	/* If we've found a hit in the existing list, free the useless
> +	 * entry */
> +	if (IS_ERR(map) || map != &entry->map)
> +		kfree(entry);
> +	return map;
> +}
> +
> +static struct irq_phys_map *vgic_irq_map_search(struct kvm_vcpu *vcpu,
> +						int virt_irq)
> +{
> +	struct list_head *root = vgic_get_irq_phys_map_list(vcpu, virt_irq);
> +	struct irq_phys_map_entry *entry;
> +	struct irq_phys_map *map;
> +
> +	rcu_read_lock();
> +
> +	list_for_each_entry_rcu(entry, root, entry) {
> +		map = &entry->map;
> +		if (map->virt_irq == virt_irq && !map->deleted) {
> +			rcu_read_unlock();
> +			return map;
> +		}
> +	}
> +
> +	rcu_read_unlock();
> +
> +	return NULL;
> +}
> +
> +static void vgic_free_phys_irq_map_rcu(struct rcu_head *rcu)
> +{
> +	struct irq_phys_map_entry *entry;
> +
> +	entry = container_of(rcu, struct irq_phys_map_entry, rcu);
> +	kfree(entry);
> +}
> +
> +/**
> + * kvm_vgic_unmap_phys_irq - Remove a virtual to physical IRQ mapping
> + * @vcpu: The VCPU pointer
> + * @map: The pointer to a mapping obtained through kvm_vgic_map_phys_irq
> + *
> + * Remove an existing mapping between virtual and physical interrupts.
> + */
> +int kvm_vgic_unmap_phys_irq(struct kvm_vcpu *vcpu, struct irq_phys_map *map)
> +{
> +	struct vgic_dist *dist = &vcpu->kvm->arch.vgic;
> +	struct irq_phys_map_entry *entry;
> +	struct list_head *root;
> +
> +	if (!map)
> +		return -EINVAL;
> +
> +	root = vgic_get_irq_phys_map_list(vcpu, map->virt_irq);
> +
> +	spin_lock(&dist->irq_phys_map_lock);
> +
> +	list_for_each_entry(entry, root, entry) {
> +		if (&entry->map == map && !map->deleted) {
> +			map->deleted = true;

why do you need the 'deleted' flag?  You now search the list only after
holding the lock, so the race I pointed out before doesn't exist
anymore.  Is there an additional issue that the deleted flag is taking
care of?

> +			list_del_rcu(&entry->entry);
> +			call_rcu(&entry->rcu, vgic_free_phys_irq_map_rcu);
> +			break;
> +		}
> +	}
> +
> +	spin_unlock(&dist->irq_phys_map_lock);
> +
> +	return 0;
> +}
> +
> +static void vgic_destroy_irq_phys_map(struct kvm *kvm, struct list_head *root)
> +{
> +	struct vgic_dist *dist = &kvm->arch.vgic;
> +	struct irq_phys_map_entry *entry;
> +
> +	spin_lock(&dist->irq_phys_map_lock);
> +
> +	list_for_each_entry(entry, root, entry) {
> +		if (!entry->map.deleted) {
> +			entry->map.deleted = true;
> +			list_del_rcu(&entry->entry);
> +			call_rcu(&entry->rcu, vgic_free_phys_irq_map_rcu);
> +		}
> +	}
> +
> +	spin_unlock(&dist->irq_phys_map_lock);
> +}
> +
>  void kvm_vgic_vcpu_destroy(struct kvm_vcpu *vcpu)
>  {
>  	struct vgic_cpu *vgic_cpu = &vcpu->arch.vgic_cpu;
> @@ -1591,6 +1754,7 @@ void kvm_vgic_vcpu_destroy(struct kvm_vcpu *vcpu)
>  	kfree(vgic_cpu->active_shared);
>  	kfree(vgic_cpu->pend_act_shared);
>  	kfree(vgic_cpu->vgic_irq_lr_map);
> +	vgic_destroy_irq_phys_map(vcpu->kvm, &vgic_cpu->irq_phys_map_list);
>  	vgic_cpu->pending_shared = NULL;
>  	vgic_cpu->active_shared = NULL;
>  	vgic_cpu->pend_act_shared = NULL;
> @@ -1628,6 +1792,17 @@ static int vgic_vcpu_init_maps(struct kvm_vcpu *vcpu, int nr_irqs)
>  }
>  
>  /**
> + * kvm_vgic_vcpu_early_init - Earliest possible per-vcpu vgic init stage

earliest possible... oh how I love the vgic code.

If you have the energy and context to write a comment in the beginning
of this file describing the init flow, I think that would be useful...

> + *
> + * No memory allocation should be performed here, only static init.
> + */
> +void kvm_vgic_vcpu_early_init(struct kvm_vcpu *vcpu)
> +{
> +	struct vgic_cpu *vgic_cpu = &vcpu->arch.vgic_cpu;
> +	INIT_LIST_HEAD(&vgic_cpu->irq_phys_map_list);
> +}
> +
> +/**
>   * kvm_vgic_get_max_vcpus - Get the maximum number of VCPUs allowed by HW
>   *
>   * The host's GIC naturally limits the maximum amount of VCPUs a guest
> @@ -1664,6 +1839,7 @@ void kvm_vgic_destroy(struct kvm *kvm)
>  	kfree(dist->irq_spi_target);
>  	kfree(dist->irq_pending_on_cpu);
>  	kfree(dist->irq_active_on_cpu);
> +	vgic_destroy_irq_phys_map(kvm, &dist->irq_phys_map_list);
>  	dist->irq_sgi_sources = NULL;
>  	dist->irq_spi_cpu = NULL;
>  	dist->irq_spi_target = NULL;
> @@ -1787,6 +1963,18 @@ static int init_vgic_model(struct kvm *kvm, int type)
>  	return 0;
>  }
>  
> +/**
> + * kvm_vgic_early_init - Earliest possible vgic initialization stage
> + *
> + * No memory allocation should be performed here, only static init.
> + */
> +void kvm_vgic_early_init(struct kvm *kvm)
> +{
> +	spin_lock_init(&kvm->arch.vgic.lock);

so here we're initializing the data structures even before the vgic is
created, right?  So therefore it's safe to move the vgic.lock init up
here?

> +	spin_lock_init(&kvm->arch.vgic.irq_phys_map_lock);
> +	INIT_LIST_HEAD(&kvm->arch.vgic.irq_phys_map_list);
> +}
> +
>  int kvm_vgic_create(struct kvm *kvm, u32 type)
>  {
>  	int i, vcpu_lock_idx = -1, ret;
> @@ -1832,7 +2020,6 @@ int kvm_vgic_create(struct kvm *kvm, u32 type)
>  	if (ret)
>  		goto out_unlock;
>  
> -	spin_lock_init(&kvm->arch.vgic.lock);
>  	kvm->arch.vgic.in_kernel = true;
>  	kvm->arch.vgic.vgic_model = type;
>  	kvm->arch.vgic.vctrl_base = vgic->vctrl_base;
> -- 
> 2.1.4
> 

Thanks,
-Christoffer

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH v3 09/11] KVM: arm/arm64: vgic: Prevent userspace injection of a mapped interrupt
  2015-07-24 15:55   ` Marc Zyngier
@ 2015-08-04 13:45     ` Christoffer Dall
  -1 siblings, 0 replies; 74+ messages in thread
From: Christoffer Dall @ 2015-08-04 13:45 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: kvm, kvmarm, linux-arm-kernel, Eric Auger, Alex Bennée,
	Andre Przywara

On Fri, Jul 24, 2015 at 04:55:07PM +0100, Marc Zyngier wrote:
> Virtual interrupts mapped to a HW interrupt should only be triggered
> from inside the kernel. Otherwise, you could end up confusing the
> kernel (and the GIC's) state machine.
> 
> Rearrange the injection path so that kvm_vgic_inject_irq is
> used for non-mapped interrupts, and kvm_vgic_inject_mapped_irq is
> used for mapped interrupts. The latter should only be called from
> inside the kernel (timer, VFIO).
> 
> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
> ---
>  include/kvm/arm_vgic.h |  2 +
>  virt/kvm/arm/vgic.c    | 99 ++++++++++++++++++++++++++++++++++----------------
>  2 files changed, 70 insertions(+), 31 deletions(-)
> 
> diff --git a/include/kvm/arm_vgic.h b/include/kvm/arm_vgic.h
> index 7306b4b..f6bfd79 100644
> --- a/include/kvm/arm_vgic.h
> +++ b/include/kvm/arm_vgic.h
> @@ -351,6 +351,8 @@ void kvm_vgic_flush_hwstate(struct kvm_vcpu *vcpu);
>  void kvm_vgic_sync_hwstate(struct kvm_vcpu *vcpu);
>  int kvm_vgic_inject_irq(struct kvm *kvm, int cpuid, unsigned int irq_num,
>  			bool level);
> +int kvm_vgic_inject_mapped_irq(struct kvm *kvm, int cpuid,
> +			       struct irq_phys_map *map, bool level);
>  void vgic_v3_dispatch_sgi(struct kvm_vcpu *vcpu, u64 reg);
>  int kvm_vgic_vcpu_pending_irq(struct kvm_vcpu *vcpu);
>  int kvm_vgic_vcpu_active_irq(struct kvm_vcpu *vcpu);
> diff --git a/virt/kvm/arm/vgic.c b/virt/kvm/arm/vgic.c
> index 3f7b690..e40ef70 100644
> --- a/virt/kvm/arm/vgic.c
> +++ b/virt/kvm/arm/vgic.c
> @@ -1533,7 +1533,8 @@ static int vgic_validate_injection(struct kvm_vcpu *vcpu, int irq, int level)
>  }
>  
>  static int vgic_update_irq_pending(struct kvm *kvm, int cpuid,
> -				  unsigned int irq_num, bool level)
> +				   struct irq_phys_map *map,
> +				   unsigned int irq_num, bool level)
>  {
>  	struct vgic_dist *dist = &kvm->arch.vgic;
>  	struct kvm_vcpu *vcpu;
> @@ -1541,6 +1542,9 @@ static int vgic_update_irq_pending(struct kvm *kvm, int cpuid,
>  	int enabled;
>  	bool ret = true, can_inject = true;
>  
> +	if (irq_num >= min(kvm->arch.vgic.nr_irqs, 1020))
> +		return -EINVAL;
> +
>  	spin_lock(&dist->lock);
>  
>  	vcpu = kvm_get_vcpu(kvm, cpuid);
> @@ -1603,14 +1607,42 @@ static int vgic_update_irq_pending(struct kvm *kvm, int cpuid,
>  out:
>  	spin_unlock(&dist->lock);
>  
> -	return ret ? cpuid : -EINVAL;
> +	if (!ret) {

don't you mean if (ret) here?  hint: ret is a bool

> +		/* kick the specified vcpu */
> +		kvm_vcpu_kick(kvm_get_vcpu(kvm, cpuid));
> +	}
> +
> +	return 0;

isn't this a change in the internal API?
Before, we would return -EINVAL when ret is false.  Not sure if this
has any consequences though?

> +}
> +
> +static int vgic_lazy_init(struct kvm *kvm)
> +{
> +	int ret = 0;
> +
> +	if (unlikely(!vgic_initialized(kvm))) {
> +		/*
> +		 * We only provide the automatic initialization of the VGIC
> +		 * for the legacy case of a GICv2. Any other type must
> +		 * be explicitly initialized once setup with the respective
> +		 * KVM device call.
> +		 */
> +		if (kvm->arch.vgic.vgic_model != KVM_DEV_TYPE_ARM_VGIC_V2)
> +			return -EBUSY;
> +
> +		mutex_lock(&kvm->lock);
> +		ret = vgic_init(kvm);
> +		mutex_unlock(&kvm->lock);
> +	}
> +
> +	return ret;
>  }
>  
>  /**
>   * kvm_vgic_inject_irq - Inject an IRQ from a device to the vgic
>   * @kvm:     The VM structure pointer
>   * @cpuid:   The CPU for PPIs
> - * @irq_num: The IRQ number that is assigned to the device
> + * @irq_num: The IRQ number that is assigned to the device. This IRQ
> + *           must not be mapped to a HW interrupt.
>   * @level:   Edge-triggered:  true:  to trigger the interrupt
>   *			      false: to ignore the call
>   *	     Level-sensitive  true:  activates an interrupt
> @@ -1623,39 +1655,44 @@ out:
>  int kvm_vgic_inject_irq(struct kvm *kvm, int cpuid, unsigned int irq_num,
>  			bool level)
>  {
> -	int ret = 0;
> -	int vcpu_id;
> -
> -	if (unlikely(!vgic_initialized(kvm))) {
> -		/*
> -		 * We only provide the automatic initialization of the VGIC
> -		 * for the legacy case of a GICv2. Any other type must
> -		 * be explicitly initialized once setup with the respective
> -		 * KVM device call.
> -		 */
> -		if (kvm->arch.vgic.vgic_model != KVM_DEV_TYPE_ARM_VGIC_V2) {
> -			ret = -EBUSY;
> -			goto out;
> -		}
> -		mutex_lock(&kvm->lock);
> -		ret = vgic_init(kvm);
> -		mutex_unlock(&kvm->lock);
> +	struct irq_phys_map *map;
> +	int ret;
>  
> -		if (ret)
> -			goto out;
> -	}
> +	ret = vgic_lazy_init(kvm);
> +	if (ret)
> +		return ret;
>  
> -	if (irq_num >= min(kvm->arch.vgic.nr_irqs, 1020))
> +	map = vgic_irq_map_search(kvm_get_vcpu(kvm, cpuid), irq_num);
> +	if (map)
>  		return -EINVAL;
>  
> -	vcpu_id = vgic_update_irq_pending(kvm, cpuid, irq_num, level);
> -	if (vcpu_id >= 0) {
> -		/* kick the specified vcpu */
> -		kvm_vcpu_kick(kvm_get_vcpu(kvm, vcpu_id));
> -	}
> +	return vgic_update_irq_pending(kvm, cpuid, NULL, irq_num, level);
> +}
>  
> -out:
> -	return ret;
> +/**
> + * kvm_vgic_inject_mapped_irq - Inject a physically mapped IRQ to the vgic
> + * @kvm:     The VM structure pointer
> + * @cpuid:   The CPU for PPIs
> + * @map:     Pointer to a irq_phys_map structure describing the mapping
> + * @level:   Edge-triggered:  true:  to trigger the interrupt
> + *			      false: to ignore the call
> + *	     Level-sensitive  true:  activates an interrupt
> + *			      false: deactivates an interrupt

just noticed this unfortunate use of the words 'activate/deactivate'
here, which is not true, it just raises/lowers the input signal...

> + *
> + * The GIC is not concerned with devices being active-LOW or active-HIGH for
> + * level-sensitive interrupts.  You can think of the level parameter as 1
> + * being HIGH and 0 being LOW and all devices being active-HIGH.
> + */
> +int kvm_vgic_inject_mapped_irq(struct kvm *kvm, int cpuid,
> +			       struct irq_phys_map *map, bool level)
> +{
> +	int ret;
> +
> +	ret = vgic_lazy_init(kvm);
> +	if (ret)
> +		return ret;
> +
> +	return vgic_update_irq_pending(kvm, cpuid, map, map->virt_irq, level);
>  }
>  
>  static irqreturn_t vgic_maintenance_handler(int irq, void *data)
> -- 
> 2.1.4
> 

Thanks,
-Christoffer

^ permalink raw reply	[flat|nested] 74+ messages in thread

* [PATCH v3 09/11] KVM: arm/arm64: vgic: Prevent userspace injection of a mapped interrupt
@ 2015-08-04 13:45     ` Christoffer Dall
  0 siblings, 0 replies; 74+ messages in thread
From: Christoffer Dall @ 2015-08-04 13:45 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, Jul 24, 2015 at 04:55:07PM +0100, Marc Zyngier wrote:
> Virtual interrupts mapped to a HW interrupt should only be triggered
> from inside the kernel. Otherwise, you could end up confusing the
> kernel (and the GIC's) state machine.
> 
> Rearrange the injection path so that kvm_vgic_inject_irq is
> used for non-mapped interrupts, and kvm_vgic_inject_mapped_irq is
> used for mapped interrupts. The latter should only be called from
> inside the kernel (timer, VFIO).
> 
> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
> ---
>  include/kvm/arm_vgic.h |  2 +
>  virt/kvm/arm/vgic.c    | 99 ++++++++++++++++++++++++++++++++++----------------
>  2 files changed, 70 insertions(+), 31 deletions(-)
> 
> diff --git a/include/kvm/arm_vgic.h b/include/kvm/arm_vgic.h
> index 7306b4b..f6bfd79 100644
> --- a/include/kvm/arm_vgic.h
> +++ b/include/kvm/arm_vgic.h
> @@ -351,6 +351,8 @@ void kvm_vgic_flush_hwstate(struct kvm_vcpu *vcpu);
>  void kvm_vgic_sync_hwstate(struct kvm_vcpu *vcpu);
>  int kvm_vgic_inject_irq(struct kvm *kvm, int cpuid, unsigned int irq_num,
>  			bool level);
> +int kvm_vgic_inject_mapped_irq(struct kvm *kvm, int cpuid,
> +			       struct irq_phys_map *map, bool level);
>  void vgic_v3_dispatch_sgi(struct kvm_vcpu *vcpu, u64 reg);
>  int kvm_vgic_vcpu_pending_irq(struct kvm_vcpu *vcpu);
>  int kvm_vgic_vcpu_active_irq(struct kvm_vcpu *vcpu);
> diff --git a/virt/kvm/arm/vgic.c b/virt/kvm/arm/vgic.c
> index 3f7b690..e40ef70 100644
> --- a/virt/kvm/arm/vgic.c
> +++ b/virt/kvm/arm/vgic.c
> @@ -1533,7 +1533,8 @@ static int vgic_validate_injection(struct kvm_vcpu *vcpu, int irq, int level)
>  }
>  
>  static int vgic_update_irq_pending(struct kvm *kvm, int cpuid,
> -				  unsigned int irq_num, bool level)
> +				   struct irq_phys_map *map,
> +				   unsigned int irq_num, bool level)
>  {
>  	struct vgic_dist *dist = &kvm->arch.vgic;
>  	struct kvm_vcpu *vcpu;
> @@ -1541,6 +1542,9 @@ static int vgic_update_irq_pending(struct kvm *kvm, int cpuid,
>  	int enabled;
>  	bool ret = true, can_inject = true;
>  
> +	if (irq_num >= min(kvm->arch.vgic.nr_irqs, 1020))
> +		return -EINVAL;
> +
>  	spin_lock(&dist->lock);
>  
>  	vcpu = kvm_get_vcpu(kvm, cpuid);
> @@ -1603,14 +1607,42 @@ static int vgic_update_irq_pending(struct kvm *kvm, int cpuid,
>  out:
>  	spin_unlock(&dist->lock);
>  
> -	return ret ? cpuid : -EINVAL;
> +	if (!ret) {

don't you mean if (ret) here?  hint: ret is a bool

> +		/* kick the specified vcpu */
> +		kvm_vcpu_kick(kvm_get_vcpu(kvm, cpuid));
> +	}
> +
> +	return 0;

isn't this a change in the internal API?
Before, we would return -EINVAL when ret is false.  Not sure if this
has any consequences though?

> +}
> +
> +static int vgic_lazy_init(struct kvm *kvm)
> +{
> +	int ret = 0;
> +
> +	if (unlikely(!vgic_initialized(kvm))) {
> +		/*
> +		 * We only provide the automatic initialization of the VGIC
> +		 * for the legacy case of a GICv2. Any other type must
> +		 * be explicitly initialized once setup with the respective
> +		 * KVM device call.
> +		 */
> +		if (kvm->arch.vgic.vgic_model != KVM_DEV_TYPE_ARM_VGIC_V2)
> +			return -EBUSY;
> +
> +		mutex_lock(&kvm->lock);
> +		ret = vgic_init(kvm);
> +		mutex_unlock(&kvm->lock);
> +	}
> +
> +	return ret;
>  }
>  
>  /**
>   * kvm_vgic_inject_irq - Inject an IRQ from a device to the vgic
>   * @kvm:     The VM structure pointer
>   * @cpuid:   The CPU for PPIs
> - * @irq_num: The IRQ number that is assigned to the device
> + * @irq_num: The IRQ number that is assigned to the device. This IRQ
> + *           must not be mapped to a HW interrupt.
>   * @level:   Edge-triggered:  true:  to trigger the interrupt
>   *			      false: to ignore the call
>   *	     Level-sensitive  true:  activates an interrupt
> @@ -1623,39 +1655,44 @@ out:
>  int kvm_vgic_inject_irq(struct kvm *kvm, int cpuid, unsigned int irq_num,
>  			bool level)
>  {
> -	int ret = 0;
> -	int vcpu_id;
> -
> -	if (unlikely(!vgic_initialized(kvm))) {
> -		/*
> -		 * We only provide the automatic initialization of the VGIC
> -		 * for the legacy case of a GICv2. Any other type must
> -		 * be explicitly initialized once setup with the respective
> -		 * KVM device call.
> -		 */
> -		if (kvm->arch.vgic.vgic_model != KVM_DEV_TYPE_ARM_VGIC_V2) {
> -			ret = -EBUSY;
> -			goto out;
> -		}
> -		mutex_lock(&kvm->lock);
> -		ret = vgic_init(kvm);
> -		mutex_unlock(&kvm->lock);
> +	struct irq_phys_map *map;
> +	int ret;
>  
> -		if (ret)
> -			goto out;
> -	}
> +	ret = vgic_lazy_init(kvm);
> +	if (ret)
> +		return ret;
>  
> -	if (irq_num >= min(kvm->arch.vgic.nr_irqs, 1020))
> +	map = vgic_irq_map_search(kvm_get_vcpu(kvm, cpuid), irq_num);
> +	if (map)
>  		return -EINVAL;
>  
> -	vcpu_id = vgic_update_irq_pending(kvm, cpuid, irq_num, level);
> -	if (vcpu_id >= 0) {
> -		/* kick the specified vcpu */
> -		kvm_vcpu_kick(kvm_get_vcpu(kvm, vcpu_id));
> -	}
> +	return vgic_update_irq_pending(kvm, cpuid, NULL, irq_num, level);
> +}
>  
> -out:
> -	return ret;
> +/**
> + * kvm_vgic_inject_mapped_irq - Inject a physically mapped IRQ to the vgic
> + * @kvm:     The VM structure pointer
> + * @cpuid:   The CPU for PPIs
> + * @map:     Pointer to a irq_phys_map structure describing the mapping
> + * @level:   Edge-triggered:  true:  to trigger the interrupt
> + *			      false: to ignore the call
> + *	     Level-sensitive  true:  activates an interrupt
> + *			      false: deactivates an interrupt

just noticed this unfortunate use of the words 'activate/deactivate'
here, which is not true, it just raises/lowers the input signal...

> + *
> + * The GIC is not concerned with devices being active-LOW or active-HIGH for
> + * level-sensitive interrupts.  You can think of the level parameter as 1
> + * being HIGH and 0 being LOW and all devices being active-HIGH.
> + */
> +int kvm_vgic_inject_mapped_irq(struct kvm *kvm, int cpuid,
> +			       struct irq_phys_map *map, bool level)
> +{
> +	int ret;
> +
> +	ret = vgic_lazy_init(kvm);
> +	if (ret)
> +		return ret;
> +
> +	return vgic_update_irq_pending(kvm, cpuid, map, map->virt_irq, level);
>  }
>  
>  static irqreturn_t vgic_maintenance_handler(int irq, void *data)
> -- 
> 2.1.4
> 

Thanks,
-Christoffer

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH v3 10/11] KVM: arm/arm64: timer: Allow the timer to control the active state
  2015-07-24 15:55   ` Marc Zyngier
@ 2015-08-04 13:56     ` Christoffer Dall
  -1 siblings, 0 replies; 74+ messages in thread
From: Christoffer Dall @ 2015-08-04 13:56 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: kvm, kvmarm, linux-arm-kernel, Eric Auger, Alex Bennée,
	Andre Przywara

On Fri, Jul 24, 2015 at 04:55:08PM +0100, Marc Zyngier wrote:
> In order to remove the crude hack where we sneak the masked bit
> into the timer's control register, make use of the phys_irq_map
> API control the active state of the interrupt.
> 
> This causes some limited changes to allow for potential error
> propagation.
> 
> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
> ---
>  arch/arm/kvm/reset.c         |  4 +---
>  arch/arm64/kvm/reset.c       |  4 +---
>  include/kvm/arm_arch_timer.h |  7 +++++--
>  virt/kvm/arm/arch_timer.c    | 29 ++++++++++++++++++++++-------
>  4 files changed, 29 insertions(+), 15 deletions(-)
> 
> diff --git a/arch/arm/kvm/reset.c b/arch/arm/kvm/reset.c
> index f558c07..eeb85858 100644
> --- a/arch/arm/kvm/reset.c
> +++ b/arch/arm/kvm/reset.c
> @@ -77,7 +77,5 @@ int kvm_reset_vcpu(struct kvm_vcpu *vcpu)
>  	kvm_reset_coprocs(vcpu);
>  
>  	/* Reset arch_timer context */
> -	kvm_timer_vcpu_reset(vcpu, cpu_vtimer_irq);
> -
> -	return 0;
> +	return kvm_timer_vcpu_reset(vcpu, cpu_vtimer_irq);
>  }
> diff --git a/arch/arm64/kvm/reset.c b/arch/arm64/kvm/reset.c
> index b4af618..91cf535 100644
> --- a/arch/arm64/kvm/reset.c
> +++ b/arch/arm64/kvm/reset.c
> @@ -121,7 +121,5 @@ int kvm_reset_vcpu(struct kvm_vcpu *vcpu)
>  	kvm_reset_sys_regs(vcpu);
>  
>  	/* Reset timer */
> -	kvm_timer_vcpu_reset(vcpu, cpu_vtimer_irq);
> -
> -	return 0;
> +	return kvm_timer_vcpu_reset(vcpu, cpu_vtimer_irq);
>  }
> diff --git a/include/kvm/arm_arch_timer.h b/include/kvm/arm_arch_timer.h
> index e596675..e1e4d7c 100644
> --- a/include/kvm/arm_arch_timer.h
> +++ b/include/kvm/arm_arch_timer.h
> @@ -52,13 +52,16 @@ struct arch_timer_cpu {
>  
>  	/* Timer IRQ */
>  	const struct kvm_irq_level	*irq;
> +
> +	/* VGIC mapping */
> +	struct irq_phys_map		*map;
>  };
>  
>  int kvm_timer_hyp_init(void);
>  void kvm_timer_enable(struct kvm *kvm);
>  void kvm_timer_init(struct kvm *kvm);
> -void kvm_timer_vcpu_reset(struct kvm_vcpu *vcpu,
> -			  const struct kvm_irq_level *irq);
> +int kvm_timer_vcpu_reset(struct kvm_vcpu *vcpu,
> +			 const struct kvm_irq_level *irq);
>  void kvm_timer_vcpu_init(struct kvm_vcpu *vcpu);
>  void kvm_timer_flush_hwstate(struct kvm_vcpu *vcpu);
>  void kvm_timer_sync_hwstate(struct kvm_vcpu *vcpu);
> diff --git a/virt/kvm/arm/arch_timer.c b/virt/kvm/arm/arch_timer.c
> index 98c95f2..76e38d2 100644
> --- a/virt/kvm/arm/arch_timer.c
> +++ b/virt/kvm/arm/arch_timer.c
> @@ -64,10 +64,10 @@ static void kvm_timer_inject_irq(struct kvm_vcpu *vcpu)
>  	int ret;
>  	struct arch_timer_cpu *timer = &vcpu->arch.timer_cpu;
>  
> -	timer->cntv_ctl |= ARCH_TIMER_CTRL_IT_MASK;
> -	ret = kvm_vgic_inject_irq(vcpu->kvm, vcpu->vcpu_id,
> -				  timer->irq->irq,
> -				  timer->irq->level);
> +	kvm_vgic_set_phys_irq_active(timer->map, true);

actually, I'm not sure I understand the point of setting this boolean?
Is it just a check?  It seems everywhere we find the map in the VGIC
code, we have a BUG_ON(!map->active), so why doesn't (!!map) imply that
it should be marked as active?

Is this again for non-shared devices?

> +	ret = kvm_vgic_inject_mapped_irq(vcpu->kvm, vcpu->vcpu_id,
> +					 timer->map,
> +					 timer->irq->level);
>  	WARN_ON(ret);
>  }
>  
> @@ -117,7 +117,8 @@ bool kvm_timer_should_fire(struct kvm_vcpu *vcpu)
>  	cycle_t cval, now;
>  
>  	if ((timer->cntv_ctl & ARCH_TIMER_CTRL_IT_MASK) ||
> -		!(timer->cntv_ctl & ARCH_TIMER_CTRL_ENABLE))
> +	    !(timer->cntv_ctl & ARCH_TIMER_CTRL_ENABLE) ||
> +	    kvm_vgic_get_phys_irq_active(timer->map))

hmm, maybe this answers my question above.  It is the virtual active
state preventing us to keep injecting the virtual side?

I don't see anything functional wrong with this code though.

Thanks,
-Christoffer


>  		return false;
>  
>  	cval = timer->cntv_cval;
> @@ -184,10 +185,11 @@ void kvm_timer_sync_hwstate(struct kvm_vcpu *vcpu)
>  	timer_arm(timer, ns);
>  }
>  
> -void kvm_timer_vcpu_reset(struct kvm_vcpu *vcpu,
> -			  const struct kvm_irq_level *irq)
> +int kvm_timer_vcpu_reset(struct kvm_vcpu *vcpu,
> +			 const struct kvm_irq_level *irq)
>  {
>  	struct arch_timer_cpu *timer = &vcpu->arch.timer_cpu;
> +	struct irq_phys_map *map;
>  
>  	/*
>  	 * The vcpu timer irq number cannot be determined in
> @@ -196,6 +198,17 @@ void kvm_timer_vcpu_reset(struct kvm_vcpu *vcpu,
>  	 * vcpu timer irq number when the vcpu is reset.
>  	 */
>  	timer->irq = irq;
> +
> +	/*
> +	 * Tell the VGIC that the virtual interrupt is tied to a
> +	 * physical interrupt. We do that once per VCPU.
> +	 */
> +	map = kvm_vgic_map_phys_irq(vcpu, irq->irq, host_vtimer_irq);
> +	if (WARN_ON(IS_ERR(map)))
> +		return PTR_ERR(map);
> +
> +	timer->map = map;
> +	return 0;
>  }
>  
>  void kvm_timer_vcpu_init(struct kvm_vcpu *vcpu)
> @@ -335,6 +348,8 @@ void kvm_timer_vcpu_terminate(struct kvm_vcpu *vcpu)
>  	struct arch_timer_cpu *timer = &vcpu->arch.timer_cpu;
>  
>  	timer_disarm(timer);
> +	if (timer->map)
> +		kvm_vgic_unmap_phys_irq(vcpu, timer->map);
>  }
>  
>  void kvm_timer_enable(struct kvm *kvm)
> -- 
> 2.1.4
> 



^ permalink raw reply	[flat|nested] 74+ messages in thread

* [PATCH v3 10/11] KVM: arm/arm64: timer: Allow the timer to control the active state
@ 2015-08-04 13:56     ` Christoffer Dall
  0 siblings, 0 replies; 74+ messages in thread
From: Christoffer Dall @ 2015-08-04 13:56 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, Jul 24, 2015 at 04:55:08PM +0100, Marc Zyngier wrote:
> In order to remove the crude hack where we sneak the masked bit
> into the timer's control register, make use of the phys_irq_map
> API control the active state of the interrupt.
> 
> This causes some limited changes to allow for potential error
> propagation.
> 
> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
> ---
>  arch/arm/kvm/reset.c         |  4 +---
>  arch/arm64/kvm/reset.c       |  4 +---
>  include/kvm/arm_arch_timer.h |  7 +++++--
>  virt/kvm/arm/arch_timer.c    | 29 ++++++++++++++++++++++-------
>  4 files changed, 29 insertions(+), 15 deletions(-)
> 
> diff --git a/arch/arm/kvm/reset.c b/arch/arm/kvm/reset.c
> index f558c07..eeb85858 100644
> --- a/arch/arm/kvm/reset.c
> +++ b/arch/arm/kvm/reset.c
> @@ -77,7 +77,5 @@ int kvm_reset_vcpu(struct kvm_vcpu *vcpu)
>  	kvm_reset_coprocs(vcpu);
>  
>  	/* Reset arch_timer context */
> -	kvm_timer_vcpu_reset(vcpu, cpu_vtimer_irq);
> -
> -	return 0;
> +	return kvm_timer_vcpu_reset(vcpu, cpu_vtimer_irq);
>  }
> diff --git a/arch/arm64/kvm/reset.c b/arch/arm64/kvm/reset.c
> index b4af618..91cf535 100644
> --- a/arch/arm64/kvm/reset.c
> +++ b/arch/arm64/kvm/reset.c
> @@ -121,7 +121,5 @@ int kvm_reset_vcpu(struct kvm_vcpu *vcpu)
>  	kvm_reset_sys_regs(vcpu);
>  
>  	/* Reset timer */
> -	kvm_timer_vcpu_reset(vcpu, cpu_vtimer_irq);
> -
> -	return 0;
> +	return kvm_timer_vcpu_reset(vcpu, cpu_vtimer_irq);
>  }
> diff --git a/include/kvm/arm_arch_timer.h b/include/kvm/arm_arch_timer.h
> index e596675..e1e4d7c 100644
> --- a/include/kvm/arm_arch_timer.h
> +++ b/include/kvm/arm_arch_timer.h
> @@ -52,13 +52,16 @@ struct arch_timer_cpu {
>  
>  	/* Timer IRQ */
>  	const struct kvm_irq_level	*irq;
> +
> +	/* VGIC mapping */
> +	struct irq_phys_map		*map;
>  };
>  
>  int kvm_timer_hyp_init(void);
>  void kvm_timer_enable(struct kvm *kvm);
>  void kvm_timer_init(struct kvm *kvm);
> -void kvm_timer_vcpu_reset(struct kvm_vcpu *vcpu,
> -			  const struct kvm_irq_level *irq);
> +int kvm_timer_vcpu_reset(struct kvm_vcpu *vcpu,
> +			 const struct kvm_irq_level *irq);
>  void kvm_timer_vcpu_init(struct kvm_vcpu *vcpu);
>  void kvm_timer_flush_hwstate(struct kvm_vcpu *vcpu);
>  void kvm_timer_sync_hwstate(struct kvm_vcpu *vcpu);
> diff --git a/virt/kvm/arm/arch_timer.c b/virt/kvm/arm/arch_timer.c
> index 98c95f2..76e38d2 100644
> --- a/virt/kvm/arm/arch_timer.c
> +++ b/virt/kvm/arm/arch_timer.c
> @@ -64,10 +64,10 @@ static void kvm_timer_inject_irq(struct kvm_vcpu *vcpu)
>  	int ret;
>  	struct arch_timer_cpu *timer = &vcpu->arch.timer_cpu;
>  
> -	timer->cntv_ctl |= ARCH_TIMER_CTRL_IT_MASK;
> -	ret = kvm_vgic_inject_irq(vcpu->kvm, vcpu->vcpu_id,
> -				  timer->irq->irq,
> -				  timer->irq->level);
> +	kvm_vgic_set_phys_irq_active(timer->map, true);

actually, I'm not sure I understand the point of setting this boolean?
Is it just a check?  It seems everywhere we find the map in the VGIC
code, we have a BUG_ON(!map->active), so why doesn't (!!map) imply that
it should be marked as active?

Is this again for non-shared devices?

> +	ret = kvm_vgic_inject_mapped_irq(vcpu->kvm, vcpu->vcpu_id,
> +					 timer->map,
> +					 timer->irq->level);
>  	WARN_ON(ret);
>  }
>  
> @@ -117,7 +117,8 @@ bool kvm_timer_should_fire(struct kvm_vcpu *vcpu)
>  	cycle_t cval, now;
>  
>  	if ((timer->cntv_ctl & ARCH_TIMER_CTRL_IT_MASK) ||
> -		!(timer->cntv_ctl & ARCH_TIMER_CTRL_ENABLE))
> +	    !(timer->cntv_ctl & ARCH_TIMER_CTRL_ENABLE) ||
> +	    kvm_vgic_get_phys_irq_active(timer->map))

hmm, maybe this answers my question above.  It is the virtual active
state preventing us to keep injecting the virtual side?

I don't see anything functional wrong with this code though.

Thanks,
-Christoffer


>  		return false;
>  
>  	cval = timer->cntv_cval;
> @@ -184,10 +185,11 @@ void kvm_timer_sync_hwstate(struct kvm_vcpu *vcpu)
>  	timer_arm(timer, ns);
>  }
>  
> -void kvm_timer_vcpu_reset(struct kvm_vcpu *vcpu,
> -			  const struct kvm_irq_level *irq)
> +int kvm_timer_vcpu_reset(struct kvm_vcpu *vcpu,
> +			 const struct kvm_irq_level *irq)
>  {
>  	struct arch_timer_cpu *timer = &vcpu->arch.timer_cpu;
> +	struct irq_phys_map *map;
>  
>  	/*
>  	 * The vcpu timer irq number cannot be determined in
> @@ -196,6 +198,17 @@ void kvm_timer_vcpu_reset(struct kvm_vcpu *vcpu,
>  	 * vcpu timer irq number when the vcpu is reset.
>  	 */
>  	timer->irq = irq;
> +
> +	/*
> +	 * Tell the VGIC that the virtual interrupt is tied to a
> +	 * physical interrupt. We do that once per VCPU.
> +	 */
> +	map = kvm_vgic_map_phys_irq(vcpu, irq->irq, host_vtimer_irq);
> +	if (WARN_ON(IS_ERR(map)))
> +		return PTR_ERR(map);
> +
> +	timer->map = map;
> +	return 0;
>  }
>  
>  void kvm_timer_vcpu_init(struct kvm_vcpu *vcpu)
> @@ -335,6 +348,8 @@ void kvm_timer_vcpu_terminate(struct kvm_vcpu *vcpu)
>  	struct arch_timer_cpu *timer = &vcpu->arch.timer_cpu;
>  
>  	timer_disarm(timer);
> +	if (timer->map)
> +		kvm_vgic_unmap_phys_irq(vcpu, timer->map);
>  }
>  
>  void kvm_timer_enable(struct kvm *kvm)
> -- 
> 2.1.4
> 

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH v3 11/11] KVM: arm/arm64: vgic: Allow HW interrupts for non-shared devices
  2015-07-24 15:55   ` Marc Zyngier
@ 2015-08-04 14:32     ` Christoffer Dall
  -1 siblings, 0 replies; 74+ messages in thread
From: Christoffer Dall @ 2015-08-04 14:32 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: kvm, kvmarm, linux-arm-kernel, Eric Auger, Alex Bennée,
	Andre Przywara

On Fri, Jul 24, 2015 at 04:55:09PM +0100, Marc Zyngier wrote:
> So far, the only use of the HW interrupt facility is the timer,
> implying that the active state is context-switched for each vcpu,
> as the device is is shared across all vcpus.
> 
> This does not work for a device that has been assigned to a VM,
> as the guest is entierely in control of that device (the HW is
> not shared). In that case, it makes sense to bypass the whole
> active state switching, and only track the deactivation of the
> interrupt.
> 
> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
> ---
>  include/kvm/arm_vgic.h    |  6 ++--
>  virt/kvm/arm/arch_timer.c |  3 +-
>  virt/kvm/arm/vgic.c       | 73 +++++++++++++++++++++++++++++++++++++----------
>  3 files changed, 64 insertions(+), 18 deletions(-)
> 
> diff --git a/include/kvm/arm_vgic.h b/include/kvm/arm_vgic.h
> index f6bfd79..6f0a4e1 100644
> --- a/include/kvm/arm_vgic.h
> +++ b/include/kvm/arm_vgic.h
> @@ -164,7 +164,8 @@ struct irq_phys_map {
>  	u32			phys_irq;
>  	u32			irq;
>  	bool			deleted;
> -	bool			active;
> +	bool			shared;
> +	bool			active; /* Only valid if shared */
>  };
>  
>  struct irq_phys_map_entry {
> @@ -357,7 +358,8 @@ void vgic_v3_dispatch_sgi(struct kvm_vcpu *vcpu, u64 reg);
>  int kvm_vgic_vcpu_pending_irq(struct kvm_vcpu *vcpu);
>  int kvm_vgic_vcpu_active_irq(struct kvm_vcpu *vcpu);
>  struct irq_phys_map *kvm_vgic_map_phys_irq(struct kvm_vcpu *vcpu,
> -					   int virt_irq, int irq);
> +					   int virt_irq, int irq,
> +					   bool shared);
>  int kvm_vgic_unmap_phys_irq(struct kvm_vcpu *vcpu, struct irq_phys_map *map);
>  bool kvm_vgic_get_phys_irq_active(struct irq_phys_map *map);
>  void kvm_vgic_set_phys_irq_active(struct irq_phys_map *map, bool active);
> diff --git a/virt/kvm/arm/arch_timer.c b/virt/kvm/arm/arch_timer.c
> index 76e38d2..db21d8f 100644
> --- a/virt/kvm/arm/arch_timer.c
> +++ b/virt/kvm/arm/arch_timer.c
> @@ -203,7 +203,8 @@ int kvm_timer_vcpu_reset(struct kvm_vcpu *vcpu,
>  	 * Tell the VGIC that the virtual interrupt is tied to a
>  	 * physical interrupt. We do that once per VCPU.
>  	 */
> -	map = kvm_vgic_map_phys_irq(vcpu, irq->irq, host_vtimer_irq);
> +	map = kvm_vgic_map_phys_irq(vcpu, irq->irq,
> +				    host_vtimer_irq, true);
>  	if (WARN_ON(IS_ERR(map)))
>  		return PTR_ERR(map);
>  
> diff --git a/virt/kvm/arm/vgic.c b/virt/kvm/arm/vgic.c
> index e40ef70..5e6b816 100644
> --- a/virt/kvm/arm/vgic.c
> +++ b/virt/kvm/arm/vgic.c
> @@ -1128,19 +1128,25 @@ static void vgic_queue_irq_to_lr(struct kvm_vcpu *vcpu, int irq,
>  		 * active in the physical world. Otherwise the
>  		 * physical interrupt will fire and the guest will
>  		 * exit before processing the virtual interrupt.
> +		 *
> +		 * This is of course only valid for a shared
> +		 * interrupt. A non shared interrupt should already be
> +		 * active.
>  		 */
>  		if (map) {
> -			int ret;
> -
> -			BUG_ON(!map->active);
>  			vlr.hwirq = map->phys_irq;
>  			vlr.state |= LR_HW;
>  			vlr.state &= ~LR_EOI_INT;
>  
> -			ret = irq_set_irqchip_state(map->irq,
> -						    IRQCHIP_STATE_ACTIVE,
> -						    true);
> -			WARN_ON(ret);
> +			if (map->shared) {
> +				int ret;
> +
> +				BUG_ON(!map->active);
> +				ret = irq_set_irqchip_state(map->irq,
> +							    IRQCHIP_STATE_ACTIVE,
> +							    true);
> +				WARN_ON(ret);
> +			}
>  
>  			/*
>  			 * Make sure we're not going to sample this
> @@ -1383,21 +1389,41 @@ static bool vgic_process_maintenance(struct kvm_vcpu *vcpu)
>  static int vgic_sync_hwirq(struct kvm_vcpu *vcpu, struct vgic_lr vlr)
>  {
>  	struct irq_phys_map *map;
> +	bool active;
>  	int ret;
>  
>  	if (!(vlr.state & LR_HW))
>  		return 0;
>  
>  	map = vgic_irq_map_search(vcpu, vlr.irq);
> -	BUG_ON(!map || !map->active);
> +	BUG_ON(!map);
> +	BUG_ON(map->shared && !map->active);
>  
>  	ret = irq_get_irqchip_state(map->irq,
>  				    IRQCHIP_STATE_ACTIVE,
> -				    &map->active);
> +				    &active);
>  
>  	WARN_ON(ret);
>  
> -	if (map->active) {
> +	/*
> +	 * For a non-shared interrupt, we have to cater for two
> +	 * possible deactivation conditions:
> +	 *
> +	 * - the physical interrupt is now inactive (EOIed from the
> +         *   guest)

nit: whitespace funkyness

> +	 * - the physical interrupt is still active, but its virtual
> +	 *   counterpart is flagged as "not queued", indicating another
> +	 *   interrupt has fired between the EOI and the guest exit.
> +	 *
> +	 * Also, we are not reactivating a non-shared interrupt

what does reactivating mean? did you mean deactivate?

> +	 * ourselves. This is always left to the guest.

In which case, add ", because the device is solely owned by the guest."

> +	 */
> +	if (!map->shared)
> +		return !active || !vgic_irq_is_queued(vcpu, vlr.irq);

do you really need the second part of the disjunction?

The effect seems to be that we clear the queued flag once again, and
clear the LR.  If we don't do this, won't we simply pick up the pending
flag next time we're about to run this VCPU and piggy-back on the
existing LR.  Perhaps this is a weird flow though?

> +
> +	map->active = active;
> +
> +	if (active) {
>  		ret = irq_set_irqchip_state(map->irq,
>  					    IRQCHIP_STATE_ACTIVE,
>  					    false);
> @@ -1590,6 +1616,17 @@ static int vgic_update_irq_pending(struct kvm *kvm, int cpuid,
>  		goto out;
>  	}
>  
> +	if (map && !map->shared) {
> +		/*
> +		 * We are told to inject a HW irq, so we have to trust
> +		 * the caller that the previous one has been EOIed,
> +		 * and that a new one is now active. Clearing the
> +		 * queued state will have the effect of making it
> +		 * sample-able again.
> +		 */
> +		vgic_irq_clear_queued(vcpu, irq_num);
> +	}
> +
>  	if (!vgic_can_sample_irq(vcpu, irq_num)) {
>  		/*
>  		 * Level interrupt in progress, will be picked up
> @@ -1720,15 +1757,19 @@ static struct list_head *vgic_get_irq_phys_map_list(struct kvm_vcpu *vcpu,
>   * @vcpu: The VCPU pointer
>   * @virt_irq: The virtual irq number
>   * @irq: The Linux IRQ number
> + * @shared: Indicates if the interrupt has to be context-switched or
> + *          if it is private to a VM
>   *
>   * Establish a mapping between a guest visible irq (@virt_irq) and a
>   * Linux irq (@irq). On injection, @virt_irq will be associated with
> - * the physical interrupt represented by @irq.
> + * the physical interrupt represented by @irq. If @shared is true,
> + * the active state of the interrupt will be context-switched.
>   *
>   * Returns a valid pointer on success, and an error pointer otherwise
>   */
>  struct irq_phys_map *kvm_vgic_map_phys_irq(struct kvm_vcpu *vcpu,
> -					   int virt_irq, int irq)
> +					   int virt_irq, int irq,
> +					   bool shared)
>  {
>  	struct vgic_dist *dist = &vcpu->kvm->arch.vgic;
>  	struct list_head *root = vgic_get_irq_phys_map_list(vcpu, virt_irq);
> @@ -1762,7 +1803,8 @@ struct irq_phys_map *kvm_vgic_map_phys_irq(struct kvm_vcpu *vcpu,
>  	if (map) {
>  		/* Make sure this mapping matches */
>  		if (map->phys_irq != phys_irq	||
> -		    map->irq      != irq)
> +		    map->irq      != irq	||
> +		    map->shared   != shared)
>  			map = ERR_PTR(-EINVAL);
>  
>  		goto out;
> @@ -1772,6 +1814,7 @@ struct irq_phys_map *kvm_vgic_map_phys_irq(struct kvm_vcpu *vcpu,
>  	map->virt_irq = virt_irq;
>  	map->phys_irq = phys_irq;
>  	map->irq      = irq;
> +	map->shared   = shared;
>  
>  	list_add_tail_rcu(&entry->entry, root);
>  
> @@ -1822,7 +1865,7 @@ static void vgic_free_phys_irq_map_rcu(struct rcu_head *rcu)
>   */
>  bool kvm_vgic_get_phys_irq_active(struct irq_phys_map *map)
>  {
> -	BUG_ON(!map);
> +	BUG_ON(!map || !map->shared);
>  	return map->active;
>  }
>  
> @@ -1834,7 +1877,7 @@ bool kvm_vgic_get_phys_irq_active(struct irq_phys_map *map)
>   */
>  void kvm_vgic_set_phys_irq_active(struct irq_phys_map *map, bool active)
>  {
> -	BUG_ON(!map);
> +	BUG_ON(!map || !map->shared);
>  	map->active = active;
>  }
>  
> -- 
> 2.1.4
> 

I think I'm seeing this in the correct light this time around, so this
is functionally looking pretty good to me.

Thanks,
-Christoffer

^ permalink raw reply	[flat|nested] 74+ messages in thread

* [PATCH v3 11/11] KVM: arm/arm64: vgic: Allow HW interrupts for non-shared devices
@ 2015-08-04 14:32     ` Christoffer Dall
  0 siblings, 0 replies; 74+ messages in thread
From: Christoffer Dall @ 2015-08-04 14:32 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, Jul 24, 2015 at 04:55:09PM +0100, Marc Zyngier wrote:
> So far, the only use of the HW interrupt facility is the timer,
> implying that the active state is context-switched for each vcpu,
> as the device is is shared across all vcpus.
> 
> This does not work for a device that has been assigned to a VM,
> as the guest is entierely in control of that device (the HW is
> not shared). In that case, it makes sense to bypass the whole
> active state switching, and only track the deactivation of the
> interrupt.
> 
> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
> ---
>  include/kvm/arm_vgic.h    |  6 ++--
>  virt/kvm/arm/arch_timer.c |  3 +-
>  virt/kvm/arm/vgic.c       | 73 +++++++++++++++++++++++++++++++++++++----------
>  3 files changed, 64 insertions(+), 18 deletions(-)
> 
> diff --git a/include/kvm/arm_vgic.h b/include/kvm/arm_vgic.h
> index f6bfd79..6f0a4e1 100644
> --- a/include/kvm/arm_vgic.h
> +++ b/include/kvm/arm_vgic.h
> @@ -164,7 +164,8 @@ struct irq_phys_map {
>  	u32			phys_irq;
>  	u32			irq;
>  	bool			deleted;
> -	bool			active;
> +	bool			shared;
> +	bool			active; /* Only valid if shared */
>  };
>  
>  struct irq_phys_map_entry {
> @@ -357,7 +358,8 @@ void vgic_v3_dispatch_sgi(struct kvm_vcpu *vcpu, u64 reg);
>  int kvm_vgic_vcpu_pending_irq(struct kvm_vcpu *vcpu);
>  int kvm_vgic_vcpu_active_irq(struct kvm_vcpu *vcpu);
>  struct irq_phys_map *kvm_vgic_map_phys_irq(struct kvm_vcpu *vcpu,
> -					   int virt_irq, int irq);
> +					   int virt_irq, int irq,
> +					   bool shared);
>  int kvm_vgic_unmap_phys_irq(struct kvm_vcpu *vcpu, struct irq_phys_map *map);
>  bool kvm_vgic_get_phys_irq_active(struct irq_phys_map *map);
>  void kvm_vgic_set_phys_irq_active(struct irq_phys_map *map, bool active);
> diff --git a/virt/kvm/arm/arch_timer.c b/virt/kvm/arm/arch_timer.c
> index 76e38d2..db21d8f 100644
> --- a/virt/kvm/arm/arch_timer.c
> +++ b/virt/kvm/arm/arch_timer.c
> @@ -203,7 +203,8 @@ int kvm_timer_vcpu_reset(struct kvm_vcpu *vcpu,
>  	 * Tell the VGIC that the virtual interrupt is tied to a
>  	 * physical interrupt. We do that once per VCPU.
>  	 */
> -	map = kvm_vgic_map_phys_irq(vcpu, irq->irq, host_vtimer_irq);
> +	map = kvm_vgic_map_phys_irq(vcpu, irq->irq,
> +				    host_vtimer_irq, true);
>  	if (WARN_ON(IS_ERR(map)))
>  		return PTR_ERR(map);
>  
> diff --git a/virt/kvm/arm/vgic.c b/virt/kvm/arm/vgic.c
> index e40ef70..5e6b816 100644
> --- a/virt/kvm/arm/vgic.c
> +++ b/virt/kvm/arm/vgic.c
> @@ -1128,19 +1128,25 @@ static void vgic_queue_irq_to_lr(struct kvm_vcpu *vcpu, int irq,
>  		 * active in the physical world. Otherwise the
>  		 * physical interrupt will fire and the guest will
>  		 * exit before processing the virtual interrupt.
> +		 *
> +		 * This is of course only valid for a shared
> +		 * interrupt. A non shared interrupt should already be
> +		 * active.
>  		 */
>  		if (map) {
> -			int ret;
> -
> -			BUG_ON(!map->active);
>  			vlr.hwirq = map->phys_irq;
>  			vlr.state |= LR_HW;
>  			vlr.state &= ~LR_EOI_INT;
>  
> -			ret = irq_set_irqchip_state(map->irq,
> -						    IRQCHIP_STATE_ACTIVE,
> -						    true);
> -			WARN_ON(ret);
> +			if (map->shared) {
> +				int ret;
> +
> +				BUG_ON(!map->active);
> +				ret = irq_set_irqchip_state(map->irq,
> +							    IRQCHIP_STATE_ACTIVE,
> +							    true);
> +				WARN_ON(ret);
> +			}
>  
>  			/*
>  			 * Make sure we're not going to sample this
> @@ -1383,21 +1389,41 @@ static bool vgic_process_maintenance(struct kvm_vcpu *vcpu)
>  static int vgic_sync_hwirq(struct kvm_vcpu *vcpu, struct vgic_lr vlr)
>  {
>  	struct irq_phys_map *map;
> +	bool active;
>  	int ret;
>  
>  	if (!(vlr.state & LR_HW))
>  		return 0;
>  
>  	map = vgic_irq_map_search(vcpu, vlr.irq);
> -	BUG_ON(!map || !map->active);
> +	BUG_ON(!map);
> +	BUG_ON(map->shared && !map->active);
>  
>  	ret = irq_get_irqchip_state(map->irq,
>  				    IRQCHIP_STATE_ACTIVE,
> -				    &map->active);
> +				    &active);
>  
>  	WARN_ON(ret);
>  
> -	if (map->active) {
> +	/*
> +	 * For a non-shared interrupt, we have to cater for two
> +	 * possible deactivation conditions:
> +	 *
> +	 * - the physical interrupt is now inactive (EOIed from the
> +         *   guest)

nit: whitespace funkyness

> +	 * - the physical interrupt is still active, but its virtual
> +	 *   counterpart is flagged as "not queued", indicating another
> +	 *   interrupt has fired between the EOI and the guest exit.
> +	 *
> +	 * Also, we are not reactivating a non-shared interrupt

what does reactivating mean? did you mean deactivate?

> +	 * ourselves. This is always left to the guest.

In which case, add ", because the device is solely owned by the guest."

> +	 */
> +	if (!map->shared)
> +		return !active || !vgic_irq_is_queued(vcpu, vlr.irq);

do you really need the second part of the disjunction?

The effect seems to be that we clear the queued flag once again, and
clear the LR.  If we don't do this, won't we simply pick up the pending
flag next time we're about to run this VCPU and piggy-back on the
existing LR.  Perhaps this is a weird flow though?

> +
> +	map->active = active;
> +
> +	if (active) {
>  		ret = irq_set_irqchip_state(map->irq,
>  					    IRQCHIP_STATE_ACTIVE,
>  					    false);
> @@ -1590,6 +1616,17 @@ static int vgic_update_irq_pending(struct kvm *kvm, int cpuid,
>  		goto out;
>  	}
>  
> +	if (map && !map->shared) {
> +		/*
> +		 * We are told to inject a HW irq, so we have to trust
> +		 * the caller that the previous one has been EOIed,
> +		 * and that a new one is now active. Clearing the
> +		 * queued state will have the effect of making it
> +		 * sample-able again.
> +		 */
> +		vgic_irq_clear_queued(vcpu, irq_num);
> +	}
> +
>  	if (!vgic_can_sample_irq(vcpu, irq_num)) {
>  		/*
>  		 * Level interrupt in progress, will be picked up
> @@ -1720,15 +1757,19 @@ static struct list_head *vgic_get_irq_phys_map_list(struct kvm_vcpu *vcpu,
>   * @vcpu: The VCPU pointer
>   * @virt_irq: The virtual irq number
>   * @irq: The Linux IRQ number
> + * @shared: Indicates if the interrupt has to be context-switched or
> + *          if it is private to a VM
>   *
>   * Establish a mapping between a guest visible irq (@virt_irq) and a
>   * Linux irq (@irq). On injection, @virt_irq will be associated with
> - * the physical interrupt represented by @irq.
> + * the physical interrupt represented by @irq. If @shared is true,
> + * the active state of the interrupt will be context-switched.
>   *
>   * Returns a valid pointer on success, and an error pointer otherwise
>   */
>  struct irq_phys_map *kvm_vgic_map_phys_irq(struct kvm_vcpu *vcpu,
> -					   int virt_irq, int irq)
> +					   int virt_irq, int irq,
> +					   bool shared)
>  {
>  	struct vgic_dist *dist = &vcpu->kvm->arch.vgic;
>  	struct list_head *root = vgic_get_irq_phys_map_list(vcpu, virt_irq);
> @@ -1762,7 +1803,8 @@ struct irq_phys_map *kvm_vgic_map_phys_irq(struct kvm_vcpu *vcpu,
>  	if (map) {
>  		/* Make sure this mapping matches */
>  		if (map->phys_irq != phys_irq	||
> -		    map->irq      != irq)
> +		    map->irq      != irq	||
> +		    map->shared   != shared)
>  			map = ERR_PTR(-EINVAL);
>  
>  		goto out;
> @@ -1772,6 +1814,7 @@ struct irq_phys_map *kvm_vgic_map_phys_irq(struct kvm_vcpu *vcpu,
>  	map->virt_irq = virt_irq;
>  	map->phys_irq = phys_irq;
>  	map->irq      = irq;
> +	map->shared   = shared;
>  
>  	list_add_tail_rcu(&entry->entry, root);
>  
> @@ -1822,7 +1865,7 @@ static void vgic_free_phys_irq_map_rcu(struct rcu_head *rcu)
>   */
>  bool kvm_vgic_get_phys_irq_active(struct irq_phys_map *map)
>  {
> -	BUG_ON(!map);
> +	BUG_ON(!map || !map->shared);
>  	return map->active;
>  }
>  
> @@ -1834,7 +1877,7 @@ bool kvm_vgic_get_phys_irq_active(struct irq_phys_map *map)
>   */
>  void kvm_vgic_set_phys_irq_active(struct irq_phys_map *map, bool active)
>  {
> -	BUG_ON(!map);
> +	BUG_ON(!map || !map->shared);
>  	map->active = active;
>  }
>  
> -- 
> 2.1.4
> 

I think I'm seeing this in the correct light this time around, so this
is functionally looking pretty good to me.

Thanks,
-Christoffer

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH v3 04/11] KVM: arm/arm64: vgic: Allow HW irq to be encoded in LR
  2015-07-24 15:55   ` Marc Zyngier
@ 2015-08-04 14:33     ` Christoffer Dall
  -1 siblings, 0 replies; 74+ messages in thread
From: Christoffer Dall @ 2015-08-04 14:33 UTC (permalink / raw)
  To: Marc Zyngier; +Cc: kvm, Andre Przywara, kvmarm, linux-arm-kernel

On Fri, Jul 24, 2015 at 04:55:02PM +0100, Marc Zyngier wrote:
> Now that struct vgic_lr supports the LR_HW bit and carries a hwirq
> field, we can encode that information into the list registers.
> 
> This patch provides implementations for both GICv2 and GICv3.
> 
> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>

Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>

^ permalink raw reply	[flat|nested] 74+ messages in thread

* [PATCH v3 04/11] KVM: arm/arm64: vgic: Allow HW irq to be encoded in LR
@ 2015-08-04 14:33     ` Christoffer Dall
  0 siblings, 0 replies; 74+ messages in thread
From: Christoffer Dall @ 2015-08-04 14:33 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, Jul 24, 2015 at 04:55:02PM +0100, Marc Zyngier wrote:
> Now that struct vgic_lr supports the LR_HW bit and carries a hwirq
> field, we can encode that information into the list registers.
> 
> This patch provides implementations for both GICv2 and GICv3.
> 
> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>

Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH v3 07/11] KVM: arm/arm64: vgic: Allow HW interrupts to be queued to a guest
  2015-07-24 15:55   ` Marc Zyngier
@ 2015-08-04 14:33     ` Christoffer Dall
  -1 siblings, 0 replies; 74+ messages in thread
From: Christoffer Dall @ 2015-08-04 14:33 UTC (permalink / raw)
  To: Marc Zyngier; +Cc: kvm, Andre Przywara, kvmarm, linux-arm-kernel

On Fri, Jul 24, 2015 at 04:55:05PM +0100, Marc Zyngier wrote:
> To allow a HW interrupt to be injected into a guest, we lookup the
> guest virtual interrupt in the irq_phys_map list, and if we have
> a match, encode both interrupts in the LR.
> 
> We also mark the interrupt as "active" at the host distributor level.
> 
> On guest EOI on the virtual interrupt, the host interrupt will be
> deactivated.
> 
> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>

Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>

^ permalink raw reply	[flat|nested] 74+ messages in thread

* [PATCH v3 07/11] KVM: arm/arm64: vgic: Allow HW interrupts to be queued to a guest
@ 2015-08-04 14:33     ` Christoffer Dall
  0 siblings, 0 replies; 74+ messages in thread
From: Christoffer Dall @ 2015-08-04 14:33 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, Jul 24, 2015 at 04:55:05PM +0100, Marc Zyngier wrote:
> To allow a HW interrupt to be injected into a guest, we lookup the
> guest virtual interrupt in the irq_phys_map list, and if we have
> a match, encode both interrupts in the LR.
> 
> We also mark the interrupt as "active" at the host distributor level.
> 
> On guest EOI on the virtual interrupt, the host interrupt will be
> deactivated.
> 
> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>

Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH v3 06/11] KVM: arm/arm64: vgic: Allow dynamic mapping of physical/virtual interrupts
  2015-08-04 13:04     ` Christoffer Dall
@ 2015-08-04 15:27       ` Marc Zyngier
  -1 siblings, 0 replies; 74+ messages in thread
From: Marc Zyngier @ 2015-08-04 15:27 UTC (permalink / raw)
  To: Christoffer Dall
  Cc: kvm, kvmarm, linux-arm-kernel, Eric Auger, Alex Bennée,
	Andre Przywara

On 04/08/15 14:04, Christoffer Dall wrote:
> On Fri, Jul 24, 2015 at 04:55:04PM +0100, Marc Zyngier wrote:
>> In order to be able to feed physical interrupts to a guest, we need
>> to be able to establish the virtual-physical mapping between the two
>> worlds.
>>
>> The mappings are kept in a set of RCU lists, indexed by virtual interrupts.
>>
>> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
>> ---
>>  arch/arm/kvm/arm.c     |   2 +
>>  include/kvm/arm_vgic.h |  26 +++++++
>>  virt/kvm/arm/vgic.c    | 189 ++++++++++++++++++++++++++++++++++++++++++++++++-
>>  3 files changed, 216 insertions(+), 1 deletion(-)
>>
>> diff --git a/arch/arm/kvm/arm.c b/arch/arm/kvm/arm.c
>> index f1bf418..ce404a5 100644
>> --- a/arch/arm/kvm/arm.c
>> +++ b/arch/arm/kvm/arm.c
>> @@ -125,6 +125,7 @@ int kvm_arch_init_vm(struct kvm *kvm, unsigned long type)
>>       if (ret)
>>               goto out_free_stage2_pgd;
>>
>> +     kvm_vgic_early_init(kvm);
>>       kvm_timer_init(kvm);
>>
>>       /* Mark the initial VMID generation invalid */
>> @@ -249,6 +250,7 @@ out:
>>
>>  void kvm_arch_vcpu_postcreate(struct kvm_vcpu *vcpu)
>>  {
>> +     kvm_vgic_vcpu_early_init(vcpu);
>>  }
>>
>>  void kvm_arch_vcpu_free(struct kvm_vcpu *vcpu)
>> diff --git a/include/kvm/arm_vgic.h b/include/kvm/arm_vgic.h
>> index a881e39..68212a1 100644
>> --- a/include/kvm/arm_vgic.h
>> +++ b/include/kvm/arm_vgic.h
>> @@ -159,6 +159,20 @@ struct vgic_io_device {
>>       struct kvm_io_device dev;
>>  };
>>
>> +struct irq_phys_map {
>> +     u32                     virt_irq;
>> +     u32                     phys_irq;
>> +     u32                     irq;
>> +     bool                    deleted;
>> +     bool                    active;
>> +};
>> +
>> +struct irq_phys_map_entry {
>> +     struct list_head        entry;
>> +     struct rcu_head         rcu;
>> +     struct irq_phys_map     map;
>> +};
>> +
>>  struct vgic_dist {
>>       spinlock_t              lock;
>>       bool                    in_kernel;
>> @@ -256,6 +270,10 @@ struct vgic_dist {
>>       struct vgic_vm_ops      vm_ops;
>>       struct vgic_io_device   dist_iodev;
>>       struct vgic_io_device   *redist_iodevs;
>> +
>> +     /* Virtual irq to hwirq mapping */
>> +     spinlock_t              irq_phys_map_lock;
>> +     struct list_head        irq_phys_map_list;
>>  };
>>
>>  struct vgic_v2_cpu_if {
>> @@ -307,6 +325,9 @@ struct vgic_cpu {
>>               struct vgic_v2_cpu_if   vgic_v2;
>>               struct vgic_v3_cpu_if   vgic_v3;
>>       };
>> +
>> +     /* Protected by the distributor's irq_phys_map_lock */
>> +     struct list_head        irq_phys_map_list;
>>  };
>>
>>  #define LR_EMPTY     0xff
>> @@ -321,8 +342,10 @@ int kvm_vgic_addr(struct kvm *kvm, unsigned long type, u64 *addr, bool write);
>>  int kvm_vgic_hyp_init(void);
>>  int kvm_vgic_map_resources(struct kvm *kvm);
>>  int kvm_vgic_get_max_vcpus(void);
>> +void kvm_vgic_early_init(struct kvm *kvm);
>>  int kvm_vgic_create(struct kvm *kvm, u32 type);
>>  void kvm_vgic_destroy(struct kvm *kvm);
>> +void kvm_vgic_vcpu_early_init(struct kvm_vcpu *vcpu);
>>  void kvm_vgic_vcpu_destroy(struct kvm_vcpu *vcpu);
>>  void kvm_vgic_flush_hwstate(struct kvm_vcpu *vcpu);
>>  void kvm_vgic_sync_hwstate(struct kvm_vcpu *vcpu);
>> @@ -331,6 +354,9 @@ int kvm_vgic_inject_irq(struct kvm *kvm, int cpuid, unsigned int irq_num,
>>  void vgic_v3_dispatch_sgi(struct kvm_vcpu *vcpu, u64 reg);
>>  int kvm_vgic_vcpu_pending_irq(struct kvm_vcpu *vcpu);
>>  int kvm_vgic_vcpu_active_irq(struct kvm_vcpu *vcpu);
>> +struct irq_phys_map *kvm_vgic_map_phys_irq(struct kvm_vcpu *vcpu,
>> +                                        int virt_irq, int irq);
>> +int kvm_vgic_unmap_phys_irq(struct kvm_vcpu *vcpu, struct irq_phys_map *map);
>>
>>  #define irqchip_in_kernel(k) (!!((k)->arch.vgic.in_kernel))
>>  #define vgic_initialized(k)  (!!((k)->arch.vgic.nr_cpus))
>> diff --git a/virt/kvm/arm/vgic.c b/virt/kvm/arm/vgic.c
>> index 5bd1695..c74d929 100644
>> --- a/virt/kvm/arm/vgic.c
>> +++ b/virt/kvm/arm/vgic.c
>> @@ -24,6 +24,7 @@
>>  #include <linux/of.h>
>>  #include <linux/of_address.h>
>>  #include <linux/of_irq.h>
>> +#include <linux/rculist.h>
>>  #include <linux/uaccess.h>
>>
>>  #include <asm/kvm_emulate.h>
>> @@ -82,6 +83,8 @@ static void vgic_retire_disabled_irqs(struct kvm_vcpu *vcpu);
>>  static void vgic_retire_lr(int lr_nr, int irq, struct kvm_vcpu *vcpu);
>>  static struct vgic_lr vgic_get_lr(const struct kvm_vcpu *vcpu, int lr);
>>  static void vgic_set_lr(struct kvm_vcpu *vcpu, int lr, struct vgic_lr lr_desc);
>> +static struct irq_phys_map *vgic_irq_map_search(struct kvm_vcpu *vcpu,
>> +                                             int virt_irq);
>>
>>  static const struct vgic_ops *vgic_ops;
>>  static const struct vgic_params *vgic;
>> @@ -1583,6 +1586,166 @@ static irqreturn_t vgic_maintenance_handler(int irq, void *data)
>>       return IRQ_HANDLED;
>>  }
>>
>> +static struct list_head *vgic_get_irq_phys_map_list(struct kvm_vcpu *vcpu,
>> +                                                 int virt_irq)
>> +{
>> +     if (virt_irq < VGIC_NR_PRIVATE_IRQS)
>> +             return &vcpu->arch.vgic_cpu.irq_phys_map_list;
>> +     else
>> +             return &vcpu->kvm->arch.vgic.irq_phys_map_list;
>> +}
>> +
>> +/**
>> + * kvm_vgic_map_phys_irq - map a virtual IRQ to a physical IRQ
>> + * @vcpu: The VCPU pointer
>> + * @virt_irq: The virtual irq number
>> + * @irq: The Linux IRQ number
>> + *
>> + * Establish a mapping between a guest visible irq (@virt_irq) and a
>> + * Linux irq (@irq). On injection, @virt_irq will be associated with
>> + * the physical interrupt represented by @irq.
>> + *
>> + * Returns a valid pointer on success, and an error pointer otherwise
>> + */
>> +struct irq_phys_map *kvm_vgic_map_phys_irq(struct kvm_vcpu *vcpu,
>> +                                        int virt_irq, int irq)
>> +{
>> +     struct vgic_dist *dist = &vcpu->kvm->arch.vgic;
>> +     struct list_head *root = vgic_get_irq_phys_map_list(vcpu, virt_irq);
>> +     struct irq_phys_map *map;
>> +     struct irq_phys_map_entry *entry;
>> +     struct irq_desc *desc;
>> +     struct irq_data *data;
>> +     int phys_irq;
>> +
>> +     desc = irq_to_desc(irq);
>> +     if (!desc) {
>> +             kvm_err("kvm_vgic_map_phys_irq: no interrupt descriptor\n");
> 
> super over-ridiculous nit:
> if you respin you could consider "%s: no...", __FUNC__

Sure.

> 
>> +             return ERR_PTR(-EINVAL);
>> +     }
>> +
>> +     data = irq_desc_get_irq_data(desc);
>> +     while (data->parent_data)
>> +             data = data->parent_data;
>> +
>> +     phys_irq = data->hwirq;
>> +
>> +     /* Create a new mapping */
>> +     entry = kzalloc(sizeof(*entry), GFP_KERNEL);
>> +     if (!entry)
>> +             return ERR_PTR(-ENOMEM);
>> +
>> +     spin_lock(&dist->irq_phys_map_lock);
>> +
>> +     /* Try to match an existing mapping */
>> +     map = vgic_irq_map_search(vcpu, virt_irq);
>> +     if (map) {
>> +             /* Make sure this mapping matches */
> 
> why do we need to check that it matches?  Is it ever allowed to have
> multiple mappings of a virtual interrupt?
> 

Multiple *different* mappings, no. What you can have though is issuing
requests for the *same* mapping repeatedly. This is what happens with
the timer, for example: kvm_timer_vcpu_reset is going to be called more
than once over the lifetime of a VM, requesting a mapping each time
without necessarily destroying the previous one.

>> +             if (map->phys_irq != phys_irq   ||
>> +                 map->irq      != irq)
>> +                     map = ERR_PTR(-EINVAL);
>> +
>> +             goto out;
> 
> why is this a valid situation?  Shouldn't you return -EEXIST or
> something instead?  (that would also simplify the error-checking free
> routine below).
> 
> If you want it to be valid to map the same virq to irq multiple times
> (to free the caller from some bookkeeping, I presume?) then that should
> be part of the function documentation, and I think we should add a
> "found existing mapping" comment above the goto out statement.

Fair enough. I'll add some comments to that effect.

>> +     }
>> +
>> +     map           = &entry->map;
>> +     map->virt_irq = virt_irq;
>> +     map->phys_irq = phys_irq;
>> +     map->irq      = irq;
>> +
>> +     list_add_tail_rcu(&entry->entry, root);
>> +
>> +out:
>> +     spin_unlock(&dist->irq_phys_map_lock);
>> +     /* If we've found a hit in the existing list, free the useless
>> +      * entry */
>> +     if (IS_ERR(map) || map != &entry->map)
>> +             kfree(entry);
>> +     return map;
>> +}
>> +
>> +static struct irq_phys_map *vgic_irq_map_search(struct kvm_vcpu *vcpu,
>> +                                             int virt_irq)
>> +{
>> +     struct list_head *root = vgic_get_irq_phys_map_list(vcpu, virt_irq);
>> +     struct irq_phys_map_entry *entry;
>> +     struct irq_phys_map *map;
>> +
>> +     rcu_read_lock();
>> +
>> +     list_for_each_entry_rcu(entry, root, entry) {
>> +             map = &entry->map;
>> +             if (map->virt_irq == virt_irq && !map->deleted) {
>> +                     rcu_read_unlock();
>> +                     return map;
>> +             }
>> +     }
>> +
>> +     rcu_read_unlock();
>> +
>> +     return NULL;
>> +}
>> +
>> +static void vgic_free_phys_irq_map_rcu(struct rcu_head *rcu)
>> +{
>> +     struct irq_phys_map_entry *entry;
>> +
>> +     entry = container_of(rcu, struct irq_phys_map_entry, rcu);
>> +     kfree(entry);
>> +}
>> +
>> +/**
>> + * kvm_vgic_unmap_phys_irq - Remove a virtual to physical IRQ mapping
>> + * @vcpu: The VCPU pointer
>> + * @map: The pointer to a mapping obtained through kvm_vgic_map_phys_irq
>> + *
>> + * Remove an existing mapping between virtual and physical interrupts.
>> + */
>> +int kvm_vgic_unmap_phys_irq(struct kvm_vcpu *vcpu, struct irq_phys_map *map)
>> +{
>> +     struct vgic_dist *dist = &vcpu->kvm->arch.vgic;
>> +     struct irq_phys_map_entry *entry;
>> +     struct list_head *root;
>> +
>> +     if (!map)
>> +             return -EINVAL;
>> +
>> +     root = vgic_get_irq_phys_map_list(vcpu, map->virt_irq);
>> +
>> +     spin_lock(&dist->irq_phys_map_lock);
>> +
>> +     list_for_each_entry(entry, root, entry) {
>> +             if (&entry->map == map && !map->deleted) {
>> +                     map->deleted = true;
> 
> why do you need the 'deleted' flag?  You now search the list only after
> holding the lock, so the race I pointed out before doesn't exist
> anymore.  Is there an additional issue that the deleted flag is taking
> care of?

This could still race with a destroy occuring before we take the lock,
and before we get get RCU to do the cleanup (the list would still be
populated).

> 
>> +                     list_del_rcu(&entry->entry);
>> +                     call_rcu(&entry->rcu, vgic_free_phys_irq_map_rcu);
>> +                     break;
>> +             }
>> +     }
>> +
>> +     spin_unlock(&dist->irq_phys_map_lock);
>> +
>> +     return 0;
>> +}
>> +
>> +static void vgic_destroy_irq_phys_map(struct kvm *kvm, struct list_head *root)
>> +{
>> +     struct vgic_dist *dist = &kvm->arch.vgic;
>> +     struct irq_phys_map_entry *entry;
>> +
>> +     spin_lock(&dist->irq_phys_map_lock);
>> +
>> +     list_for_each_entry(entry, root, entry) {
>> +             if (!entry->map.deleted) {
>> +                     entry->map.deleted = true;
>> +                     list_del_rcu(&entry->entry);
>> +                     call_rcu(&entry->rcu, vgic_free_phys_irq_map_rcu);
>> +             }
>> +     }
>> +
>> +     spin_unlock(&dist->irq_phys_map_lock);
>> +}
>> +
>>  void kvm_vgic_vcpu_destroy(struct kvm_vcpu *vcpu)
>>  {
>>       struct vgic_cpu *vgic_cpu = &vcpu->arch.vgic_cpu;
>> @@ -1591,6 +1754,7 @@ void kvm_vgic_vcpu_destroy(struct kvm_vcpu *vcpu)
>>       kfree(vgic_cpu->active_shared);
>>       kfree(vgic_cpu->pend_act_shared);
>>       kfree(vgic_cpu->vgic_irq_lr_map);
>> +     vgic_destroy_irq_phys_map(vcpu->kvm, &vgic_cpu->irq_phys_map_list);
>>       vgic_cpu->pending_shared = NULL;
>>       vgic_cpu->active_shared = NULL;
>>       vgic_cpu->pend_act_shared = NULL;
>> @@ -1628,6 +1792,17 @@ static int vgic_vcpu_init_maps(struct kvm_vcpu *vcpu, int nr_irqs)
>>  }
>>
>>  /**
>> + * kvm_vgic_vcpu_early_init - Earliest possible per-vcpu vgic init stage
> 
> earliest possible... oh how I love the vgic code.
> 

You must be the only one! ;-)

> If you have the energy and context to write a comment in the beginning
> of this file describing the init flow, I think that would be useful...
> 

How about starting with this?

<quote>
There are multiple stages to the vgic initialization, both for the
distributor and the CPU interfaces.

Distributor:

- kvm_vgic_early_init(): initialization of static data that doesn't
depend on any sizing information or emulation type. No allocation is
allowed there.

- vgic_init(): allocation and initialization of the generic data
structures that depend on sizing information (number of CPUs, number of
interrupts). Also initializes the vcpu specific data structures. Can be
executed lazily for GICv2. [to be renamed to kvm_vgic_init??]

CPU Interface:

- kvm_vgic_cpu_early_init(): initialization of static data that doesn't
depend on any sizing information or emulation type. No allocation is
allowed there.
</quote>

I'm sure we can expand it as we go...

>> + *
>> + * No memory allocation should be performed here, only static init.
>> + */
>> +void kvm_vgic_vcpu_early_init(struct kvm_vcpu *vcpu)
>> +{
>> +     struct vgic_cpu *vgic_cpu = &vcpu->arch.vgic_cpu;
>> +     INIT_LIST_HEAD(&vgic_cpu->irq_phys_map_list);
>> +}
>> +
>> +/**
>>   * kvm_vgic_get_max_vcpus - Get the maximum number of VCPUs allowed by HW
>>   *
>>   * The host's GIC naturally limits the maximum amount of VCPUs a guest
>> @@ -1664,6 +1839,7 @@ void kvm_vgic_destroy(struct kvm *kvm)
>>       kfree(dist->irq_spi_target);
>>       kfree(dist->irq_pending_on_cpu);
>>       kfree(dist->irq_active_on_cpu);
>> +     vgic_destroy_irq_phys_map(kvm, &dist->irq_phys_map_list);
>>       dist->irq_sgi_sources = NULL;
>>       dist->irq_spi_cpu = NULL;
>>       dist->irq_spi_target = NULL;
>> @@ -1787,6 +1963,18 @@ static int init_vgic_model(struct kvm *kvm, int type)
>>       return 0;
>>  }
>>
>> +/**
>> + * kvm_vgic_early_init - Earliest possible vgic initialization stage
>> + *
>> + * No memory allocation should be performed here, only static init.
>> + */
>> +void kvm_vgic_early_init(struct kvm *kvm)
>> +{
>> +     spin_lock_init(&kvm->arch.vgic.lock);
> 
> so here we're initializing the data structures even before the vgic is
> created, right?  So therefore it's safe to move the vgic.lock init up
> here?

Yes. The distributor structure is part of the struct kvm, hence safe to
poke early on. This doesn't depend on any form of sizing or emulation
type either.

> 
>> +     spin_lock_init(&kvm->arch.vgic.irq_phys_map_lock);
>> +     INIT_LIST_HEAD(&kvm->arch.vgic.irq_phys_map_list);
>> +}
>> +
>>  int kvm_vgic_create(struct kvm *kvm, u32 type)
>>  {
>>       int i, vcpu_lock_idx = -1, ret;
>> @@ -1832,7 +2020,6 @@ int kvm_vgic_create(struct kvm *kvm, u32 type)
>>       if (ret)
>>               goto out_unlock;
>>
>> -     spin_lock_init(&kvm->arch.vgic.lock);
>>       kvm->arch.vgic.in_kernel = true;
>>       kvm->arch.vgic.vgic_model = type;
>>       kvm->arch.vgic.vctrl_base = vgic->vctrl_base;
>> --
>> 2.1.4
>>
> 
> Thanks,
> -Christoffer
> 

Thanks,

	M.
-- 
Jazz is not dead. It just smells funny...

^ permalink raw reply	[flat|nested] 74+ messages in thread

* [PATCH v3 06/11] KVM: arm/arm64: vgic: Allow dynamic mapping of physical/virtual interrupts
@ 2015-08-04 15:27       ` Marc Zyngier
  0 siblings, 0 replies; 74+ messages in thread
From: Marc Zyngier @ 2015-08-04 15:27 UTC (permalink / raw)
  To: linux-arm-kernel

On 04/08/15 14:04, Christoffer Dall wrote:
> On Fri, Jul 24, 2015 at 04:55:04PM +0100, Marc Zyngier wrote:
>> In order to be able to feed physical interrupts to a guest, we need
>> to be able to establish the virtual-physical mapping between the two
>> worlds.
>>
>> The mappings are kept in a set of RCU lists, indexed by virtual interrupts.
>>
>> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
>> ---
>>  arch/arm/kvm/arm.c     |   2 +
>>  include/kvm/arm_vgic.h |  26 +++++++
>>  virt/kvm/arm/vgic.c    | 189 ++++++++++++++++++++++++++++++++++++++++++++++++-
>>  3 files changed, 216 insertions(+), 1 deletion(-)
>>
>> diff --git a/arch/arm/kvm/arm.c b/arch/arm/kvm/arm.c
>> index f1bf418..ce404a5 100644
>> --- a/arch/arm/kvm/arm.c
>> +++ b/arch/arm/kvm/arm.c
>> @@ -125,6 +125,7 @@ int kvm_arch_init_vm(struct kvm *kvm, unsigned long type)
>>       if (ret)
>>               goto out_free_stage2_pgd;
>>
>> +     kvm_vgic_early_init(kvm);
>>       kvm_timer_init(kvm);
>>
>>       /* Mark the initial VMID generation invalid */
>> @@ -249,6 +250,7 @@ out:
>>
>>  void kvm_arch_vcpu_postcreate(struct kvm_vcpu *vcpu)
>>  {
>> +     kvm_vgic_vcpu_early_init(vcpu);
>>  }
>>
>>  void kvm_arch_vcpu_free(struct kvm_vcpu *vcpu)
>> diff --git a/include/kvm/arm_vgic.h b/include/kvm/arm_vgic.h
>> index a881e39..68212a1 100644
>> --- a/include/kvm/arm_vgic.h
>> +++ b/include/kvm/arm_vgic.h
>> @@ -159,6 +159,20 @@ struct vgic_io_device {
>>       struct kvm_io_device dev;
>>  };
>>
>> +struct irq_phys_map {
>> +     u32                     virt_irq;
>> +     u32                     phys_irq;
>> +     u32                     irq;
>> +     bool                    deleted;
>> +     bool                    active;
>> +};
>> +
>> +struct irq_phys_map_entry {
>> +     struct list_head        entry;
>> +     struct rcu_head         rcu;
>> +     struct irq_phys_map     map;
>> +};
>> +
>>  struct vgic_dist {
>>       spinlock_t              lock;
>>       bool                    in_kernel;
>> @@ -256,6 +270,10 @@ struct vgic_dist {
>>       struct vgic_vm_ops      vm_ops;
>>       struct vgic_io_device   dist_iodev;
>>       struct vgic_io_device   *redist_iodevs;
>> +
>> +     /* Virtual irq to hwirq mapping */
>> +     spinlock_t              irq_phys_map_lock;
>> +     struct list_head        irq_phys_map_list;
>>  };
>>
>>  struct vgic_v2_cpu_if {
>> @@ -307,6 +325,9 @@ struct vgic_cpu {
>>               struct vgic_v2_cpu_if   vgic_v2;
>>               struct vgic_v3_cpu_if   vgic_v3;
>>       };
>> +
>> +     /* Protected by the distributor's irq_phys_map_lock */
>> +     struct list_head        irq_phys_map_list;
>>  };
>>
>>  #define LR_EMPTY     0xff
>> @@ -321,8 +342,10 @@ int kvm_vgic_addr(struct kvm *kvm, unsigned long type, u64 *addr, bool write);
>>  int kvm_vgic_hyp_init(void);
>>  int kvm_vgic_map_resources(struct kvm *kvm);
>>  int kvm_vgic_get_max_vcpus(void);
>> +void kvm_vgic_early_init(struct kvm *kvm);
>>  int kvm_vgic_create(struct kvm *kvm, u32 type);
>>  void kvm_vgic_destroy(struct kvm *kvm);
>> +void kvm_vgic_vcpu_early_init(struct kvm_vcpu *vcpu);
>>  void kvm_vgic_vcpu_destroy(struct kvm_vcpu *vcpu);
>>  void kvm_vgic_flush_hwstate(struct kvm_vcpu *vcpu);
>>  void kvm_vgic_sync_hwstate(struct kvm_vcpu *vcpu);
>> @@ -331,6 +354,9 @@ int kvm_vgic_inject_irq(struct kvm *kvm, int cpuid, unsigned int irq_num,
>>  void vgic_v3_dispatch_sgi(struct kvm_vcpu *vcpu, u64 reg);
>>  int kvm_vgic_vcpu_pending_irq(struct kvm_vcpu *vcpu);
>>  int kvm_vgic_vcpu_active_irq(struct kvm_vcpu *vcpu);
>> +struct irq_phys_map *kvm_vgic_map_phys_irq(struct kvm_vcpu *vcpu,
>> +                                        int virt_irq, int irq);
>> +int kvm_vgic_unmap_phys_irq(struct kvm_vcpu *vcpu, struct irq_phys_map *map);
>>
>>  #define irqchip_in_kernel(k) (!!((k)->arch.vgic.in_kernel))
>>  #define vgic_initialized(k)  (!!((k)->arch.vgic.nr_cpus))
>> diff --git a/virt/kvm/arm/vgic.c b/virt/kvm/arm/vgic.c
>> index 5bd1695..c74d929 100644
>> --- a/virt/kvm/arm/vgic.c
>> +++ b/virt/kvm/arm/vgic.c
>> @@ -24,6 +24,7 @@
>>  #include <linux/of.h>
>>  #include <linux/of_address.h>
>>  #include <linux/of_irq.h>
>> +#include <linux/rculist.h>
>>  #include <linux/uaccess.h>
>>
>>  #include <asm/kvm_emulate.h>
>> @@ -82,6 +83,8 @@ static void vgic_retire_disabled_irqs(struct kvm_vcpu *vcpu);
>>  static void vgic_retire_lr(int lr_nr, int irq, struct kvm_vcpu *vcpu);
>>  static struct vgic_lr vgic_get_lr(const struct kvm_vcpu *vcpu, int lr);
>>  static void vgic_set_lr(struct kvm_vcpu *vcpu, int lr, struct vgic_lr lr_desc);
>> +static struct irq_phys_map *vgic_irq_map_search(struct kvm_vcpu *vcpu,
>> +                                             int virt_irq);
>>
>>  static const struct vgic_ops *vgic_ops;
>>  static const struct vgic_params *vgic;
>> @@ -1583,6 +1586,166 @@ static irqreturn_t vgic_maintenance_handler(int irq, void *data)
>>       return IRQ_HANDLED;
>>  }
>>
>> +static struct list_head *vgic_get_irq_phys_map_list(struct kvm_vcpu *vcpu,
>> +                                                 int virt_irq)
>> +{
>> +     if (virt_irq < VGIC_NR_PRIVATE_IRQS)
>> +             return &vcpu->arch.vgic_cpu.irq_phys_map_list;
>> +     else
>> +             return &vcpu->kvm->arch.vgic.irq_phys_map_list;
>> +}
>> +
>> +/**
>> + * kvm_vgic_map_phys_irq - map a virtual IRQ to a physical IRQ
>> + * @vcpu: The VCPU pointer
>> + * @virt_irq: The virtual irq number
>> + * @irq: The Linux IRQ number
>> + *
>> + * Establish a mapping between a guest visible irq (@virt_irq) and a
>> + * Linux irq (@irq). On injection, @virt_irq will be associated with
>> + * the physical interrupt represented by @irq.
>> + *
>> + * Returns a valid pointer on success, and an error pointer otherwise
>> + */
>> +struct irq_phys_map *kvm_vgic_map_phys_irq(struct kvm_vcpu *vcpu,
>> +                                        int virt_irq, int irq)
>> +{
>> +     struct vgic_dist *dist = &vcpu->kvm->arch.vgic;
>> +     struct list_head *root = vgic_get_irq_phys_map_list(vcpu, virt_irq);
>> +     struct irq_phys_map *map;
>> +     struct irq_phys_map_entry *entry;
>> +     struct irq_desc *desc;
>> +     struct irq_data *data;
>> +     int phys_irq;
>> +
>> +     desc = irq_to_desc(irq);
>> +     if (!desc) {
>> +             kvm_err("kvm_vgic_map_phys_irq: no interrupt descriptor\n");
> 
> super over-ridiculous nit:
> if you respin you could consider "%s: no...", __FUNC__

Sure.

> 
>> +             return ERR_PTR(-EINVAL);
>> +     }
>> +
>> +     data = irq_desc_get_irq_data(desc);
>> +     while (data->parent_data)
>> +             data = data->parent_data;
>> +
>> +     phys_irq = data->hwirq;
>> +
>> +     /* Create a new mapping */
>> +     entry = kzalloc(sizeof(*entry), GFP_KERNEL);
>> +     if (!entry)
>> +             return ERR_PTR(-ENOMEM);
>> +
>> +     spin_lock(&dist->irq_phys_map_lock);
>> +
>> +     /* Try to match an existing mapping */
>> +     map = vgic_irq_map_search(vcpu, virt_irq);
>> +     if (map) {
>> +             /* Make sure this mapping matches */
> 
> why do we need to check that it matches?  Is it ever allowed to have
> multiple mappings of a virtual interrupt?
> 

Multiple *different* mappings, no. What you can have though is issuing
requests for the *same* mapping repeatedly. This is what happens with
the timer, for example: kvm_timer_vcpu_reset is going to be called more
than once over the lifetime of a VM, requesting a mapping each time
without necessarily destroying the previous one.

>> +             if (map->phys_irq != phys_irq   ||
>> +                 map->irq      != irq)
>> +                     map = ERR_PTR(-EINVAL);
>> +
>> +             goto out;
> 
> why is this a valid situation?  Shouldn't you return -EEXIST or
> something instead?  (that would also simplify the error-checking free
> routine below).
> 
> If you want it to be valid to map the same virq to irq multiple times
> (to free the caller from some bookkeeping, I presume?) then that should
> be part of the function documentation, and I think we should add a
> "found existing mapping" comment above the goto out statement.

Fair enough. I'll add some comments to that effect.

>> +     }
>> +
>> +     map           = &entry->map;
>> +     map->virt_irq = virt_irq;
>> +     map->phys_irq = phys_irq;
>> +     map->irq      = irq;
>> +
>> +     list_add_tail_rcu(&entry->entry, root);
>> +
>> +out:
>> +     spin_unlock(&dist->irq_phys_map_lock);
>> +     /* If we've found a hit in the existing list, free the useless
>> +      * entry */
>> +     if (IS_ERR(map) || map != &entry->map)
>> +             kfree(entry);
>> +     return map;
>> +}
>> +
>> +static struct irq_phys_map *vgic_irq_map_search(struct kvm_vcpu *vcpu,
>> +                                             int virt_irq)
>> +{
>> +     struct list_head *root = vgic_get_irq_phys_map_list(vcpu, virt_irq);
>> +     struct irq_phys_map_entry *entry;
>> +     struct irq_phys_map *map;
>> +
>> +     rcu_read_lock();
>> +
>> +     list_for_each_entry_rcu(entry, root, entry) {
>> +             map = &entry->map;
>> +             if (map->virt_irq == virt_irq && !map->deleted) {
>> +                     rcu_read_unlock();
>> +                     return map;
>> +             }
>> +     }
>> +
>> +     rcu_read_unlock();
>> +
>> +     return NULL;
>> +}
>> +
>> +static void vgic_free_phys_irq_map_rcu(struct rcu_head *rcu)
>> +{
>> +     struct irq_phys_map_entry *entry;
>> +
>> +     entry = container_of(rcu, struct irq_phys_map_entry, rcu);
>> +     kfree(entry);
>> +}
>> +
>> +/**
>> + * kvm_vgic_unmap_phys_irq - Remove a virtual to physical IRQ mapping
>> + * @vcpu: The VCPU pointer
>> + * @map: The pointer to a mapping obtained through kvm_vgic_map_phys_irq
>> + *
>> + * Remove an existing mapping between virtual and physical interrupts.
>> + */
>> +int kvm_vgic_unmap_phys_irq(struct kvm_vcpu *vcpu, struct irq_phys_map *map)
>> +{
>> +     struct vgic_dist *dist = &vcpu->kvm->arch.vgic;
>> +     struct irq_phys_map_entry *entry;
>> +     struct list_head *root;
>> +
>> +     if (!map)
>> +             return -EINVAL;
>> +
>> +     root = vgic_get_irq_phys_map_list(vcpu, map->virt_irq);
>> +
>> +     spin_lock(&dist->irq_phys_map_lock);
>> +
>> +     list_for_each_entry(entry, root, entry) {
>> +             if (&entry->map == map && !map->deleted) {
>> +                     map->deleted = true;
> 
> why do you need the 'deleted' flag?  You now search the list only after
> holding the lock, so the race I pointed out before doesn't exist
> anymore.  Is there an additional issue that the deleted flag is taking
> care of?

This could still race with a destroy occuring before we take the lock,
and before we get get RCU to do the cleanup (the list would still be
populated).

> 
>> +                     list_del_rcu(&entry->entry);
>> +                     call_rcu(&entry->rcu, vgic_free_phys_irq_map_rcu);
>> +                     break;
>> +             }
>> +     }
>> +
>> +     spin_unlock(&dist->irq_phys_map_lock);
>> +
>> +     return 0;
>> +}
>> +
>> +static void vgic_destroy_irq_phys_map(struct kvm *kvm, struct list_head *root)
>> +{
>> +     struct vgic_dist *dist = &kvm->arch.vgic;
>> +     struct irq_phys_map_entry *entry;
>> +
>> +     spin_lock(&dist->irq_phys_map_lock);
>> +
>> +     list_for_each_entry(entry, root, entry) {
>> +             if (!entry->map.deleted) {
>> +                     entry->map.deleted = true;
>> +                     list_del_rcu(&entry->entry);
>> +                     call_rcu(&entry->rcu, vgic_free_phys_irq_map_rcu);
>> +             }
>> +     }
>> +
>> +     spin_unlock(&dist->irq_phys_map_lock);
>> +}
>> +
>>  void kvm_vgic_vcpu_destroy(struct kvm_vcpu *vcpu)
>>  {
>>       struct vgic_cpu *vgic_cpu = &vcpu->arch.vgic_cpu;
>> @@ -1591,6 +1754,7 @@ void kvm_vgic_vcpu_destroy(struct kvm_vcpu *vcpu)
>>       kfree(vgic_cpu->active_shared);
>>       kfree(vgic_cpu->pend_act_shared);
>>       kfree(vgic_cpu->vgic_irq_lr_map);
>> +     vgic_destroy_irq_phys_map(vcpu->kvm, &vgic_cpu->irq_phys_map_list);
>>       vgic_cpu->pending_shared = NULL;
>>       vgic_cpu->active_shared = NULL;
>>       vgic_cpu->pend_act_shared = NULL;
>> @@ -1628,6 +1792,17 @@ static int vgic_vcpu_init_maps(struct kvm_vcpu *vcpu, int nr_irqs)
>>  }
>>
>>  /**
>> + * kvm_vgic_vcpu_early_init - Earliest possible per-vcpu vgic init stage
> 
> earliest possible... oh how I love the vgic code.
> 

You must be the only one! ;-)

> If you have the energy and context to write a comment in the beginning
> of this file describing the init flow, I think that would be useful...
> 

How about starting with this?

<quote>
There are multiple stages to the vgic initialization, both for the
distributor and the CPU interfaces.

Distributor:

- kvm_vgic_early_init(): initialization of static data that doesn't
depend on any sizing information or emulation type. No allocation is
allowed there.

- vgic_init(): allocation and initialization of the generic data
structures that depend on sizing information (number of CPUs, number of
interrupts). Also initializes the vcpu specific data structures. Can be
executed lazily for GICv2. [to be renamed to kvm_vgic_init??]

CPU Interface:

- kvm_vgic_cpu_early_init(): initialization of static data that doesn't
depend on any sizing information or emulation type. No allocation is
allowed there.
</quote>

I'm sure we can expand it as we go...

>> + *
>> + * No memory allocation should be performed here, only static init.
>> + */
>> +void kvm_vgic_vcpu_early_init(struct kvm_vcpu *vcpu)
>> +{
>> +     struct vgic_cpu *vgic_cpu = &vcpu->arch.vgic_cpu;
>> +     INIT_LIST_HEAD(&vgic_cpu->irq_phys_map_list);
>> +}
>> +
>> +/**
>>   * kvm_vgic_get_max_vcpus - Get the maximum number of VCPUs allowed by HW
>>   *
>>   * The host's GIC naturally limits the maximum amount of VCPUs a guest
>> @@ -1664,6 +1839,7 @@ void kvm_vgic_destroy(struct kvm *kvm)
>>       kfree(dist->irq_spi_target);
>>       kfree(dist->irq_pending_on_cpu);
>>       kfree(dist->irq_active_on_cpu);
>> +     vgic_destroy_irq_phys_map(kvm, &dist->irq_phys_map_list);
>>       dist->irq_sgi_sources = NULL;
>>       dist->irq_spi_cpu = NULL;
>>       dist->irq_spi_target = NULL;
>> @@ -1787,6 +1963,18 @@ static int init_vgic_model(struct kvm *kvm, int type)
>>       return 0;
>>  }
>>
>> +/**
>> + * kvm_vgic_early_init - Earliest possible vgic initialization stage
>> + *
>> + * No memory allocation should be performed here, only static init.
>> + */
>> +void kvm_vgic_early_init(struct kvm *kvm)
>> +{
>> +     spin_lock_init(&kvm->arch.vgic.lock);
> 
> so here we're initializing the data structures even before the vgic is
> created, right?  So therefore it's safe to move the vgic.lock init up
> here?

Yes. The distributor structure is part of the struct kvm, hence safe to
poke early on. This doesn't depend on any form of sizing or emulation
type either.

> 
>> +     spin_lock_init(&kvm->arch.vgic.irq_phys_map_lock);
>> +     INIT_LIST_HEAD(&kvm->arch.vgic.irq_phys_map_list);
>> +}
>> +
>>  int kvm_vgic_create(struct kvm *kvm, u32 type)
>>  {
>>       int i, vcpu_lock_idx = -1, ret;
>> @@ -1832,7 +2020,6 @@ int kvm_vgic_create(struct kvm *kvm, u32 type)
>>       if (ret)
>>               goto out_unlock;
>>
>> -     spin_lock_init(&kvm->arch.vgic.lock);
>>       kvm->arch.vgic.in_kernel = true;
>>       kvm->arch.vgic.vgic_model = type;
>>       kvm->arch.vgic.vctrl_base = vgic->vctrl_base;
>> --
>> 2.1.4
>>
> 
> Thanks,
> -Christoffer
> 

Thanks,

	M.
-- 
Jazz is not dead. It just smells funny...

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH v3 09/11] KVM: arm/arm64: vgic: Prevent userspace injection of a mapped interrupt
  2015-08-04 13:45     ` Christoffer Dall
@ 2015-08-04 16:02       ` Marc Zyngier
  -1 siblings, 0 replies; 74+ messages in thread
From: Marc Zyngier @ 2015-08-04 16:02 UTC (permalink / raw)
  To: Christoffer Dall
  Cc: kvm, kvmarm, linux-arm-kernel, Eric Auger, Alex Bennée,
	Andre Przywara

On 04/08/15 14:45, Christoffer Dall wrote:
> On Fri, Jul 24, 2015 at 04:55:07PM +0100, Marc Zyngier wrote:
>> Virtual interrupts mapped to a HW interrupt should only be triggered
>> from inside the kernel. Otherwise, you could end up confusing the
>> kernel (and the GIC's) state machine.
>>
>> Rearrange the injection path so that kvm_vgic_inject_irq is
>> used for non-mapped interrupts, and kvm_vgic_inject_mapped_irq is
>> used for mapped interrupts. The latter should only be called from
>> inside the kernel (timer, VFIO).
>>
>> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
>> ---
>>  include/kvm/arm_vgic.h |  2 +
>>  virt/kvm/arm/vgic.c    | 99 ++++++++++++++++++++++++++++++++++----------------
>>  2 files changed, 70 insertions(+), 31 deletions(-)
>>
>> diff --git a/include/kvm/arm_vgic.h b/include/kvm/arm_vgic.h
>> index 7306b4b..f6bfd79 100644
>> --- a/include/kvm/arm_vgic.h
>> +++ b/include/kvm/arm_vgic.h
>> @@ -351,6 +351,8 @@ void kvm_vgic_flush_hwstate(struct kvm_vcpu *vcpu);
>>  void kvm_vgic_sync_hwstate(struct kvm_vcpu *vcpu);
>>  int kvm_vgic_inject_irq(struct kvm *kvm, int cpuid, unsigned int irq_num,
>>  			bool level);
>> +int kvm_vgic_inject_mapped_irq(struct kvm *kvm, int cpuid,
>> +			       struct irq_phys_map *map, bool level);
>>  void vgic_v3_dispatch_sgi(struct kvm_vcpu *vcpu, u64 reg);
>>  int kvm_vgic_vcpu_pending_irq(struct kvm_vcpu *vcpu);
>>  int kvm_vgic_vcpu_active_irq(struct kvm_vcpu *vcpu);
>> diff --git a/virt/kvm/arm/vgic.c b/virt/kvm/arm/vgic.c
>> index 3f7b690..e40ef70 100644
>> --- a/virt/kvm/arm/vgic.c
>> +++ b/virt/kvm/arm/vgic.c
>> @@ -1533,7 +1533,8 @@ static int vgic_validate_injection(struct kvm_vcpu *vcpu, int irq, int level)
>>  }
>>  
>>  static int vgic_update_irq_pending(struct kvm *kvm, int cpuid,
>> -				  unsigned int irq_num, bool level)
>> +				   struct irq_phys_map *map,
>> +				   unsigned int irq_num, bool level)
>>  {
>>  	struct vgic_dist *dist = &kvm->arch.vgic;
>>  	struct kvm_vcpu *vcpu;
>> @@ -1541,6 +1542,9 @@ static int vgic_update_irq_pending(struct kvm *kvm, int cpuid,
>>  	int enabled;
>>  	bool ret = true, can_inject = true;
>>  
>> +	if (irq_num >= min(kvm->arch.vgic.nr_irqs, 1020))
>> +		return -EINVAL;
>> +
>>  	spin_lock(&dist->lock);
>>  
>>  	vcpu = kvm_get_vcpu(kvm, cpuid);
>> @@ -1603,14 +1607,42 @@ static int vgic_update_irq_pending(struct kvm *kvm, int cpuid,
>>  out:
>>  	spin_unlock(&dist->lock);
>>  
>> -	return ret ? cpuid : -EINVAL;
>> +	if (!ret) {
> 
> don't you mean if (ret) here?  hint: ret is a bool

Ouch. Nice catch!

> 
>> +		/* kick the specified vcpu */
>> +		kvm_vcpu_kick(kvm_get_vcpu(kvm, cpuid));
>> +	}
>> +
>> +	return 0;
> 
> isn't this a change in the internal API?
> Before, we would return -EINVAL when ret is false.  Not sure if this
> has any consequences though?

I don't think this is a change in API. Before this patch, we would
either return a vcpuid or -EINVAL. But the error would not be propagated
beyond kvm_vgic_inject_irq, effectively discarding the error code.

Also, it is a bit odd to return an error because the toggling of the
line wasn't significant (like bringing the line down on an
edge-triggered interrupt).

> 
>> +}
>> +
>> +static int vgic_lazy_init(struct kvm *kvm)
>> +{
>> +	int ret = 0;
>> +
>> +	if (unlikely(!vgic_initialized(kvm))) {
>> +		/*
>> +		 * We only provide the automatic initialization of the VGIC
>> +		 * for the legacy case of a GICv2. Any other type must
>> +		 * be explicitly initialized once setup with the respective
>> +		 * KVM device call.
>> +		 */
>> +		if (kvm->arch.vgic.vgic_model != KVM_DEV_TYPE_ARM_VGIC_V2)
>> +			return -EBUSY;
>> +
>> +		mutex_lock(&kvm->lock);
>> +		ret = vgic_init(kvm);
>> +		mutex_unlock(&kvm->lock);
>> +	}
>> +
>> +	return ret;
>>  }
>>  
>>  /**
>>   * kvm_vgic_inject_irq - Inject an IRQ from a device to the vgic
>>   * @kvm:     The VM structure pointer
>>   * @cpuid:   The CPU for PPIs
>> - * @irq_num: The IRQ number that is assigned to the device
>> + * @irq_num: The IRQ number that is assigned to the device. This IRQ
>> + *           must not be mapped to a HW interrupt.
>>   * @level:   Edge-triggered:  true:  to trigger the interrupt
>>   *			      false: to ignore the call
>>   *	     Level-sensitive  true:  activates an interrupt
>> @@ -1623,39 +1655,44 @@ out:
>>  int kvm_vgic_inject_irq(struct kvm *kvm, int cpuid, unsigned int irq_num,
>>  			bool level)
>>  {
>> -	int ret = 0;
>> -	int vcpu_id;
>> -
>> -	if (unlikely(!vgic_initialized(kvm))) {
>> -		/*
>> -		 * We only provide the automatic initialization of the VGIC
>> -		 * for the legacy case of a GICv2. Any other type must
>> -		 * be explicitly initialized once setup with the respective
>> -		 * KVM device call.
>> -		 */
>> -		if (kvm->arch.vgic.vgic_model != KVM_DEV_TYPE_ARM_VGIC_V2) {
>> -			ret = -EBUSY;
>> -			goto out;
>> -		}
>> -		mutex_lock(&kvm->lock);
>> -		ret = vgic_init(kvm);
>> -		mutex_unlock(&kvm->lock);
>> +	struct irq_phys_map *map;
>> +	int ret;
>>  
>> -		if (ret)
>> -			goto out;
>> -	}
>> +	ret = vgic_lazy_init(kvm);
>> +	if (ret)
>> +		return ret;
>>  
>> -	if (irq_num >= min(kvm->arch.vgic.nr_irqs, 1020))
>> +	map = vgic_irq_map_search(kvm_get_vcpu(kvm, cpuid), irq_num);
>> +	if (map)
>>  		return -EINVAL;
>>  
>> -	vcpu_id = vgic_update_irq_pending(kvm, cpuid, irq_num, level);
>> -	if (vcpu_id >= 0) {
>> -		/* kick the specified vcpu */
>> -		kvm_vcpu_kick(kvm_get_vcpu(kvm, vcpu_id));
>> -	}
>> +	return vgic_update_irq_pending(kvm, cpuid, NULL, irq_num, level);
>> +}
>>  
>> -out:
>> -	return ret;
>> +/**
>> + * kvm_vgic_inject_mapped_irq - Inject a physically mapped IRQ to the vgic
>> + * @kvm:     The VM structure pointer
>> + * @cpuid:   The CPU for PPIs
>> + * @map:     Pointer to a irq_phys_map structure describing the mapping
>> + * @level:   Edge-triggered:  true:  to trigger the interrupt
>> + *			      false: to ignore the call
>> + *	     Level-sensitive  true:  activates an interrupt
>> + *			      false: deactivates an interrupt
> 
> just noticed this unfortunate use of the words 'activate/deactivate'
> here, which is not true, it just raises/lowers the input signal...
> 

I'll clean that up.

>> + *
>> + * The GIC is not concerned with devices being active-LOW or active-HIGH for
>> + * level-sensitive interrupts.  You can think of the level parameter as 1
>> + * being HIGH and 0 being LOW and all devices being active-HIGH.
>> + */
>> +int kvm_vgic_inject_mapped_irq(struct kvm *kvm, int cpuid,
>> +			       struct irq_phys_map *map, bool level)
>> +{
>> +	int ret;
>> +
>> +	ret = vgic_lazy_init(kvm);
>> +	if (ret)
>> +		return ret;
>> +
>> +	return vgic_update_irq_pending(kvm, cpuid, map, map->virt_irq, level);
>>  }
>>  
>>  static irqreturn_t vgic_maintenance_handler(int irq, void *data)
>> -- 
>> 2.1.4
>>
> 
> Thanks,
> -Christoffer
> 

Thanks,

	M.
-- 
Jazz is not dead. It just smells funny...

^ permalink raw reply	[flat|nested] 74+ messages in thread

* [PATCH v3 09/11] KVM: arm/arm64: vgic: Prevent userspace injection of a mapped interrupt
@ 2015-08-04 16:02       ` Marc Zyngier
  0 siblings, 0 replies; 74+ messages in thread
From: Marc Zyngier @ 2015-08-04 16:02 UTC (permalink / raw)
  To: linux-arm-kernel

On 04/08/15 14:45, Christoffer Dall wrote:
> On Fri, Jul 24, 2015 at 04:55:07PM +0100, Marc Zyngier wrote:
>> Virtual interrupts mapped to a HW interrupt should only be triggered
>> from inside the kernel. Otherwise, you could end up confusing the
>> kernel (and the GIC's) state machine.
>>
>> Rearrange the injection path so that kvm_vgic_inject_irq is
>> used for non-mapped interrupts, and kvm_vgic_inject_mapped_irq is
>> used for mapped interrupts. The latter should only be called from
>> inside the kernel (timer, VFIO).
>>
>> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
>> ---
>>  include/kvm/arm_vgic.h |  2 +
>>  virt/kvm/arm/vgic.c    | 99 ++++++++++++++++++++++++++++++++++----------------
>>  2 files changed, 70 insertions(+), 31 deletions(-)
>>
>> diff --git a/include/kvm/arm_vgic.h b/include/kvm/arm_vgic.h
>> index 7306b4b..f6bfd79 100644
>> --- a/include/kvm/arm_vgic.h
>> +++ b/include/kvm/arm_vgic.h
>> @@ -351,6 +351,8 @@ void kvm_vgic_flush_hwstate(struct kvm_vcpu *vcpu);
>>  void kvm_vgic_sync_hwstate(struct kvm_vcpu *vcpu);
>>  int kvm_vgic_inject_irq(struct kvm *kvm, int cpuid, unsigned int irq_num,
>>  			bool level);
>> +int kvm_vgic_inject_mapped_irq(struct kvm *kvm, int cpuid,
>> +			       struct irq_phys_map *map, bool level);
>>  void vgic_v3_dispatch_sgi(struct kvm_vcpu *vcpu, u64 reg);
>>  int kvm_vgic_vcpu_pending_irq(struct kvm_vcpu *vcpu);
>>  int kvm_vgic_vcpu_active_irq(struct kvm_vcpu *vcpu);
>> diff --git a/virt/kvm/arm/vgic.c b/virt/kvm/arm/vgic.c
>> index 3f7b690..e40ef70 100644
>> --- a/virt/kvm/arm/vgic.c
>> +++ b/virt/kvm/arm/vgic.c
>> @@ -1533,7 +1533,8 @@ static int vgic_validate_injection(struct kvm_vcpu *vcpu, int irq, int level)
>>  }
>>  
>>  static int vgic_update_irq_pending(struct kvm *kvm, int cpuid,
>> -				  unsigned int irq_num, bool level)
>> +				   struct irq_phys_map *map,
>> +				   unsigned int irq_num, bool level)
>>  {
>>  	struct vgic_dist *dist = &kvm->arch.vgic;
>>  	struct kvm_vcpu *vcpu;
>> @@ -1541,6 +1542,9 @@ static int vgic_update_irq_pending(struct kvm *kvm, int cpuid,
>>  	int enabled;
>>  	bool ret = true, can_inject = true;
>>  
>> +	if (irq_num >= min(kvm->arch.vgic.nr_irqs, 1020))
>> +		return -EINVAL;
>> +
>>  	spin_lock(&dist->lock);
>>  
>>  	vcpu = kvm_get_vcpu(kvm, cpuid);
>> @@ -1603,14 +1607,42 @@ static int vgic_update_irq_pending(struct kvm *kvm, int cpuid,
>>  out:
>>  	spin_unlock(&dist->lock);
>>  
>> -	return ret ? cpuid : -EINVAL;
>> +	if (!ret) {
> 
> don't you mean if (ret) here?  hint: ret is a bool

Ouch. Nice catch!

> 
>> +		/* kick the specified vcpu */
>> +		kvm_vcpu_kick(kvm_get_vcpu(kvm, cpuid));
>> +	}
>> +
>> +	return 0;
> 
> isn't this a change in the internal API?
> Before, we would return -EINVAL when ret is false.  Not sure if this
> has any consequences though?

I don't think this is a change in API. Before this patch, we would
either return a vcpuid or -EINVAL. But the error would not be propagated
beyond kvm_vgic_inject_irq, effectively discarding the error code.

Also, it is a bit odd to return an error because the toggling of the
line wasn't significant (like bringing the line down on an
edge-triggered interrupt).

> 
>> +}
>> +
>> +static int vgic_lazy_init(struct kvm *kvm)
>> +{
>> +	int ret = 0;
>> +
>> +	if (unlikely(!vgic_initialized(kvm))) {
>> +		/*
>> +		 * We only provide the automatic initialization of the VGIC
>> +		 * for the legacy case of a GICv2. Any other type must
>> +		 * be explicitly initialized once setup with the respective
>> +		 * KVM device call.
>> +		 */
>> +		if (kvm->arch.vgic.vgic_model != KVM_DEV_TYPE_ARM_VGIC_V2)
>> +			return -EBUSY;
>> +
>> +		mutex_lock(&kvm->lock);
>> +		ret = vgic_init(kvm);
>> +		mutex_unlock(&kvm->lock);
>> +	}
>> +
>> +	return ret;
>>  }
>>  
>>  /**
>>   * kvm_vgic_inject_irq - Inject an IRQ from a device to the vgic
>>   * @kvm:     The VM structure pointer
>>   * @cpuid:   The CPU for PPIs
>> - * @irq_num: The IRQ number that is assigned to the device
>> + * @irq_num: The IRQ number that is assigned to the device. This IRQ
>> + *           must not be mapped to a HW interrupt.
>>   * @level:   Edge-triggered:  true:  to trigger the interrupt
>>   *			      false: to ignore the call
>>   *	     Level-sensitive  true:  activates an interrupt
>> @@ -1623,39 +1655,44 @@ out:
>>  int kvm_vgic_inject_irq(struct kvm *kvm, int cpuid, unsigned int irq_num,
>>  			bool level)
>>  {
>> -	int ret = 0;
>> -	int vcpu_id;
>> -
>> -	if (unlikely(!vgic_initialized(kvm))) {
>> -		/*
>> -		 * We only provide the automatic initialization of the VGIC
>> -		 * for the legacy case of a GICv2. Any other type must
>> -		 * be explicitly initialized once setup with the respective
>> -		 * KVM device call.
>> -		 */
>> -		if (kvm->arch.vgic.vgic_model != KVM_DEV_TYPE_ARM_VGIC_V2) {
>> -			ret = -EBUSY;
>> -			goto out;
>> -		}
>> -		mutex_lock(&kvm->lock);
>> -		ret = vgic_init(kvm);
>> -		mutex_unlock(&kvm->lock);
>> +	struct irq_phys_map *map;
>> +	int ret;
>>  
>> -		if (ret)
>> -			goto out;
>> -	}
>> +	ret = vgic_lazy_init(kvm);
>> +	if (ret)
>> +		return ret;
>>  
>> -	if (irq_num >= min(kvm->arch.vgic.nr_irqs, 1020))
>> +	map = vgic_irq_map_search(kvm_get_vcpu(kvm, cpuid), irq_num);
>> +	if (map)
>>  		return -EINVAL;
>>  
>> -	vcpu_id = vgic_update_irq_pending(kvm, cpuid, irq_num, level);
>> -	if (vcpu_id >= 0) {
>> -		/* kick the specified vcpu */
>> -		kvm_vcpu_kick(kvm_get_vcpu(kvm, vcpu_id));
>> -	}
>> +	return vgic_update_irq_pending(kvm, cpuid, NULL, irq_num, level);
>> +}
>>  
>> -out:
>> -	return ret;
>> +/**
>> + * kvm_vgic_inject_mapped_irq - Inject a physically mapped IRQ to the vgic
>> + * @kvm:     The VM structure pointer
>> + * @cpuid:   The CPU for PPIs
>> + * @map:     Pointer to a irq_phys_map structure describing the mapping
>> + * @level:   Edge-triggered:  true:  to trigger the interrupt
>> + *			      false: to ignore the call
>> + *	     Level-sensitive  true:  activates an interrupt
>> + *			      false: deactivates an interrupt
> 
> just noticed this unfortunate use of the words 'activate/deactivate'
> here, which is not true, it just raises/lowers the input signal...
> 

I'll clean that up.

>> + *
>> + * The GIC is not concerned with devices being active-LOW or active-HIGH for
>> + * level-sensitive interrupts.  You can think of the level parameter as 1
>> + * being HIGH and 0 being LOW and all devices being active-HIGH.
>> + */
>> +int kvm_vgic_inject_mapped_irq(struct kvm *kvm, int cpuid,
>> +			       struct irq_phys_map *map, bool level)
>> +{
>> +	int ret;
>> +
>> +	ret = vgic_lazy_init(kvm);
>> +	if (ret)
>> +		return ret;
>> +
>> +	return vgic_update_irq_pending(kvm, cpuid, map, map->virt_irq, level);
>>  }
>>  
>>  static irqreturn_t vgic_maintenance_handler(int irq, void *data)
>> -- 
>> 2.1.4
>>
> 
> Thanks,
> -Christoffer
> 

Thanks,

	M.
-- 
Jazz is not dead. It just smells funny...

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH v3 10/11] KVM: arm/arm64: timer: Allow the timer to control the active state
  2015-08-04 13:56     ` Christoffer Dall
@ 2015-08-04 16:14       ` Marc Zyngier
  -1 siblings, 0 replies; 74+ messages in thread
From: Marc Zyngier @ 2015-08-04 16:14 UTC (permalink / raw)
  To: Christoffer Dall
  Cc: kvm, kvmarm, linux-arm-kernel, Eric Auger, Alex Bennée,
	Andre Przywara

On 04/08/15 14:56, Christoffer Dall wrote:
> On Fri, Jul 24, 2015 at 04:55:08PM +0100, Marc Zyngier wrote:
>> In order to remove the crude hack where we sneak the masked bit
>> into the timer's control register, make use of the phys_irq_map
>> API control the active state of the interrupt.
>>
>> This causes some limited changes to allow for potential error
>> propagation.
>>
>> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
>> ---
>>  arch/arm/kvm/reset.c         |  4 +---
>>  arch/arm64/kvm/reset.c       |  4 +---
>>  include/kvm/arm_arch_timer.h |  7 +++++--
>>  virt/kvm/arm/arch_timer.c    | 29 ++++++++++++++++++++++-------
>>  4 files changed, 29 insertions(+), 15 deletions(-)
>>
>> diff --git a/arch/arm/kvm/reset.c b/arch/arm/kvm/reset.c
>> index f558c07..eeb85858 100644
>> --- a/arch/arm/kvm/reset.c
>> +++ b/arch/arm/kvm/reset.c
>> @@ -77,7 +77,5 @@ int kvm_reset_vcpu(struct kvm_vcpu *vcpu)
>>  	kvm_reset_coprocs(vcpu);
>>  
>>  	/* Reset arch_timer context */
>> -	kvm_timer_vcpu_reset(vcpu, cpu_vtimer_irq);
>> -
>> -	return 0;
>> +	return kvm_timer_vcpu_reset(vcpu, cpu_vtimer_irq);
>>  }
>> diff --git a/arch/arm64/kvm/reset.c b/arch/arm64/kvm/reset.c
>> index b4af618..91cf535 100644
>> --- a/arch/arm64/kvm/reset.c
>> +++ b/arch/arm64/kvm/reset.c
>> @@ -121,7 +121,5 @@ int kvm_reset_vcpu(struct kvm_vcpu *vcpu)
>>  	kvm_reset_sys_regs(vcpu);
>>  
>>  	/* Reset timer */
>> -	kvm_timer_vcpu_reset(vcpu, cpu_vtimer_irq);
>> -
>> -	return 0;
>> +	return kvm_timer_vcpu_reset(vcpu, cpu_vtimer_irq);
>>  }
>> diff --git a/include/kvm/arm_arch_timer.h b/include/kvm/arm_arch_timer.h
>> index e596675..e1e4d7c 100644
>> --- a/include/kvm/arm_arch_timer.h
>> +++ b/include/kvm/arm_arch_timer.h
>> @@ -52,13 +52,16 @@ struct arch_timer_cpu {
>>  
>>  	/* Timer IRQ */
>>  	const struct kvm_irq_level	*irq;
>> +
>> +	/* VGIC mapping */
>> +	struct irq_phys_map		*map;
>>  };
>>  
>>  int kvm_timer_hyp_init(void);
>>  void kvm_timer_enable(struct kvm *kvm);
>>  void kvm_timer_init(struct kvm *kvm);
>> -void kvm_timer_vcpu_reset(struct kvm_vcpu *vcpu,
>> -			  const struct kvm_irq_level *irq);
>> +int kvm_timer_vcpu_reset(struct kvm_vcpu *vcpu,
>> +			 const struct kvm_irq_level *irq);
>>  void kvm_timer_vcpu_init(struct kvm_vcpu *vcpu);
>>  void kvm_timer_flush_hwstate(struct kvm_vcpu *vcpu);
>>  void kvm_timer_sync_hwstate(struct kvm_vcpu *vcpu);
>> diff --git a/virt/kvm/arm/arch_timer.c b/virt/kvm/arm/arch_timer.c
>> index 98c95f2..76e38d2 100644
>> --- a/virt/kvm/arm/arch_timer.c
>> +++ b/virt/kvm/arm/arch_timer.c
>> @@ -64,10 +64,10 @@ static void kvm_timer_inject_irq(struct kvm_vcpu *vcpu)
>>  	int ret;
>>  	struct arch_timer_cpu *timer = &vcpu->arch.timer_cpu;
>>  
>> -	timer->cntv_ctl |= ARCH_TIMER_CTRL_IT_MASK;
>> -	ret = kvm_vgic_inject_irq(vcpu->kvm, vcpu->vcpu_id,
>> -				  timer->irq->irq,
>> -				  timer->irq->level);
>> +	kvm_vgic_set_phys_irq_active(timer->map, true);
> 
> actually, I'm not sure I understand the point of setting this boolean?
> Is it just a check?  It seems everywhere we find the map in the VGIC
> code, we have a BUG_ON(!map->active), so why doesn't (!!map) imply that
> it should be marked as active?
> 
> Is this again for non-shared devices?
> 

Setting the boolean here means that the interrupt should be marked
active at the distributor level when the interrupt is injected. This
also allows the timer code to find out when we can inject another
interrupt (by checking that the interrupt is now inactive).

As for the BUG_ON(), this is really some debug stuff to make sure we
have a consistent state.

Non-shared devices are handled separately (and setting the active state
doesn't make much sense, as we expect this to be entirely HW driven).

Do you see anything I should add to make it more clear?

>> +	ret = kvm_vgic_inject_mapped_irq(vcpu->kvm, vcpu->vcpu_id,
>> +					 timer->map,
>> +					 timer->irq->level);
>>  	WARN_ON(ret);
>>  }
>>  
>> @@ -117,7 +117,8 @@ bool kvm_timer_should_fire(struct kvm_vcpu *vcpu)
>>  	cycle_t cval, now;
>>  
>>  	if ((timer->cntv_ctl & ARCH_TIMER_CTRL_IT_MASK) ||
>> -		!(timer->cntv_ctl & ARCH_TIMER_CTRL_ENABLE))
>> +	    !(timer->cntv_ctl & ARCH_TIMER_CTRL_ENABLE) ||
>> +	    kvm_vgic_get_phys_irq_active(timer->map))
> 
> hmm, maybe this answers my question above.  It is the virtual active
> state preventing us to keep injecting the virtual side?
> 

Exactly. This is effectively what the MASK state was doing, now
replicated in SW...

> I don't see anything functional wrong with this code though.
> 
> Thanks,
> -Christoffer
> 
> 
>>  		return false;
>>  
>>  	cval = timer->cntv_cval;
>> @@ -184,10 +185,11 @@ void kvm_timer_sync_hwstate(struct kvm_vcpu *vcpu)
>>  	timer_arm(timer, ns);
>>  }
>>  
>> -void kvm_timer_vcpu_reset(struct kvm_vcpu *vcpu,
>> -			  const struct kvm_irq_level *irq)
>> +int kvm_timer_vcpu_reset(struct kvm_vcpu *vcpu,
>> +			 const struct kvm_irq_level *irq)
>>  {
>>  	struct arch_timer_cpu *timer = &vcpu->arch.timer_cpu;
>> +	struct irq_phys_map *map;
>>  
>>  	/*
>>  	 * The vcpu timer irq number cannot be determined in
>> @@ -196,6 +198,17 @@ void kvm_timer_vcpu_reset(struct kvm_vcpu *vcpu,
>>  	 * vcpu timer irq number when the vcpu is reset.
>>  	 */
>>  	timer->irq = irq;
>> +
>> +	/*
>> +	 * Tell the VGIC that the virtual interrupt is tied to a
>> +	 * physical interrupt. We do that once per VCPU.
>> +	 */
>> +	map = kvm_vgic_map_phys_irq(vcpu, irq->irq, host_vtimer_irq);
>> +	if (WARN_ON(IS_ERR(map)))
>> +		return PTR_ERR(map);
>> +
>> +	timer->map = map;
>> +	return 0;
>>  }
>>  
>>  void kvm_timer_vcpu_init(struct kvm_vcpu *vcpu)
>> @@ -335,6 +348,8 @@ void kvm_timer_vcpu_terminate(struct kvm_vcpu *vcpu)
>>  	struct arch_timer_cpu *timer = &vcpu->arch.timer_cpu;
>>  
>>  	timer_disarm(timer);
>> +	if (timer->map)
>> +		kvm_vgic_unmap_phys_irq(vcpu, timer->map);
>>  }
>>  
>>  void kvm_timer_enable(struct kvm *kvm)
>> -- 
>> 2.1.4
>>
> 
> 

Thanks,

	M.
-- 
Jazz is not dead. It just smells funny...

^ permalink raw reply	[flat|nested] 74+ messages in thread

* [PATCH v3 10/11] KVM: arm/arm64: timer: Allow the timer to control the active state
@ 2015-08-04 16:14       ` Marc Zyngier
  0 siblings, 0 replies; 74+ messages in thread
From: Marc Zyngier @ 2015-08-04 16:14 UTC (permalink / raw)
  To: linux-arm-kernel

On 04/08/15 14:56, Christoffer Dall wrote:
> On Fri, Jul 24, 2015 at 04:55:08PM +0100, Marc Zyngier wrote:
>> In order to remove the crude hack where we sneak the masked bit
>> into the timer's control register, make use of the phys_irq_map
>> API control the active state of the interrupt.
>>
>> This causes some limited changes to allow for potential error
>> propagation.
>>
>> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
>> ---
>>  arch/arm/kvm/reset.c         |  4 +---
>>  arch/arm64/kvm/reset.c       |  4 +---
>>  include/kvm/arm_arch_timer.h |  7 +++++--
>>  virt/kvm/arm/arch_timer.c    | 29 ++++++++++++++++++++++-------
>>  4 files changed, 29 insertions(+), 15 deletions(-)
>>
>> diff --git a/arch/arm/kvm/reset.c b/arch/arm/kvm/reset.c
>> index f558c07..eeb85858 100644
>> --- a/arch/arm/kvm/reset.c
>> +++ b/arch/arm/kvm/reset.c
>> @@ -77,7 +77,5 @@ int kvm_reset_vcpu(struct kvm_vcpu *vcpu)
>>  	kvm_reset_coprocs(vcpu);
>>  
>>  	/* Reset arch_timer context */
>> -	kvm_timer_vcpu_reset(vcpu, cpu_vtimer_irq);
>> -
>> -	return 0;
>> +	return kvm_timer_vcpu_reset(vcpu, cpu_vtimer_irq);
>>  }
>> diff --git a/arch/arm64/kvm/reset.c b/arch/arm64/kvm/reset.c
>> index b4af618..91cf535 100644
>> --- a/arch/arm64/kvm/reset.c
>> +++ b/arch/arm64/kvm/reset.c
>> @@ -121,7 +121,5 @@ int kvm_reset_vcpu(struct kvm_vcpu *vcpu)
>>  	kvm_reset_sys_regs(vcpu);
>>  
>>  	/* Reset timer */
>> -	kvm_timer_vcpu_reset(vcpu, cpu_vtimer_irq);
>> -
>> -	return 0;
>> +	return kvm_timer_vcpu_reset(vcpu, cpu_vtimer_irq);
>>  }
>> diff --git a/include/kvm/arm_arch_timer.h b/include/kvm/arm_arch_timer.h
>> index e596675..e1e4d7c 100644
>> --- a/include/kvm/arm_arch_timer.h
>> +++ b/include/kvm/arm_arch_timer.h
>> @@ -52,13 +52,16 @@ struct arch_timer_cpu {
>>  
>>  	/* Timer IRQ */
>>  	const struct kvm_irq_level	*irq;
>> +
>> +	/* VGIC mapping */
>> +	struct irq_phys_map		*map;
>>  };
>>  
>>  int kvm_timer_hyp_init(void);
>>  void kvm_timer_enable(struct kvm *kvm);
>>  void kvm_timer_init(struct kvm *kvm);
>> -void kvm_timer_vcpu_reset(struct kvm_vcpu *vcpu,
>> -			  const struct kvm_irq_level *irq);
>> +int kvm_timer_vcpu_reset(struct kvm_vcpu *vcpu,
>> +			 const struct kvm_irq_level *irq);
>>  void kvm_timer_vcpu_init(struct kvm_vcpu *vcpu);
>>  void kvm_timer_flush_hwstate(struct kvm_vcpu *vcpu);
>>  void kvm_timer_sync_hwstate(struct kvm_vcpu *vcpu);
>> diff --git a/virt/kvm/arm/arch_timer.c b/virt/kvm/arm/arch_timer.c
>> index 98c95f2..76e38d2 100644
>> --- a/virt/kvm/arm/arch_timer.c
>> +++ b/virt/kvm/arm/arch_timer.c
>> @@ -64,10 +64,10 @@ static void kvm_timer_inject_irq(struct kvm_vcpu *vcpu)
>>  	int ret;
>>  	struct arch_timer_cpu *timer = &vcpu->arch.timer_cpu;
>>  
>> -	timer->cntv_ctl |= ARCH_TIMER_CTRL_IT_MASK;
>> -	ret = kvm_vgic_inject_irq(vcpu->kvm, vcpu->vcpu_id,
>> -				  timer->irq->irq,
>> -				  timer->irq->level);
>> +	kvm_vgic_set_phys_irq_active(timer->map, true);
> 
> actually, I'm not sure I understand the point of setting this boolean?
> Is it just a check?  It seems everywhere we find the map in the VGIC
> code, we have a BUG_ON(!map->active), so why doesn't (!!map) imply that
> it should be marked as active?
> 
> Is this again for non-shared devices?
> 

Setting the boolean here means that the interrupt should be marked
active at the distributor level when the interrupt is injected. This
also allows the timer code to find out when we can inject another
interrupt (by checking that the interrupt is now inactive).

As for the BUG_ON(), this is really some debug stuff to make sure we
have a consistent state.

Non-shared devices are handled separately (and setting the active state
doesn't make much sense, as we expect this to be entirely HW driven).

Do you see anything I should add to make it more clear?

>> +	ret = kvm_vgic_inject_mapped_irq(vcpu->kvm, vcpu->vcpu_id,
>> +					 timer->map,
>> +					 timer->irq->level);
>>  	WARN_ON(ret);
>>  }
>>  
>> @@ -117,7 +117,8 @@ bool kvm_timer_should_fire(struct kvm_vcpu *vcpu)
>>  	cycle_t cval, now;
>>  
>>  	if ((timer->cntv_ctl & ARCH_TIMER_CTRL_IT_MASK) ||
>> -		!(timer->cntv_ctl & ARCH_TIMER_CTRL_ENABLE))
>> +	    !(timer->cntv_ctl & ARCH_TIMER_CTRL_ENABLE) ||
>> +	    kvm_vgic_get_phys_irq_active(timer->map))
> 
> hmm, maybe this answers my question above.  It is the virtual active
> state preventing us to keep injecting the virtual side?
> 

Exactly. This is effectively what the MASK state was doing, now
replicated in SW...

> I don't see anything functional wrong with this code though.
> 
> Thanks,
> -Christoffer
> 
> 
>>  		return false;
>>  
>>  	cval = timer->cntv_cval;
>> @@ -184,10 +185,11 @@ void kvm_timer_sync_hwstate(struct kvm_vcpu *vcpu)
>>  	timer_arm(timer, ns);
>>  }
>>  
>> -void kvm_timer_vcpu_reset(struct kvm_vcpu *vcpu,
>> -			  const struct kvm_irq_level *irq)
>> +int kvm_timer_vcpu_reset(struct kvm_vcpu *vcpu,
>> +			 const struct kvm_irq_level *irq)
>>  {
>>  	struct arch_timer_cpu *timer = &vcpu->arch.timer_cpu;
>> +	struct irq_phys_map *map;
>>  
>>  	/*
>>  	 * The vcpu timer irq number cannot be determined in
>> @@ -196,6 +198,17 @@ void kvm_timer_vcpu_reset(struct kvm_vcpu *vcpu,
>>  	 * vcpu timer irq number when the vcpu is reset.
>>  	 */
>>  	timer->irq = irq;
>> +
>> +	/*
>> +	 * Tell the VGIC that the virtual interrupt is tied to a
>> +	 * physical interrupt. We do that once per VCPU.
>> +	 */
>> +	map = kvm_vgic_map_phys_irq(vcpu, irq->irq, host_vtimer_irq);
>> +	if (WARN_ON(IS_ERR(map)))
>> +		return PTR_ERR(map);
>> +
>> +	timer->map = map;
>> +	return 0;
>>  }
>>  
>>  void kvm_timer_vcpu_init(struct kvm_vcpu *vcpu)
>> @@ -335,6 +348,8 @@ void kvm_timer_vcpu_terminate(struct kvm_vcpu *vcpu)
>>  	struct arch_timer_cpu *timer = &vcpu->arch.timer_cpu;
>>  
>>  	timer_disarm(timer);
>> +	if (timer->map)
>> +		kvm_vgic_unmap_phys_irq(vcpu, timer->map);
>>  }
>>  
>>  void kvm_timer_enable(struct kvm *kvm)
>> -- 
>> 2.1.4
>>
> 
> 

Thanks,

	M.
-- 
Jazz is not dead. It just smells funny...

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH v3 09/11] KVM: arm/arm64: vgic: Prevent userspace injection of a mapped interrupt
  2015-07-24 15:55   ` Marc Zyngier
@ 2015-08-04 16:21     ` Eric Auger
  -1 siblings, 0 replies; 74+ messages in thread
From: Eric Auger @ 2015-08-04 16:21 UTC (permalink / raw)
  To: Marc Zyngier, kvm, kvmarm, linux-arm-kernel
  Cc: Christoffer Dall, Alex Bennée, Andre Przywara

Hi Marc,
On 07/24/2015 05:55 PM, Marc Zyngier wrote:
> Virtual interrupts mapped to a HW interrupt should only be triggered
> from inside the kernel. Otherwise, you could end up confusing the
> kernel (and the GIC's) state machine.
> 
> Rearrange the injection path so that kvm_vgic_inject_irq is
> used for non-mapped interrupts, and kvm_vgic_inject_mapped_irq is
> used for mapped interrupts. The latter should only be called from
> inside the kernel (timer, VFIO).
nit: I would replace VFIO by irqfd.
VFIO just triggers the eventfd/irqfd. This is KVM/irqfd that injects the
virtual irq upon the irqfd signaling and he irqfd adaptation/ARM
currently is implemented in vgic.c
> 
> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
> ---
>  include/kvm/arm_vgic.h |  2 +
>  virt/kvm/arm/vgic.c    | 99 ++++++++++++++++++++++++++++++++++----------------
>  2 files changed, 70 insertions(+), 31 deletions(-)
> 
> diff --git a/include/kvm/arm_vgic.h b/include/kvm/arm_vgic.h
> index 7306b4b..f6bfd79 100644
> --- a/include/kvm/arm_vgic.h
> +++ b/include/kvm/arm_vgic.h
> @@ -351,6 +351,8 @@ void kvm_vgic_flush_hwstate(struct kvm_vcpu *vcpu);
>  void kvm_vgic_sync_hwstate(struct kvm_vcpu *vcpu);
>  int kvm_vgic_inject_irq(struct kvm *kvm, int cpuid, unsigned int irq_num,
>  			bool level);
> +int kvm_vgic_inject_mapped_irq(struct kvm *kvm, int cpuid,
> +			       struct irq_phys_map *map, bool level);
>  void vgic_v3_dispatch_sgi(struct kvm_vcpu *vcpu, u64 reg);
>  int kvm_vgic_vcpu_pending_irq(struct kvm_vcpu *vcpu);
>  int kvm_vgic_vcpu_active_irq(struct kvm_vcpu *vcpu);
> diff --git a/virt/kvm/arm/vgic.c b/virt/kvm/arm/vgic.c
> index 3f7b690..e40ef70 100644
> --- a/virt/kvm/arm/vgic.c
> +++ b/virt/kvm/arm/vgic.c
> @@ -1533,7 +1533,8 @@ static int vgic_validate_injection(struct kvm_vcpu *vcpu, int irq, int level)
>  }
>  
>  static int vgic_update_irq_pending(struct kvm *kvm, int cpuid,
> -				  unsigned int irq_num, bool level)
> +				   struct irq_phys_map *map,
> +				   unsigned int irq_num, bool level)
>  {
In vgic_update_irq_pending, I needed to modify the following line and
add the "&& !map".

        if (!vgic_validate_injection(vcpu, irq_num, level) && !map) {

Without that, the level being not properly modeled for level sensitive
forwarded IRQs, the 2d injection fails.

Eric


>  	struct vgic_dist *dist = &kvm->arch.vgic;
>  	struct kvm_vcpu *vcpu;
> @@ -1541,6 +1542,9 @@ static int vgic_update_irq_pending(struct kvm *kvm, int cpuid,
>  	int enabled;
>  	bool ret = true, can_inject = true;
>  
> +	if (irq_num >= min(kvm->arch.vgic.nr_irqs, 1020))
> +		return -EINVAL;
> +
>  	spin_lock(&dist->lock);
>  
>  	vcpu = kvm_get_vcpu(kvm, cpuid);
> @@ -1603,14 +1607,42 @@ static int vgic_update_irq_pending(struct kvm *kvm, int cpuid,
>  out:
>  	spin_unlock(&dist->lock);
>  
> -	return ret ? cpuid : -EINVAL;
> +	if (!ret) {
> +		/* kick the specified vcpu */
> +		kvm_vcpu_kick(kvm_get_vcpu(kvm, cpuid));
> +	}
> +
> +	return 0;
> +}
> +
> +static int vgic_lazy_init(struct kvm *kvm)
> +{
> +	int ret = 0;
> +
> +	if (unlikely(!vgic_initialized(kvm))) {
> +		/*
> +		 * We only provide the automatic initialization of the VGIC
> +		 * for the legacy case of a GICv2. Any other type must
> +		 * be explicitly initialized once setup with the respective
> +		 * KVM device call.
> +		 */
> +		if (kvm->arch.vgic.vgic_model != KVM_DEV_TYPE_ARM_VGIC_V2)
> +			return -EBUSY;
> +
> +		mutex_lock(&kvm->lock);
> +		ret = vgic_init(kvm);
> +		mutex_unlock(&kvm->lock);
> +	}
> +
> +	return ret;
>  }
>  
>  /**
>   * kvm_vgic_inject_irq - Inject an IRQ from a device to the vgic
>   * @kvm:     The VM structure pointer
>   * @cpuid:   The CPU for PPIs
> - * @irq_num: The IRQ number that is assigned to the device
> + * @irq_num: The IRQ number that is assigned to the device. This IRQ
> + *           must not be mapped to a HW interrupt.
>   * @level:   Edge-triggered:  true:  to trigger the interrupt
>   *			      false: to ignore the call
>   *	     Level-sensitive  true:  activates an interrupt
> @@ -1623,39 +1655,44 @@ out:
>  int kvm_vgic_inject_irq(struct kvm *kvm, int cpuid, unsigned int irq_num,
>  			bool level)
>  {
> -	int ret = 0;
> -	int vcpu_id;
> -
> -	if (unlikely(!vgic_initialized(kvm))) {
> -		/*
> -		 * We only provide the automatic initialization of the VGIC
> -		 * for the legacy case of a GICv2. Any other type must
> -		 * be explicitly initialized once setup with the respective
> -		 * KVM device call.
> -		 */
> -		if (kvm->arch.vgic.vgic_model != KVM_DEV_TYPE_ARM_VGIC_V2) {
> -			ret = -EBUSY;
> -			goto out;
> -		}
> -		mutex_lock(&kvm->lock);
> -		ret = vgic_init(kvm);
> -		mutex_unlock(&kvm->lock);
> +	struct irq_phys_map *map;
> +	int ret;
>  
> -		if (ret)
> -			goto out;
> -	}
> +	ret = vgic_lazy_init(kvm);
> +	if (ret)
> +		return ret;
>  
> -	if (irq_num >= min(kvm->arch.vgic.nr_irqs, 1020))
> +	map = vgic_irq_map_search(kvm_get_vcpu(kvm, cpuid), irq_num);
> +	if (map)
>  		return -EINVAL;
>  
> -	vcpu_id = vgic_update_irq_pending(kvm, cpuid, irq_num, level);
> -	if (vcpu_id >= 0) {
> -		/* kick the specified vcpu */
> -		kvm_vcpu_kick(kvm_get_vcpu(kvm, vcpu_id));
> -	}
> +	return vgic_update_irq_pending(kvm, cpuid, NULL, irq_num, level);
> +}
>  
> -out:
> -	return ret;
> +/**
> + * kvm_vgic_inject_mapped_irq - Inject a physically mapped IRQ to the vgic
> + * @kvm:     The VM structure pointer
> + * @cpuid:   The CPU for PPIs
> + * @map:     Pointer to a irq_phys_map structure describing the mapping
> + * @level:   Edge-triggered:  true:  to trigger the interrupt
> + *			      false: to ignore the call
> + *	     Level-sensitive  true:  activates an interrupt
> + *			      false: deactivates an interrupt
> + *
> + * The GIC is not concerned with devices being active-LOW or active-HIGH for
> + * level-sensitive interrupts.  You can think of the level parameter as 1
> + * being HIGH and 0 being LOW and all devices being active-HIGH.
> + */
> +int kvm_vgic_inject_mapped_irq(struct kvm *kvm, int cpuid,
> +			       struct irq_phys_map *map, bool level)
> +{
> +	int ret;
> +
> +	ret = vgic_lazy_init(kvm);
> +	if (ret)
> +		return ret;
> +
> +	return vgic_update_irq_pending(kvm, cpuid, map, map->virt_irq, level);
>  }
>  
>  static irqreturn_t vgic_maintenance_handler(int irq, void *data)
> 


^ permalink raw reply	[flat|nested] 74+ messages in thread

* [PATCH v3 09/11] KVM: arm/arm64: vgic: Prevent userspace injection of a mapped interrupt
@ 2015-08-04 16:21     ` Eric Auger
  0 siblings, 0 replies; 74+ messages in thread
From: Eric Auger @ 2015-08-04 16:21 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Marc,
On 07/24/2015 05:55 PM, Marc Zyngier wrote:
> Virtual interrupts mapped to a HW interrupt should only be triggered
> from inside the kernel. Otherwise, you could end up confusing the
> kernel (and the GIC's) state machine.
> 
> Rearrange the injection path so that kvm_vgic_inject_irq is
> used for non-mapped interrupts, and kvm_vgic_inject_mapped_irq is
> used for mapped interrupts. The latter should only be called from
> inside the kernel (timer, VFIO).
nit: I would replace VFIO by irqfd.
VFIO just triggers the eventfd/irqfd. This is KVM/irqfd that injects the
virtual irq upon the irqfd signaling and he irqfd adaptation/ARM
currently is implemented in vgic.c
> 
> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
> ---
>  include/kvm/arm_vgic.h |  2 +
>  virt/kvm/arm/vgic.c    | 99 ++++++++++++++++++++++++++++++++++----------------
>  2 files changed, 70 insertions(+), 31 deletions(-)
> 
> diff --git a/include/kvm/arm_vgic.h b/include/kvm/arm_vgic.h
> index 7306b4b..f6bfd79 100644
> --- a/include/kvm/arm_vgic.h
> +++ b/include/kvm/arm_vgic.h
> @@ -351,6 +351,8 @@ void kvm_vgic_flush_hwstate(struct kvm_vcpu *vcpu);
>  void kvm_vgic_sync_hwstate(struct kvm_vcpu *vcpu);
>  int kvm_vgic_inject_irq(struct kvm *kvm, int cpuid, unsigned int irq_num,
>  			bool level);
> +int kvm_vgic_inject_mapped_irq(struct kvm *kvm, int cpuid,
> +			       struct irq_phys_map *map, bool level);
>  void vgic_v3_dispatch_sgi(struct kvm_vcpu *vcpu, u64 reg);
>  int kvm_vgic_vcpu_pending_irq(struct kvm_vcpu *vcpu);
>  int kvm_vgic_vcpu_active_irq(struct kvm_vcpu *vcpu);
> diff --git a/virt/kvm/arm/vgic.c b/virt/kvm/arm/vgic.c
> index 3f7b690..e40ef70 100644
> --- a/virt/kvm/arm/vgic.c
> +++ b/virt/kvm/arm/vgic.c
> @@ -1533,7 +1533,8 @@ static int vgic_validate_injection(struct kvm_vcpu *vcpu, int irq, int level)
>  }
>  
>  static int vgic_update_irq_pending(struct kvm *kvm, int cpuid,
> -				  unsigned int irq_num, bool level)
> +				   struct irq_phys_map *map,
> +				   unsigned int irq_num, bool level)
>  {
In vgic_update_irq_pending, I needed to modify the following line and
add the "&& !map".

        if (!vgic_validate_injection(vcpu, irq_num, level) && !map) {

Without that, the level being not properly modeled for level sensitive
forwarded IRQs, the 2d injection fails.

Eric


>  	struct vgic_dist *dist = &kvm->arch.vgic;
>  	struct kvm_vcpu *vcpu;
> @@ -1541,6 +1542,9 @@ static int vgic_update_irq_pending(struct kvm *kvm, int cpuid,
>  	int enabled;
>  	bool ret = true, can_inject = true;
>  
> +	if (irq_num >= min(kvm->arch.vgic.nr_irqs, 1020))
> +		return -EINVAL;
> +
>  	spin_lock(&dist->lock);
>  
>  	vcpu = kvm_get_vcpu(kvm, cpuid);
> @@ -1603,14 +1607,42 @@ static int vgic_update_irq_pending(struct kvm *kvm, int cpuid,
>  out:
>  	spin_unlock(&dist->lock);
>  
> -	return ret ? cpuid : -EINVAL;
> +	if (!ret) {
> +		/* kick the specified vcpu */
> +		kvm_vcpu_kick(kvm_get_vcpu(kvm, cpuid));
> +	}
> +
> +	return 0;
> +}
> +
> +static int vgic_lazy_init(struct kvm *kvm)
> +{
> +	int ret = 0;
> +
> +	if (unlikely(!vgic_initialized(kvm))) {
> +		/*
> +		 * We only provide the automatic initialization of the VGIC
> +		 * for the legacy case of a GICv2. Any other type must
> +		 * be explicitly initialized once setup with the respective
> +		 * KVM device call.
> +		 */
> +		if (kvm->arch.vgic.vgic_model != KVM_DEV_TYPE_ARM_VGIC_V2)
> +			return -EBUSY;
> +
> +		mutex_lock(&kvm->lock);
> +		ret = vgic_init(kvm);
> +		mutex_unlock(&kvm->lock);
> +	}
> +
> +	return ret;
>  }
>  
>  /**
>   * kvm_vgic_inject_irq - Inject an IRQ from a device to the vgic
>   * @kvm:     The VM structure pointer
>   * @cpuid:   The CPU for PPIs
> - * @irq_num: The IRQ number that is assigned to the device
> + * @irq_num: The IRQ number that is assigned to the device. This IRQ
> + *           must not be mapped to a HW interrupt.
>   * @level:   Edge-triggered:  true:  to trigger the interrupt
>   *			      false: to ignore the call
>   *	     Level-sensitive  true:  activates an interrupt
> @@ -1623,39 +1655,44 @@ out:
>  int kvm_vgic_inject_irq(struct kvm *kvm, int cpuid, unsigned int irq_num,
>  			bool level)
>  {
> -	int ret = 0;
> -	int vcpu_id;
> -
> -	if (unlikely(!vgic_initialized(kvm))) {
> -		/*
> -		 * We only provide the automatic initialization of the VGIC
> -		 * for the legacy case of a GICv2. Any other type must
> -		 * be explicitly initialized once setup with the respective
> -		 * KVM device call.
> -		 */
> -		if (kvm->arch.vgic.vgic_model != KVM_DEV_TYPE_ARM_VGIC_V2) {
> -			ret = -EBUSY;
> -			goto out;
> -		}
> -		mutex_lock(&kvm->lock);
> -		ret = vgic_init(kvm);
> -		mutex_unlock(&kvm->lock);
> +	struct irq_phys_map *map;
> +	int ret;
>  
> -		if (ret)
> -			goto out;
> -	}
> +	ret = vgic_lazy_init(kvm);
> +	if (ret)
> +		return ret;
>  
> -	if (irq_num >= min(kvm->arch.vgic.nr_irqs, 1020))
> +	map = vgic_irq_map_search(kvm_get_vcpu(kvm, cpuid), irq_num);
> +	if (map)
>  		return -EINVAL;
>  
> -	vcpu_id = vgic_update_irq_pending(kvm, cpuid, irq_num, level);
> -	if (vcpu_id >= 0) {
> -		/* kick the specified vcpu */
> -		kvm_vcpu_kick(kvm_get_vcpu(kvm, vcpu_id));
> -	}
> +	return vgic_update_irq_pending(kvm, cpuid, NULL, irq_num, level);
> +}
>  
> -out:
> -	return ret;
> +/**
> + * kvm_vgic_inject_mapped_irq - Inject a physically mapped IRQ to the vgic
> + * @kvm:     The VM structure pointer
> + * @cpuid:   The CPU for PPIs
> + * @map:     Pointer to a irq_phys_map structure describing the mapping
> + * @level:   Edge-triggered:  true:  to trigger the interrupt
> + *			      false: to ignore the call
> + *	     Level-sensitive  true:  activates an interrupt
> + *			      false: deactivates an interrupt
> + *
> + * The GIC is not concerned with devices being active-LOW or active-HIGH for
> + * level-sensitive interrupts.  You can think of the level parameter as 1
> + * being HIGH and 0 being LOW and all devices being active-HIGH.
> + */
> +int kvm_vgic_inject_mapped_irq(struct kvm *kvm, int cpuid,
> +			       struct irq_phys_map *map, bool level)
> +{
> +	int ret;
> +
> +	ret = vgic_lazy_init(kvm);
> +	if (ret)
> +		return ret;
> +
> +	return vgic_update_irq_pending(kvm, cpuid, map, map->virt_irq, level);
>  }
>  
>  static irqreturn_t vgic_maintenance_handler(int irq, void *data)
> 

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH v3 09/11] KVM: arm/arm64: vgic: Prevent userspace injection of a mapped interrupt
  2015-08-04 16:21     ` Eric Auger
@ 2015-08-04 16:44       ` Marc Zyngier
  -1 siblings, 0 replies; 74+ messages in thread
From: Marc Zyngier @ 2015-08-04 16:44 UTC (permalink / raw)
  To: Eric Auger, kvm, kvmarm, linux-arm-kernel
  Cc: Christoffer Dall, Alex Bennée, Andre Przywara

On 04/08/15 17:21, Eric Auger wrote:
> Hi Marc,
> On 07/24/2015 05:55 PM, Marc Zyngier wrote:
>> Virtual interrupts mapped to a HW interrupt should only be triggered
>> from inside the kernel. Otherwise, you could end up confusing the
>> kernel (and the GIC's) state machine.
>>
>> Rearrange the injection path so that kvm_vgic_inject_irq is
>> used for non-mapped interrupts, and kvm_vgic_inject_mapped_irq is
>> used for mapped interrupts. The latter should only be called from
>> inside the kernel (timer, VFIO).
> nit: I would replace VFIO by irqfd.
> VFIO just triggers the eventfd/irqfd. This is KVM/irqfd that injects the
> virtual irq upon the irqfd signaling and he irqfd adaptation/ARM
> currently is implemented in vgic.c

Ah, thanks for reminding me of the right terminology, I tend to think of
it as one big bag of nasty tricks... ;-)

I'll update the commit message.

>>
>> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
>> ---
>>  include/kvm/arm_vgic.h |  2 +
>>  virt/kvm/arm/vgic.c    | 99 ++++++++++++++++++++++++++++++++++----------------
>>  2 files changed, 70 insertions(+), 31 deletions(-)
>>
>> diff --git a/include/kvm/arm_vgic.h b/include/kvm/arm_vgic.h
>> index 7306b4b..f6bfd79 100644
>> --- a/include/kvm/arm_vgic.h
>> +++ b/include/kvm/arm_vgic.h
>> @@ -351,6 +351,8 @@ void kvm_vgic_flush_hwstate(struct kvm_vcpu *vcpu);
>>  void kvm_vgic_sync_hwstate(struct kvm_vcpu *vcpu);
>>  int kvm_vgic_inject_irq(struct kvm *kvm, int cpuid, unsigned int irq_num,
>>  			bool level);
>> +int kvm_vgic_inject_mapped_irq(struct kvm *kvm, int cpuid,
>> +			       struct irq_phys_map *map, bool level);
>>  void vgic_v3_dispatch_sgi(struct kvm_vcpu *vcpu, u64 reg);
>>  int kvm_vgic_vcpu_pending_irq(struct kvm_vcpu *vcpu);
>>  int kvm_vgic_vcpu_active_irq(struct kvm_vcpu *vcpu);
>> diff --git a/virt/kvm/arm/vgic.c b/virt/kvm/arm/vgic.c
>> index 3f7b690..e40ef70 100644
>> --- a/virt/kvm/arm/vgic.c
>> +++ b/virt/kvm/arm/vgic.c
>> @@ -1533,7 +1533,8 @@ static int vgic_validate_injection(struct kvm_vcpu *vcpu, int irq, int level)
>>  }
>>  
>>  static int vgic_update_irq_pending(struct kvm *kvm, int cpuid,
>> -				  unsigned int irq_num, bool level)
>> +				   struct irq_phys_map *map,
>> +				   unsigned int irq_num, bool level)
>>  {
> In vgic_update_irq_pending, I needed to modify the following line and
> add the "&& !map".
> 
>         if (!vgic_validate_injection(vcpu, irq_num, level) && !map) {
> 
> Without that, the level being not properly modeled for level sensitive
> forwarded IRQs, the 2d injection fails.

Ah! Is that because we never see the line being reset to zero, and the
VGIC still sees the line as pending at the distributor level?

Thanks,

	M.
-- 
Jazz is not dead. It just smells funny...

^ permalink raw reply	[flat|nested] 74+ messages in thread

* [PATCH v3 09/11] KVM: arm/arm64: vgic: Prevent userspace injection of a mapped interrupt
@ 2015-08-04 16:44       ` Marc Zyngier
  0 siblings, 0 replies; 74+ messages in thread
From: Marc Zyngier @ 2015-08-04 16:44 UTC (permalink / raw)
  To: linux-arm-kernel

On 04/08/15 17:21, Eric Auger wrote:
> Hi Marc,
> On 07/24/2015 05:55 PM, Marc Zyngier wrote:
>> Virtual interrupts mapped to a HW interrupt should only be triggered
>> from inside the kernel. Otherwise, you could end up confusing the
>> kernel (and the GIC's) state machine.
>>
>> Rearrange the injection path so that kvm_vgic_inject_irq is
>> used for non-mapped interrupts, and kvm_vgic_inject_mapped_irq is
>> used for mapped interrupts. The latter should only be called from
>> inside the kernel (timer, VFIO).
> nit: I would replace VFIO by irqfd.
> VFIO just triggers the eventfd/irqfd. This is KVM/irqfd that injects the
> virtual irq upon the irqfd signaling and he irqfd adaptation/ARM
> currently is implemented in vgic.c

Ah, thanks for reminding me of the right terminology, I tend to think of
it as one big bag of nasty tricks... ;-)

I'll update the commit message.

>>
>> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
>> ---
>>  include/kvm/arm_vgic.h |  2 +
>>  virt/kvm/arm/vgic.c    | 99 ++++++++++++++++++++++++++++++++++----------------
>>  2 files changed, 70 insertions(+), 31 deletions(-)
>>
>> diff --git a/include/kvm/arm_vgic.h b/include/kvm/arm_vgic.h
>> index 7306b4b..f6bfd79 100644
>> --- a/include/kvm/arm_vgic.h
>> +++ b/include/kvm/arm_vgic.h
>> @@ -351,6 +351,8 @@ void kvm_vgic_flush_hwstate(struct kvm_vcpu *vcpu);
>>  void kvm_vgic_sync_hwstate(struct kvm_vcpu *vcpu);
>>  int kvm_vgic_inject_irq(struct kvm *kvm, int cpuid, unsigned int irq_num,
>>  			bool level);
>> +int kvm_vgic_inject_mapped_irq(struct kvm *kvm, int cpuid,
>> +			       struct irq_phys_map *map, bool level);
>>  void vgic_v3_dispatch_sgi(struct kvm_vcpu *vcpu, u64 reg);
>>  int kvm_vgic_vcpu_pending_irq(struct kvm_vcpu *vcpu);
>>  int kvm_vgic_vcpu_active_irq(struct kvm_vcpu *vcpu);
>> diff --git a/virt/kvm/arm/vgic.c b/virt/kvm/arm/vgic.c
>> index 3f7b690..e40ef70 100644
>> --- a/virt/kvm/arm/vgic.c
>> +++ b/virt/kvm/arm/vgic.c
>> @@ -1533,7 +1533,8 @@ static int vgic_validate_injection(struct kvm_vcpu *vcpu, int irq, int level)
>>  }
>>  
>>  static int vgic_update_irq_pending(struct kvm *kvm, int cpuid,
>> -				  unsigned int irq_num, bool level)
>> +				   struct irq_phys_map *map,
>> +				   unsigned int irq_num, bool level)
>>  {
> In vgic_update_irq_pending, I needed to modify the following line and
> add the "&& !map".
> 
>         if (!vgic_validate_injection(vcpu, irq_num, level) && !map) {
> 
> Without that, the level being not properly modeled for level sensitive
> forwarded IRQs, the 2d injection fails.

Ah! Is that because we never see the line being reset to zero, and the
VGIC still sees the line as pending at the distributor level?

Thanks,

	M.
-- 
Jazz is not dead. It just smells funny...

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH v3 11/11] KVM: arm/arm64: vgic: Allow HW interrupts for non-shared devices
  2015-08-04 14:32     ` Christoffer Dall
@ 2015-08-04 17:08       ` Marc Zyngier
  -1 siblings, 0 replies; 74+ messages in thread
From: Marc Zyngier @ 2015-08-04 17:08 UTC (permalink / raw)
  To: Christoffer Dall
  Cc: kvm, kvmarm, linux-arm-kernel, Eric Auger, Alex Bennée,
	Andre Przywara

On 04/08/15 15:32, Christoffer Dall wrote:
> On Fri, Jul 24, 2015 at 04:55:09PM +0100, Marc Zyngier wrote:
>> So far, the only use of the HW interrupt facility is the timer,
>> implying that the active state is context-switched for each vcpu,
>> as the device is is shared across all vcpus.
>>
>> This does not work for a device that has been assigned to a VM,
>> as the guest is entierely in control of that device (the HW is
>> not shared). In that case, it makes sense to bypass the whole
>> active state switching, and only track the deactivation of the
>> interrupt.
>>
>> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
>> ---
>>  include/kvm/arm_vgic.h    |  6 ++--
>>  virt/kvm/arm/arch_timer.c |  3 +-
>>  virt/kvm/arm/vgic.c       | 73 +++++++++++++++++++++++++++++++++++++----------
>>  3 files changed, 64 insertions(+), 18 deletions(-)
>>
>> diff --git a/include/kvm/arm_vgic.h b/include/kvm/arm_vgic.h
>> index f6bfd79..6f0a4e1 100644
>> --- a/include/kvm/arm_vgic.h
>> +++ b/include/kvm/arm_vgic.h
>> @@ -164,7 +164,8 @@ struct irq_phys_map {
>>  	u32			phys_irq;
>>  	u32			irq;
>>  	bool			deleted;
>> -	bool			active;
>> +	bool			shared;
>> +	bool			active; /* Only valid if shared */
>>  };
>>  
>>  struct irq_phys_map_entry {
>> @@ -357,7 +358,8 @@ void vgic_v3_dispatch_sgi(struct kvm_vcpu *vcpu, u64 reg);
>>  int kvm_vgic_vcpu_pending_irq(struct kvm_vcpu *vcpu);
>>  int kvm_vgic_vcpu_active_irq(struct kvm_vcpu *vcpu);
>>  struct irq_phys_map *kvm_vgic_map_phys_irq(struct kvm_vcpu *vcpu,
>> -					   int virt_irq, int irq);
>> +					   int virt_irq, int irq,
>> +					   bool shared);
>>  int kvm_vgic_unmap_phys_irq(struct kvm_vcpu *vcpu, struct irq_phys_map *map);
>>  bool kvm_vgic_get_phys_irq_active(struct irq_phys_map *map);
>>  void kvm_vgic_set_phys_irq_active(struct irq_phys_map *map, bool active);
>> diff --git a/virt/kvm/arm/arch_timer.c b/virt/kvm/arm/arch_timer.c
>> index 76e38d2..db21d8f 100644
>> --- a/virt/kvm/arm/arch_timer.c
>> +++ b/virt/kvm/arm/arch_timer.c
>> @@ -203,7 +203,8 @@ int kvm_timer_vcpu_reset(struct kvm_vcpu *vcpu,
>>  	 * Tell the VGIC that the virtual interrupt is tied to a
>>  	 * physical interrupt. We do that once per VCPU.
>>  	 */
>> -	map = kvm_vgic_map_phys_irq(vcpu, irq->irq, host_vtimer_irq);
>> +	map = kvm_vgic_map_phys_irq(vcpu, irq->irq,
>> +				    host_vtimer_irq, true);
>>  	if (WARN_ON(IS_ERR(map)))
>>  		return PTR_ERR(map);
>>  
>> diff --git a/virt/kvm/arm/vgic.c b/virt/kvm/arm/vgic.c
>> index e40ef70..5e6b816 100644
>> --- a/virt/kvm/arm/vgic.c
>> +++ b/virt/kvm/arm/vgic.c
>> @@ -1128,19 +1128,25 @@ static void vgic_queue_irq_to_lr(struct kvm_vcpu *vcpu, int irq,
>>  		 * active in the physical world. Otherwise the
>>  		 * physical interrupt will fire and the guest will
>>  		 * exit before processing the virtual interrupt.
>> +		 *
>> +		 * This is of course only valid for a shared
>> +		 * interrupt. A non shared interrupt should already be
>> +		 * active.
>>  		 */
>>  		if (map) {
>> -			int ret;
>> -
>> -			BUG_ON(!map->active);
>>  			vlr.hwirq = map->phys_irq;
>>  			vlr.state |= LR_HW;
>>  			vlr.state &= ~LR_EOI_INT;
>>  
>> -			ret = irq_set_irqchip_state(map->irq,
>> -						    IRQCHIP_STATE_ACTIVE,
>> -						    true);
>> -			WARN_ON(ret);
>> +			if (map->shared) {
>> +				int ret;
>> +
>> +				BUG_ON(!map->active);
>> +				ret = irq_set_irqchip_state(map->irq,
>> +							    IRQCHIP_STATE_ACTIVE,
>> +							    true);
>> +				WARN_ON(ret);
>> +			}
>>  
>>  			/*
>>  			 * Make sure we're not going to sample this
>> @@ -1383,21 +1389,41 @@ static bool vgic_process_maintenance(struct kvm_vcpu *vcpu)
>>  static int vgic_sync_hwirq(struct kvm_vcpu *vcpu, struct vgic_lr vlr)
>>  {
>>  	struct irq_phys_map *map;
>> +	bool active;
>>  	int ret;
>>  
>>  	if (!(vlr.state & LR_HW))
>>  		return 0;
>>  
>>  	map = vgic_irq_map_search(vcpu, vlr.irq);
>> -	BUG_ON(!map || !map->active);
>> +	BUG_ON(!map);
>> +	BUG_ON(map->shared && !map->active);
>>  
>>  	ret = irq_get_irqchip_state(map->irq,
>>  				    IRQCHIP_STATE_ACTIVE,
>> -				    &map->active);
>> +				    &active);
>>  
>>  	WARN_ON(ret);
>>  
>> -	if (map->active) {
>> +	/*
>> +	 * For a non-shared interrupt, we have to cater for two
>> +	 * possible deactivation conditions:
>> +	 *
>> +	 * - the physical interrupt is now inactive (EOIed from the
>> +         *   guest)
> 
> nit: whitespace funkyness
> 
>> +	 * - the physical interrupt is still active, but its virtual
>> +	 *   counterpart is flagged as "not queued", indicating another
>> +	 *   interrupt has fired between the EOI and the guest exit.
>> +	 *
>> +	 * Also, we are not reactivating a non-shared interrupt
> 
> what does reactivating mean? did you mean deactivate?

Deactivate indeed.

>> +	 * ourselves. This is always left to the guest.
> 
> In which case, add ", because the device is solely owned by the guest."

Sure.

>> +	 */
>> +	if (!map->shared)
>> +		return !active || !vgic_irq_is_queued(vcpu, vlr.irq);
> 
> do you really need the second part of the disjunction?
> 
> The effect seems to be that we clear the queued flag once again, and
> clear the LR.  If we don't do this, won't we simply pick up the pending
> flag next time we're about to run this VCPU and piggy-back on the
> existing LR.  Perhaps this is a weird flow though?

Crucially, we free the LR in this case (the set_bit on elrsr_ptr). If we
don't do this, we're indeed going to schedule the vcpu (it has something
to process), but we never allow piggy-backing on level interrupts. We'd
need some special hack to handle this.

I definitely feel more comfortable reporting that the interrupt has been
deactivated (which is the case), and let the normal flow pick up the
next injected interrupt.

> 
>> +
>> +	map->active = active;
>> +
>> +	if (active) {
>>  		ret = irq_set_irqchip_state(map->irq,
>>  					    IRQCHIP_STATE_ACTIVE,
>>  					    false);
>> @@ -1590,6 +1616,17 @@ static int vgic_update_irq_pending(struct kvm *kvm, int cpuid,
>>  		goto out;
>>  	}
>>  
>> +	if (map && !map->shared) {
>> +		/*
>> +		 * We are told to inject a HW irq, so we have to trust
>> +		 * the caller that the previous one has been EOIed,
>> +		 * and that a new one is now active. Clearing the
>> +		 * queued state will have the effect of making it
>> +		 * sample-able again.
>> +		 */
>> +		vgic_irq_clear_queued(vcpu, irq_num);
>> +	}
>> +
>>  	if (!vgic_can_sample_irq(vcpu, irq_num)) {
>>  		/*
>>  		 * Level interrupt in progress, will be picked up
>> @@ -1720,15 +1757,19 @@ static struct list_head *vgic_get_irq_phys_map_list(struct kvm_vcpu *vcpu,
>>   * @vcpu: The VCPU pointer
>>   * @virt_irq: The virtual irq number
>>   * @irq: The Linux IRQ number
>> + * @shared: Indicates if the interrupt has to be context-switched or
>> + *          if it is private to a VM
>>   *
>>   * Establish a mapping between a guest visible irq (@virt_irq) and a
>>   * Linux irq (@irq). On injection, @virt_irq will be associated with
>> - * the physical interrupt represented by @irq.
>> + * the physical interrupt represented by @irq. If @shared is true,
>> + * the active state of the interrupt will be context-switched.
>>   *
>>   * Returns a valid pointer on success, and an error pointer otherwise
>>   */
>>  struct irq_phys_map *kvm_vgic_map_phys_irq(struct kvm_vcpu *vcpu,
>> -					   int virt_irq, int irq)
>> +					   int virt_irq, int irq,
>> +					   bool shared)
>>  {
>>  	struct vgic_dist *dist = &vcpu->kvm->arch.vgic;
>>  	struct list_head *root = vgic_get_irq_phys_map_list(vcpu, virt_irq);
>> @@ -1762,7 +1803,8 @@ struct irq_phys_map *kvm_vgic_map_phys_irq(struct kvm_vcpu *vcpu,
>>  	if (map) {
>>  		/* Make sure this mapping matches */
>>  		if (map->phys_irq != phys_irq	||
>> -		    map->irq      != irq)
>> +		    map->irq      != irq	||
>> +		    map->shared   != shared)
>>  			map = ERR_PTR(-EINVAL);
>>  
>>  		goto out;
>> @@ -1772,6 +1814,7 @@ struct irq_phys_map *kvm_vgic_map_phys_irq(struct kvm_vcpu *vcpu,
>>  	map->virt_irq = virt_irq;
>>  	map->phys_irq = phys_irq;
>>  	map->irq      = irq;
>> +	map->shared   = shared;
>>  
>>  	list_add_tail_rcu(&entry->entry, root);
>>  
>> @@ -1822,7 +1865,7 @@ static void vgic_free_phys_irq_map_rcu(struct rcu_head *rcu)
>>   */
>>  bool kvm_vgic_get_phys_irq_active(struct irq_phys_map *map)
>>  {
>> -	BUG_ON(!map);
>> +	BUG_ON(!map || !map->shared);
>>  	return map->active;
>>  }
>>  
>> @@ -1834,7 +1877,7 @@ bool kvm_vgic_get_phys_irq_active(struct irq_phys_map *map)
>>   */
>>  void kvm_vgic_set_phys_irq_active(struct irq_phys_map *map, bool active)
>>  {
>> -	BUG_ON(!map);
>> +	BUG_ON(!map || !map->shared);
>>  	map->active = active;
>>  }
>>  
>> -- 
>> 2.1.4
>>
> 
> I think I'm seeing this in the correct light this time around, so this
> is functionally looking pretty good to me.
> 
> Thanks,
> -Christoffer
> 

Thanks,

	M.
-- 
Jazz is not dead. It just smells funny...

^ permalink raw reply	[flat|nested] 74+ messages in thread

* [PATCH v3 11/11] KVM: arm/arm64: vgic: Allow HW interrupts for non-shared devices
@ 2015-08-04 17:08       ` Marc Zyngier
  0 siblings, 0 replies; 74+ messages in thread
From: Marc Zyngier @ 2015-08-04 17:08 UTC (permalink / raw)
  To: linux-arm-kernel

On 04/08/15 15:32, Christoffer Dall wrote:
> On Fri, Jul 24, 2015 at 04:55:09PM +0100, Marc Zyngier wrote:
>> So far, the only use of the HW interrupt facility is the timer,
>> implying that the active state is context-switched for each vcpu,
>> as the device is is shared across all vcpus.
>>
>> This does not work for a device that has been assigned to a VM,
>> as the guest is entierely in control of that device (the HW is
>> not shared). In that case, it makes sense to bypass the whole
>> active state switching, and only track the deactivation of the
>> interrupt.
>>
>> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
>> ---
>>  include/kvm/arm_vgic.h    |  6 ++--
>>  virt/kvm/arm/arch_timer.c |  3 +-
>>  virt/kvm/arm/vgic.c       | 73 +++++++++++++++++++++++++++++++++++++----------
>>  3 files changed, 64 insertions(+), 18 deletions(-)
>>
>> diff --git a/include/kvm/arm_vgic.h b/include/kvm/arm_vgic.h
>> index f6bfd79..6f0a4e1 100644
>> --- a/include/kvm/arm_vgic.h
>> +++ b/include/kvm/arm_vgic.h
>> @@ -164,7 +164,8 @@ struct irq_phys_map {
>>  	u32			phys_irq;
>>  	u32			irq;
>>  	bool			deleted;
>> -	bool			active;
>> +	bool			shared;
>> +	bool			active; /* Only valid if shared */
>>  };
>>  
>>  struct irq_phys_map_entry {
>> @@ -357,7 +358,8 @@ void vgic_v3_dispatch_sgi(struct kvm_vcpu *vcpu, u64 reg);
>>  int kvm_vgic_vcpu_pending_irq(struct kvm_vcpu *vcpu);
>>  int kvm_vgic_vcpu_active_irq(struct kvm_vcpu *vcpu);
>>  struct irq_phys_map *kvm_vgic_map_phys_irq(struct kvm_vcpu *vcpu,
>> -					   int virt_irq, int irq);
>> +					   int virt_irq, int irq,
>> +					   bool shared);
>>  int kvm_vgic_unmap_phys_irq(struct kvm_vcpu *vcpu, struct irq_phys_map *map);
>>  bool kvm_vgic_get_phys_irq_active(struct irq_phys_map *map);
>>  void kvm_vgic_set_phys_irq_active(struct irq_phys_map *map, bool active);
>> diff --git a/virt/kvm/arm/arch_timer.c b/virt/kvm/arm/arch_timer.c
>> index 76e38d2..db21d8f 100644
>> --- a/virt/kvm/arm/arch_timer.c
>> +++ b/virt/kvm/arm/arch_timer.c
>> @@ -203,7 +203,8 @@ int kvm_timer_vcpu_reset(struct kvm_vcpu *vcpu,
>>  	 * Tell the VGIC that the virtual interrupt is tied to a
>>  	 * physical interrupt. We do that once per VCPU.
>>  	 */
>> -	map = kvm_vgic_map_phys_irq(vcpu, irq->irq, host_vtimer_irq);
>> +	map = kvm_vgic_map_phys_irq(vcpu, irq->irq,
>> +				    host_vtimer_irq, true);
>>  	if (WARN_ON(IS_ERR(map)))
>>  		return PTR_ERR(map);
>>  
>> diff --git a/virt/kvm/arm/vgic.c b/virt/kvm/arm/vgic.c
>> index e40ef70..5e6b816 100644
>> --- a/virt/kvm/arm/vgic.c
>> +++ b/virt/kvm/arm/vgic.c
>> @@ -1128,19 +1128,25 @@ static void vgic_queue_irq_to_lr(struct kvm_vcpu *vcpu, int irq,
>>  		 * active in the physical world. Otherwise the
>>  		 * physical interrupt will fire and the guest will
>>  		 * exit before processing the virtual interrupt.
>> +		 *
>> +		 * This is of course only valid for a shared
>> +		 * interrupt. A non shared interrupt should already be
>> +		 * active.
>>  		 */
>>  		if (map) {
>> -			int ret;
>> -
>> -			BUG_ON(!map->active);
>>  			vlr.hwirq = map->phys_irq;
>>  			vlr.state |= LR_HW;
>>  			vlr.state &= ~LR_EOI_INT;
>>  
>> -			ret = irq_set_irqchip_state(map->irq,
>> -						    IRQCHIP_STATE_ACTIVE,
>> -						    true);
>> -			WARN_ON(ret);
>> +			if (map->shared) {
>> +				int ret;
>> +
>> +				BUG_ON(!map->active);
>> +				ret = irq_set_irqchip_state(map->irq,
>> +							    IRQCHIP_STATE_ACTIVE,
>> +							    true);
>> +				WARN_ON(ret);
>> +			}
>>  
>>  			/*
>>  			 * Make sure we're not going to sample this
>> @@ -1383,21 +1389,41 @@ static bool vgic_process_maintenance(struct kvm_vcpu *vcpu)
>>  static int vgic_sync_hwirq(struct kvm_vcpu *vcpu, struct vgic_lr vlr)
>>  {
>>  	struct irq_phys_map *map;
>> +	bool active;
>>  	int ret;
>>  
>>  	if (!(vlr.state & LR_HW))
>>  		return 0;
>>  
>>  	map = vgic_irq_map_search(vcpu, vlr.irq);
>> -	BUG_ON(!map || !map->active);
>> +	BUG_ON(!map);
>> +	BUG_ON(map->shared && !map->active);
>>  
>>  	ret = irq_get_irqchip_state(map->irq,
>>  				    IRQCHIP_STATE_ACTIVE,
>> -				    &map->active);
>> +				    &active);
>>  
>>  	WARN_ON(ret);
>>  
>> -	if (map->active) {
>> +	/*
>> +	 * For a non-shared interrupt, we have to cater for two
>> +	 * possible deactivation conditions:
>> +	 *
>> +	 * - the physical interrupt is now inactive (EOIed from the
>> +         *   guest)
> 
> nit: whitespace funkyness
> 
>> +	 * - the physical interrupt is still active, but its virtual
>> +	 *   counterpart is flagged as "not queued", indicating another
>> +	 *   interrupt has fired between the EOI and the guest exit.
>> +	 *
>> +	 * Also, we are not reactivating a non-shared interrupt
> 
> what does reactivating mean? did you mean deactivate?

Deactivate indeed.

>> +	 * ourselves. This is always left to the guest.
> 
> In which case, add ", because the device is solely owned by the guest."

Sure.

>> +	 */
>> +	if (!map->shared)
>> +		return !active || !vgic_irq_is_queued(vcpu, vlr.irq);
> 
> do you really need the second part of the disjunction?
> 
> The effect seems to be that we clear the queued flag once again, and
> clear the LR.  If we don't do this, won't we simply pick up the pending
> flag next time we're about to run this VCPU and piggy-back on the
> existing LR.  Perhaps this is a weird flow though?

Crucially, we free the LR in this case (the set_bit on elrsr_ptr). If we
don't do this, we're indeed going to schedule the vcpu (it has something
to process), but we never allow piggy-backing on level interrupts. We'd
need some special hack to handle this.

I definitely feel more comfortable reporting that the interrupt has been
deactivated (which is the case), and let the normal flow pick up the
next injected interrupt.

> 
>> +
>> +	map->active = active;
>> +
>> +	if (active) {
>>  		ret = irq_set_irqchip_state(map->irq,
>>  					    IRQCHIP_STATE_ACTIVE,
>>  					    false);
>> @@ -1590,6 +1616,17 @@ static int vgic_update_irq_pending(struct kvm *kvm, int cpuid,
>>  		goto out;
>>  	}
>>  
>> +	if (map && !map->shared) {
>> +		/*
>> +		 * We are told to inject a HW irq, so we have to trust
>> +		 * the caller that the previous one has been EOIed,
>> +		 * and that a new one is now active. Clearing the
>> +		 * queued state will have the effect of making it
>> +		 * sample-able again.
>> +		 */
>> +		vgic_irq_clear_queued(vcpu, irq_num);
>> +	}
>> +
>>  	if (!vgic_can_sample_irq(vcpu, irq_num)) {
>>  		/*
>>  		 * Level interrupt in progress, will be picked up
>> @@ -1720,15 +1757,19 @@ static struct list_head *vgic_get_irq_phys_map_list(struct kvm_vcpu *vcpu,
>>   * @vcpu: The VCPU pointer
>>   * @virt_irq: The virtual irq number
>>   * @irq: The Linux IRQ number
>> + * @shared: Indicates if the interrupt has to be context-switched or
>> + *          if it is private to a VM
>>   *
>>   * Establish a mapping between a guest visible irq (@virt_irq) and a
>>   * Linux irq (@irq). On injection, @virt_irq will be associated with
>> - * the physical interrupt represented by @irq.
>> + * the physical interrupt represented by @irq. If @shared is true,
>> + * the active state of the interrupt will be context-switched.
>>   *
>>   * Returns a valid pointer on success, and an error pointer otherwise
>>   */
>>  struct irq_phys_map *kvm_vgic_map_phys_irq(struct kvm_vcpu *vcpu,
>> -					   int virt_irq, int irq)
>> +					   int virt_irq, int irq,
>> +					   bool shared)
>>  {
>>  	struct vgic_dist *dist = &vcpu->kvm->arch.vgic;
>>  	struct list_head *root = vgic_get_irq_phys_map_list(vcpu, virt_irq);
>> @@ -1762,7 +1803,8 @@ struct irq_phys_map *kvm_vgic_map_phys_irq(struct kvm_vcpu *vcpu,
>>  	if (map) {
>>  		/* Make sure this mapping matches */
>>  		if (map->phys_irq != phys_irq	||
>> -		    map->irq      != irq)
>> +		    map->irq      != irq	||
>> +		    map->shared   != shared)
>>  			map = ERR_PTR(-EINVAL);
>>  
>>  		goto out;
>> @@ -1772,6 +1814,7 @@ struct irq_phys_map *kvm_vgic_map_phys_irq(struct kvm_vcpu *vcpu,
>>  	map->virt_irq = virt_irq;
>>  	map->phys_irq = phys_irq;
>>  	map->irq      = irq;
>> +	map->shared   = shared;
>>  
>>  	list_add_tail_rcu(&entry->entry, root);
>>  
>> @@ -1822,7 +1865,7 @@ static void vgic_free_phys_irq_map_rcu(struct rcu_head *rcu)
>>   */
>>  bool kvm_vgic_get_phys_irq_active(struct irq_phys_map *map)
>>  {
>> -	BUG_ON(!map);
>> +	BUG_ON(!map || !map->shared);
>>  	return map->active;
>>  }
>>  
>> @@ -1834,7 +1877,7 @@ bool kvm_vgic_get_phys_irq_active(struct irq_phys_map *map)
>>   */
>>  void kvm_vgic_set_phys_irq_active(struct irq_phys_map *map, bool active)
>>  {
>> -	BUG_ON(!map);
>> +	BUG_ON(!map || !map->shared);
>>  	map->active = active;
>>  }
>>  
>> -- 
>> 2.1.4
>>
> 
> I think I'm seeing this in the correct light this time around, so this
> is functionally looking pretty good to me.
> 
> Thanks,
> -Christoffer
> 

Thanks,

	M.
-- 
Jazz is not dead. It just smells funny...

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH v3 06/11] KVM: arm/arm64: vgic: Allow dynamic mapping of physical/virtual interrupts
  2015-08-04 15:27       ` Marc Zyngier
@ 2015-08-04 17:36         ` Christoffer Dall
  -1 siblings, 0 replies; 74+ messages in thread
From: Christoffer Dall @ 2015-08-04 17:36 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: kvm, kvmarm, linux-arm-kernel, Eric Auger, Alex Bennée,
	Andre Przywara

On Tue, Aug 04, 2015 at 04:27:03PM +0100, Marc Zyngier wrote:
> On 04/08/15 14:04, Christoffer Dall wrote:
> > On Fri, Jul 24, 2015 at 04:55:04PM +0100, Marc Zyngier wrote:
> >> In order to be able to feed physical interrupts to a guest, we need
> >> to be able to establish the virtual-physical mapping between the two
> >> worlds.
> >>
> >> The mappings are kept in a set of RCU lists, indexed by virtual interrupts.
> >>
> >> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
> >> ---
> >>  arch/arm/kvm/arm.c     |   2 +
> >>  include/kvm/arm_vgic.h |  26 +++++++
> >>  virt/kvm/arm/vgic.c    | 189 ++++++++++++++++++++++++++++++++++++++++++++++++-
> >>  3 files changed, 216 insertions(+), 1 deletion(-)
> >>
> >> diff --git a/arch/arm/kvm/arm.c b/arch/arm/kvm/arm.c
> >> index f1bf418..ce404a5 100644
> >> --- a/arch/arm/kvm/arm.c
> >> +++ b/arch/arm/kvm/arm.c
> >> @@ -125,6 +125,7 @@ int kvm_arch_init_vm(struct kvm *kvm, unsigned long type)
> >>       if (ret)
> >>               goto out_free_stage2_pgd;
> >>
> >> +     kvm_vgic_early_init(kvm);
> >>       kvm_timer_init(kvm);
> >>
> >>       /* Mark the initial VMID generation invalid */
> >> @@ -249,6 +250,7 @@ out:
> >>
> >>  void kvm_arch_vcpu_postcreate(struct kvm_vcpu *vcpu)
> >>  {
> >> +     kvm_vgic_vcpu_early_init(vcpu);
> >>  }
> >>
> >>  void kvm_arch_vcpu_free(struct kvm_vcpu *vcpu)
> >> diff --git a/include/kvm/arm_vgic.h b/include/kvm/arm_vgic.h
> >> index a881e39..68212a1 100644
> >> --- a/include/kvm/arm_vgic.h
> >> +++ b/include/kvm/arm_vgic.h
> >> @@ -159,6 +159,20 @@ struct vgic_io_device {
> >>       struct kvm_io_device dev;
> >>  };
> >>
> >> +struct irq_phys_map {
> >> +     u32                     virt_irq;
> >> +     u32                     phys_irq;
> >> +     u32                     irq;
> >> +     bool                    deleted;
> >> +     bool                    active;
> >> +};
> >> +
> >> +struct irq_phys_map_entry {
> >> +     struct list_head        entry;
> >> +     struct rcu_head         rcu;
> >> +     struct irq_phys_map     map;
> >> +};
> >> +
> >>  struct vgic_dist {
> >>       spinlock_t              lock;
> >>       bool                    in_kernel;
> >> @@ -256,6 +270,10 @@ struct vgic_dist {
> >>       struct vgic_vm_ops      vm_ops;
> >>       struct vgic_io_device   dist_iodev;
> >>       struct vgic_io_device   *redist_iodevs;
> >> +
> >> +     /* Virtual irq to hwirq mapping */
> >> +     spinlock_t              irq_phys_map_lock;
> >> +     struct list_head        irq_phys_map_list;
> >>  };
> >>
> >>  struct vgic_v2_cpu_if {
> >> @@ -307,6 +325,9 @@ struct vgic_cpu {
> >>               struct vgic_v2_cpu_if   vgic_v2;
> >>               struct vgic_v3_cpu_if   vgic_v3;
> >>       };
> >> +
> >> +     /* Protected by the distributor's irq_phys_map_lock */
> >> +     struct list_head        irq_phys_map_list;
> >>  };
> >>
> >>  #define LR_EMPTY     0xff
> >> @@ -321,8 +342,10 @@ int kvm_vgic_addr(struct kvm *kvm, unsigned long type, u64 *addr, bool write);
> >>  int kvm_vgic_hyp_init(void);
> >>  int kvm_vgic_map_resources(struct kvm *kvm);
> >>  int kvm_vgic_get_max_vcpus(void);
> >> +void kvm_vgic_early_init(struct kvm *kvm);
> >>  int kvm_vgic_create(struct kvm *kvm, u32 type);
> >>  void kvm_vgic_destroy(struct kvm *kvm);
> >> +void kvm_vgic_vcpu_early_init(struct kvm_vcpu *vcpu);
> >>  void kvm_vgic_vcpu_destroy(struct kvm_vcpu *vcpu);
> >>  void kvm_vgic_flush_hwstate(struct kvm_vcpu *vcpu);
> >>  void kvm_vgic_sync_hwstate(struct kvm_vcpu *vcpu);
> >> @@ -331,6 +354,9 @@ int kvm_vgic_inject_irq(struct kvm *kvm, int cpuid, unsigned int irq_num,
> >>  void vgic_v3_dispatch_sgi(struct kvm_vcpu *vcpu, u64 reg);
> >>  int kvm_vgic_vcpu_pending_irq(struct kvm_vcpu *vcpu);
> >>  int kvm_vgic_vcpu_active_irq(struct kvm_vcpu *vcpu);
> >> +struct irq_phys_map *kvm_vgic_map_phys_irq(struct kvm_vcpu *vcpu,
> >> +                                        int virt_irq, int irq);
> >> +int kvm_vgic_unmap_phys_irq(struct kvm_vcpu *vcpu, struct irq_phys_map *map);
> >>
> >>  #define irqchip_in_kernel(k) (!!((k)->arch.vgic.in_kernel))
> >>  #define vgic_initialized(k)  (!!((k)->arch.vgic.nr_cpus))
> >> diff --git a/virt/kvm/arm/vgic.c b/virt/kvm/arm/vgic.c
> >> index 5bd1695..c74d929 100644
> >> --- a/virt/kvm/arm/vgic.c
> >> +++ b/virt/kvm/arm/vgic.c
> >> @@ -24,6 +24,7 @@
> >>  #include <linux/of.h>
> >>  #include <linux/of_address.h>
> >>  #include <linux/of_irq.h>
> >> +#include <linux/rculist.h>
> >>  #include <linux/uaccess.h>
> >>
> >>  #include <asm/kvm_emulate.h>
> >> @@ -82,6 +83,8 @@ static void vgic_retire_disabled_irqs(struct kvm_vcpu *vcpu);
> >>  static void vgic_retire_lr(int lr_nr, int irq, struct kvm_vcpu *vcpu);
> >>  static struct vgic_lr vgic_get_lr(const struct kvm_vcpu *vcpu, int lr);
> >>  static void vgic_set_lr(struct kvm_vcpu *vcpu, int lr, struct vgic_lr lr_desc);
> >> +static struct irq_phys_map *vgic_irq_map_search(struct kvm_vcpu *vcpu,
> >> +                                             int virt_irq);
> >>
> >>  static const struct vgic_ops *vgic_ops;
> >>  static const struct vgic_params *vgic;
> >> @@ -1583,6 +1586,166 @@ static irqreturn_t vgic_maintenance_handler(int irq, void *data)
> >>       return IRQ_HANDLED;
> >>  }
> >>
> >> +static struct list_head *vgic_get_irq_phys_map_list(struct kvm_vcpu *vcpu,
> >> +                                                 int virt_irq)
> >> +{
> >> +     if (virt_irq < VGIC_NR_PRIVATE_IRQS)
> >> +             return &vcpu->arch.vgic_cpu.irq_phys_map_list;
> >> +     else
> >> +             return &vcpu->kvm->arch.vgic.irq_phys_map_list;
> >> +}
> >> +
> >> +/**
> >> + * kvm_vgic_map_phys_irq - map a virtual IRQ to a physical IRQ
> >> + * @vcpu: The VCPU pointer
> >> + * @virt_irq: The virtual irq number
> >> + * @irq: The Linux IRQ number
> >> + *
> >> + * Establish a mapping between a guest visible irq (@virt_irq) and a
> >> + * Linux irq (@irq). On injection, @virt_irq will be associated with
> >> + * the physical interrupt represented by @irq.
> >> + *
> >> + * Returns a valid pointer on success, and an error pointer otherwise
> >> + */
> >> +struct irq_phys_map *kvm_vgic_map_phys_irq(struct kvm_vcpu *vcpu,
> >> +                                        int virt_irq, int irq)
> >> +{
> >> +     struct vgic_dist *dist = &vcpu->kvm->arch.vgic;
> >> +     struct list_head *root = vgic_get_irq_phys_map_list(vcpu, virt_irq);
> >> +     struct irq_phys_map *map;
> >> +     struct irq_phys_map_entry *entry;
> >> +     struct irq_desc *desc;
> >> +     struct irq_data *data;
> >> +     int phys_irq;
> >> +
> >> +     desc = irq_to_desc(irq);
> >> +     if (!desc) {
> >> +             kvm_err("kvm_vgic_map_phys_irq: no interrupt descriptor\n");
> > 
> > super over-ridiculous nit:
> > if you respin you could consider "%s: no...", __FUNC__
> 
> Sure.
> 
> > 
> >> +             return ERR_PTR(-EINVAL);
> >> +     }
> >> +
> >> +     data = irq_desc_get_irq_data(desc);
> >> +     while (data->parent_data)
> >> +             data = data->parent_data;
> >> +
> >> +     phys_irq = data->hwirq;
> >> +
> >> +     /* Create a new mapping */
> >> +     entry = kzalloc(sizeof(*entry), GFP_KERNEL);
> >> +     if (!entry)
> >> +             return ERR_PTR(-ENOMEM);
> >> +
> >> +     spin_lock(&dist->irq_phys_map_lock);
> >> +
> >> +     /* Try to match an existing mapping */
> >> +     map = vgic_irq_map_search(vcpu, virt_irq);
> >> +     if (map) {
> >> +             /* Make sure this mapping matches */
> > 
> > why do we need to check that it matches?  Is it ever allowed to have
> > multiple mappings of a virtual interrupt?
> > 
> 
> Multiple *different* mappings, no. What you can have though is issuing
> requests for the *same* mapping repeatedly. This is what happens with
> the timer, for example: kvm_timer_vcpu_reset is going to be called more
> than once over the lifetime of a VM, requesting a mapping each time
> without necessarily destroying the previous one.
> 
> >> +             if (map->phys_irq != phys_irq   ||
> >> +                 map->irq      != irq)
> >> +                     map = ERR_PTR(-EINVAL);
> >> +
> >> +             goto out;
> > 
> > why is this a valid situation?  Shouldn't you return -EEXIST or
> > something instead?  (that would also simplify the error-checking free
> > routine below).
> > 
> > If you want it to be valid to map the same virq to irq multiple times
> > (to free the caller from some bookkeeping, I presume?) then that should
> > be part of the function documentation, and I think we should add a
> > "found existing mapping" comment above the goto out statement.
> 
> Fair enough. I'll add some comments to that effect.
> 

thanks!

> >> +     }
> >> +
> >> +     map           = &entry->map;
> >> +     map->virt_irq = virt_irq;
> >> +     map->phys_irq = phys_irq;
> >> +     map->irq      = irq;
> >> +
> >> +     list_add_tail_rcu(&entry->entry, root);
> >> +
> >> +out:
> >> +     spin_unlock(&dist->irq_phys_map_lock);
> >> +     /* If we've found a hit in the existing list, free the useless
> >> +      * entry */
> >> +     if (IS_ERR(map) || map != &entry->map)
> >> +             kfree(entry);
> >> +     return map;
> >> +}
> >> +
> >> +static struct irq_phys_map *vgic_irq_map_search(struct kvm_vcpu *vcpu,
> >> +                                             int virt_irq)
> >> +{
> >> +     struct list_head *root = vgic_get_irq_phys_map_list(vcpu, virt_irq);
> >> +     struct irq_phys_map_entry *entry;
> >> +     struct irq_phys_map *map;
> >> +
> >> +     rcu_read_lock();
> >> +
> >> +     list_for_each_entry_rcu(entry, root, entry) {
> >> +             map = &entry->map;
> >> +             if (map->virt_irq == virt_irq && !map->deleted) {
> >> +                     rcu_read_unlock();
> >> +                     return map;
> >> +             }
> >> +     }
> >> +
> >> +     rcu_read_unlock();
> >> +
> >> +     return NULL;
> >> +}
> >> +
> >> +static void vgic_free_phys_irq_map_rcu(struct rcu_head *rcu)
> >> +{
> >> +     struct irq_phys_map_entry *entry;
> >> +
> >> +     entry = container_of(rcu, struct irq_phys_map_entry, rcu);
> >> +     kfree(entry);
> >> +}
> >> +
> >> +/**
> >> + * kvm_vgic_unmap_phys_irq - Remove a virtual to physical IRQ mapping
> >> + * @vcpu: The VCPU pointer
> >> + * @map: The pointer to a mapping obtained through kvm_vgic_map_phys_irq
> >> + *
> >> + * Remove an existing mapping between virtual and physical interrupts.
> >> + */
> >> +int kvm_vgic_unmap_phys_irq(struct kvm_vcpu *vcpu, struct irq_phys_map *map)
> >> +{
> >> +     struct vgic_dist *dist = &vcpu->kvm->arch.vgic;
> >> +     struct irq_phys_map_entry *entry;
> >> +     struct list_head *root;
> >> +
> >> +     if (!map)
> >> +             return -EINVAL;
> >> +
> >> +     root = vgic_get_irq_phys_map_list(vcpu, map->virt_irq);
> >> +
> >> +     spin_lock(&dist->irq_phys_map_lock);
> >> +
> >> +     list_for_each_entry(entry, root, entry) {
> >> +             if (&entry->map == map && !map->deleted) {
> >> +                     map->deleted = true;
> > 
> > why do you need the 'deleted' flag?  You now search the list only after
> > holding the lock, so the race I pointed out before doesn't exist
> > anymore.  Is there an additional issue that the deleted flag is taking
> > care of?
> 
> This could still race with a destroy occuring before we take the lock,
> and before we get get RCU to do the cleanup (the list would still be
> populated).
> 

That's not how I understand list_del_rcu; I think it deletes the entry
immediately with the right memory barriers.  It is only the free
operation that happens on rcu sync and can be/is deferred.

So I'm pretty sure that when you search the list after holding the lock,
there is no race.

> > 
> >> +                     list_del_rcu(&entry->entry);
> >> +                     call_rcu(&entry->rcu, vgic_free_phys_irq_map_rcu);
> >> +                     break;
> >> +             }
> >> +     }
> >> +
> >> +     spin_unlock(&dist->irq_phys_map_lock);
> >> +
> >> +     return 0;
> >> +}
> >> +
> >> +static void vgic_destroy_irq_phys_map(struct kvm *kvm, struct list_head *root)
> >> +{
> >> +     struct vgic_dist *dist = &kvm->arch.vgic;
> >> +     struct irq_phys_map_entry *entry;
> >> +
> >> +     spin_lock(&dist->irq_phys_map_lock);
> >> +
> >> +     list_for_each_entry(entry, root, entry) {
> >> +             if (!entry->map.deleted) {
> >> +                     entry->map.deleted = true;
> >> +                     list_del_rcu(&entry->entry);
> >> +                     call_rcu(&entry->rcu, vgic_free_phys_irq_map_rcu);
> >> +             }
> >> +     }
> >> +
> >> +     spin_unlock(&dist->irq_phys_map_lock);
> >> +}
> >> +
> >>  void kvm_vgic_vcpu_destroy(struct kvm_vcpu *vcpu)
> >>  {
> >>       struct vgic_cpu *vgic_cpu = &vcpu->arch.vgic_cpu;
> >> @@ -1591,6 +1754,7 @@ void kvm_vgic_vcpu_destroy(struct kvm_vcpu *vcpu)
> >>       kfree(vgic_cpu->active_shared);
> >>       kfree(vgic_cpu->pend_act_shared);
> >>       kfree(vgic_cpu->vgic_irq_lr_map);
> >> +     vgic_destroy_irq_phys_map(vcpu->kvm, &vgic_cpu->irq_phys_map_list);
> >>       vgic_cpu->pending_shared = NULL;
> >>       vgic_cpu->active_shared = NULL;
> >>       vgic_cpu->pend_act_shared = NULL;
> >> @@ -1628,6 +1792,17 @@ static int vgic_vcpu_init_maps(struct kvm_vcpu *vcpu, int nr_irqs)
> >>  }
> >>
> >>  /**
> >> + * kvm_vgic_vcpu_early_init - Earliest possible per-vcpu vgic init stage
> > 
> > earliest possible... oh how I love the vgic code.
> > 
> 
> You must be the only one! ;-)
> 
> > If you have the energy and context to write a comment in the beginning
> > of this file describing the init flow, I think that would be useful...
> > 
> 
> How about starting with this?
> 
> <quote>
> There are multiple stages to the vgic initialization, both for the
> distributor and the CPU interfaces.
> 
> Distributor:
> 
> - kvm_vgic_early_init(): initialization of static data that doesn't
> depend on any sizing information or emulation type. No allocation is
> allowed there.
> 
> - vgic_init(): allocation and initialization of the generic data
> structures that depend on sizing information (number of CPUs, number of
> interrupts). Also initializes the vcpu specific data structures. Can be
> executed lazily for GICv2. [to be renamed to kvm_vgic_init??]
> 
> CPU Interface:
> 
> - kvm_vgic_cpu_early_init(): initialization of static data that doesn't
> depend on any sizing information or emulation type. No allocation is
> allowed there.
> </quote>
> 
> I'm sure we can expand it as we go...
> 

yes, this is great, it made me feel comfortable with the whole thing.
Thanks!

> >> + *
> >> + * No memory allocation should be performed here, only static init.
> >> + */
> >> +void kvm_vgic_vcpu_early_init(struct kvm_vcpu *vcpu)
> >> +{
> >> +     struct vgic_cpu *vgic_cpu = &vcpu->arch.vgic_cpu;
> >> +     INIT_LIST_HEAD(&vgic_cpu->irq_phys_map_list);
> >> +}
> >> +
> >> +/**
> >>   * kvm_vgic_get_max_vcpus - Get the maximum number of VCPUs allowed by HW
> >>   *
> >>   * The host's GIC naturally limits the maximum amount of VCPUs a guest
> >> @@ -1664,6 +1839,7 @@ void kvm_vgic_destroy(struct kvm *kvm)
> >>       kfree(dist->irq_spi_target);
> >>       kfree(dist->irq_pending_on_cpu);
> >>       kfree(dist->irq_active_on_cpu);
> >> +     vgic_destroy_irq_phys_map(kvm, &dist->irq_phys_map_list);
> >>       dist->irq_sgi_sources = NULL;
> >>       dist->irq_spi_cpu = NULL;
> >>       dist->irq_spi_target = NULL;
> >> @@ -1787,6 +1963,18 @@ static int init_vgic_model(struct kvm *kvm, int type)
> >>       return 0;
> >>  }
> >>
> >> +/**
> >> + * kvm_vgic_early_init - Earliest possible vgic initialization stage
> >> + *
> >> + * No memory allocation should be performed here, only static init.
> >> + */
> >> +void kvm_vgic_early_init(struct kvm *kvm)
> >> +{
> >> +     spin_lock_init(&kvm->arch.vgic.lock);
> > 
> > so here we're initializing the data structures even before the vgic is
> > created, right?  So therefore it's safe to move the vgic.lock init up
> > here?
> 
> Yes. The distributor structure is part of the struct kvm, hence safe to
> poke early on. This doesn't depend on any form of sizing or emulation
> type either.
> 

ok, slight warm fuzzy feeling.

Thanks,
-Christoffer

^ permalink raw reply	[flat|nested] 74+ messages in thread

* [PATCH v3 06/11] KVM: arm/arm64: vgic: Allow dynamic mapping of physical/virtual interrupts
@ 2015-08-04 17:36         ` Christoffer Dall
  0 siblings, 0 replies; 74+ messages in thread
From: Christoffer Dall @ 2015-08-04 17:36 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, Aug 04, 2015 at 04:27:03PM +0100, Marc Zyngier wrote:
> On 04/08/15 14:04, Christoffer Dall wrote:
> > On Fri, Jul 24, 2015 at 04:55:04PM +0100, Marc Zyngier wrote:
> >> In order to be able to feed physical interrupts to a guest, we need
> >> to be able to establish the virtual-physical mapping between the two
> >> worlds.
> >>
> >> The mappings are kept in a set of RCU lists, indexed by virtual interrupts.
> >>
> >> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
> >> ---
> >>  arch/arm/kvm/arm.c     |   2 +
> >>  include/kvm/arm_vgic.h |  26 +++++++
> >>  virt/kvm/arm/vgic.c    | 189 ++++++++++++++++++++++++++++++++++++++++++++++++-
> >>  3 files changed, 216 insertions(+), 1 deletion(-)
> >>
> >> diff --git a/arch/arm/kvm/arm.c b/arch/arm/kvm/arm.c
> >> index f1bf418..ce404a5 100644
> >> --- a/arch/arm/kvm/arm.c
> >> +++ b/arch/arm/kvm/arm.c
> >> @@ -125,6 +125,7 @@ int kvm_arch_init_vm(struct kvm *kvm, unsigned long type)
> >>       if (ret)
> >>               goto out_free_stage2_pgd;
> >>
> >> +     kvm_vgic_early_init(kvm);
> >>       kvm_timer_init(kvm);
> >>
> >>       /* Mark the initial VMID generation invalid */
> >> @@ -249,6 +250,7 @@ out:
> >>
> >>  void kvm_arch_vcpu_postcreate(struct kvm_vcpu *vcpu)
> >>  {
> >> +     kvm_vgic_vcpu_early_init(vcpu);
> >>  }
> >>
> >>  void kvm_arch_vcpu_free(struct kvm_vcpu *vcpu)
> >> diff --git a/include/kvm/arm_vgic.h b/include/kvm/arm_vgic.h
> >> index a881e39..68212a1 100644
> >> --- a/include/kvm/arm_vgic.h
> >> +++ b/include/kvm/arm_vgic.h
> >> @@ -159,6 +159,20 @@ struct vgic_io_device {
> >>       struct kvm_io_device dev;
> >>  };
> >>
> >> +struct irq_phys_map {
> >> +     u32                     virt_irq;
> >> +     u32                     phys_irq;
> >> +     u32                     irq;
> >> +     bool                    deleted;
> >> +     bool                    active;
> >> +};
> >> +
> >> +struct irq_phys_map_entry {
> >> +     struct list_head        entry;
> >> +     struct rcu_head         rcu;
> >> +     struct irq_phys_map     map;
> >> +};
> >> +
> >>  struct vgic_dist {
> >>       spinlock_t              lock;
> >>       bool                    in_kernel;
> >> @@ -256,6 +270,10 @@ struct vgic_dist {
> >>       struct vgic_vm_ops      vm_ops;
> >>       struct vgic_io_device   dist_iodev;
> >>       struct vgic_io_device   *redist_iodevs;
> >> +
> >> +     /* Virtual irq to hwirq mapping */
> >> +     spinlock_t              irq_phys_map_lock;
> >> +     struct list_head        irq_phys_map_list;
> >>  };
> >>
> >>  struct vgic_v2_cpu_if {
> >> @@ -307,6 +325,9 @@ struct vgic_cpu {
> >>               struct vgic_v2_cpu_if   vgic_v2;
> >>               struct vgic_v3_cpu_if   vgic_v3;
> >>       };
> >> +
> >> +     /* Protected by the distributor's irq_phys_map_lock */
> >> +     struct list_head        irq_phys_map_list;
> >>  };
> >>
> >>  #define LR_EMPTY     0xff
> >> @@ -321,8 +342,10 @@ int kvm_vgic_addr(struct kvm *kvm, unsigned long type, u64 *addr, bool write);
> >>  int kvm_vgic_hyp_init(void);
> >>  int kvm_vgic_map_resources(struct kvm *kvm);
> >>  int kvm_vgic_get_max_vcpus(void);
> >> +void kvm_vgic_early_init(struct kvm *kvm);
> >>  int kvm_vgic_create(struct kvm *kvm, u32 type);
> >>  void kvm_vgic_destroy(struct kvm *kvm);
> >> +void kvm_vgic_vcpu_early_init(struct kvm_vcpu *vcpu);
> >>  void kvm_vgic_vcpu_destroy(struct kvm_vcpu *vcpu);
> >>  void kvm_vgic_flush_hwstate(struct kvm_vcpu *vcpu);
> >>  void kvm_vgic_sync_hwstate(struct kvm_vcpu *vcpu);
> >> @@ -331,6 +354,9 @@ int kvm_vgic_inject_irq(struct kvm *kvm, int cpuid, unsigned int irq_num,
> >>  void vgic_v3_dispatch_sgi(struct kvm_vcpu *vcpu, u64 reg);
> >>  int kvm_vgic_vcpu_pending_irq(struct kvm_vcpu *vcpu);
> >>  int kvm_vgic_vcpu_active_irq(struct kvm_vcpu *vcpu);
> >> +struct irq_phys_map *kvm_vgic_map_phys_irq(struct kvm_vcpu *vcpu,
> >> +                                        int virt_irq, int irq);
> >> +int kvm_vgic_unmap_phys_irq(struct kvm_vcpu *vcpu, struct irq_phys_map *map);
> >>
> >>  #define irqchip_in_kernel(k) (!!((k)->arch.vgic.in_kernel))
> >>  #define vgic_initialized(k)  (!!((k)->arch.vgic.nr_cpus))
> >> diff --git a/virt/kvm/arm/vgic.c b/virt/kvm/arm/vgic.c
> >> index 5bd1695..c74d929 100644
> >> --- a/virt/kvm/arm/vgic.c
> >> +++ b/virt/kvm/arm/vgic.c
> >> @@ -24,6 +24,7 @@
> >>  #include <linux/of.h>
> >>  #include <linux/of_address.h>
> >>  #include <linux/of_irq.h>
> >> +#include <linux/rculist.h>
> >>  #include <linux/uaccess.h>
> >>
> >>  #include <asm/kvm_emulate.h>
> >> @@ -82,6 +83,8 @@ static void vgic_retire_disabled_irqs(struct kvm_vcpu *vcpu);
> >>  static void vgic_retire_lr(int lr_nr, int irq, struct kvm_vcpu *vcpu);
> >>  static struct vgic_lr vgic_get_lr(const struct kvm_vcpu *vcpu, int lr);
> >>  static void vgic_set_lr(struct kvm_vcpu *vcpu, int lr, struct vgic_lr lr_desc);
> >> +static struct irq_phys_map *vgic_irq_map_search(struct kvm_vcpu *vcpu,
> >> +                                             int virt_irq);
> >>
> >>  static const struct vgic_ops *vgic_ops;
> >>  static const struct vgic_params *vgic;
> >> @@ -1583,6 +1586,166 @@ static irqreturn_t vgic_maintenance_handler(int irq, void *data)
> >>       return IRQ_HANDLED;
> >>  }
> >>
> >> +static struct list_head *vgic_get_irq_phys_map_list(struct kvm_vcpu *vcpu,
> >> +                                                 int virt_irq)
> >> +{
> >> +     if (virt_irq < VGIC_NR_PRIVATE_IRQS)
> >> +             return &vcpu->arch.vgic_cpu.irq_phys_map_list;
> >> +     else
> >> +             return &vcpu->kvm->arch.vgic.irq_phys_map_list;
> >> +}
> >> +
> >> +/**
> >> + * kvm_vgic_map_phys_irq - map a virtual IRQ to a physical IRQ
> >> + * @vcpu: The VCPU pointer
> >> + * @virt_irq: The virtual irq number
> >> + * @irq: The Linux IRQ number
> >> + *
> >> + * Establish a mapping between a guest visible irq (@virt_irq) and a
> >> + * Linux irq (@irq). On injection, @virt_irq will be associated with
> >> + * the physical interrupt represented by @irq.
> >> + *
> >> + * Returns a valid pointer on success, and an error pointer otherwise
> >> + */
> >> +struct irq_phys_map *kvm_vgic_map_phys_irq(struct kvm_vcpu *vcpu,
> >> +                                        int virt_irq, int irq)
> >> +{
> >> +     struct vgic_dist *dist = &vcpu->kvm->arch.vgic;
> >> +     struct list_head *root = vgic_get_irq_phys_map_list(vcpu, virt_irq);
> >> +     struct irq_phys_map *map;
> >> +     struct irq_phys_map_entry *entry;
> >> +     struct irq_desc *desc;
> >> +     struct irq_data *data;
> >> +     int phys_irq;
> >> +
> >> +     desc = irq_to_desc(irq);
> >> +     if (!desc) {
> >> +             kvm_err("kvm_vgic_map_phys_irq: no interrupt descriptor\n");
> > 
> > super over-ridiculous nit:
> > if you respin you could consider "%s: no...", __FUNC__
> 
> Sure.
> 
> > 
> >> +             return ERR_PTR(-EINVAL);
> >> +     }
> >> +
> >> +     data = irq_desc_get_irq_data(desc);
> >> +     while (data->parent_data)
> >> +             data = data->parent_data;
> >> +
> >> +     phys_irq = data->hwirq;
> >> +
> >> +     /* Create a new mapping */
> >> +     entry = kzalloc(sizeof(*entry), GFP_KERNEL);
> >> +     if (!entry)
> >> +             return ERR_PTR(-ENOMEM);
> >> +
> >> +     spin_lock(&dist->irq_phys_map_lock);
> >> +
> >> +     /* Try to match an existing mapping */
> >> +     map = vgic_irq_map_search(vcpu, virt_irq);
> >> +     if (map) {
> >> +             /* Make sure this mapping matches */
> > 
> > why do we need to check that it matches?  Is it ever allowed to have
> > multiple mappings of a virtual interrupt?
> > 
> 
> Multiple *different* mappings, no. What you can have though is issuing
> requests for the *same* mapping repeatedly. This is what happens with
> the timer, for example: kvm_timer_vcpu_reset is going to be called more
> than once over the lifetime of a VM, requesting a mapping each time
> without necessarily destroying the previous one.
> 
> >> +             if (map->phys_irq != phys_irq   ||
> >> +                 map->irq      != irq)
> >> +                     map = ERR_PTR(-EINVAL);
> >> +
> >> +             goto out;
> > 
> > why is this a valid situation?  Shouldn't you return -EEXIST or
> > something instead?  (that would also simplify the error-checking free
> > routine below).
> > 
> > If you want it to be valid to map the same virq to irq multiple times
> > (to free the caller from some bookkeeping, I presume?) then that should
> > be part of the function documentation, and I think we should add a
> > "found existing mapping" comment above the goto out statement.
> 
> Fair enough. I'll add some comments to that effect.
> 

thanks!

> >> +     }
> >> +
> >> +     map           = &entry->map;
> >> +     map->virt_irq = virt_irq;
> >> +     map->phys_irq = phys_irq;
> >> +     map->irq      = irq;
> >> +
> >> +     list_add_tail_rcu(&entry->entry, root);
> >> +
> >> +out:
> >> +     spin_unlock(&dist->irq_phys_map_lock);
> >> +     /* If we've found a hit in the existing list, free the useless
> >> +      * entry */
> >> +     if (IS_ERR(map) || map != &entry->map)
> >> +             kfree(entry);
> >> +     return map;
> >> +}
> >> +
> >> +static struct irq_phys_map *vgic_irq_map_search(struct kvm_vcpu *vcpu,
> >> +                                             int virt_irq)
> >> +{
> >> +     struct list_head *root = vgic_get_irq_phys_map_list(vcpu, virt_irq);
> >> +     struct irq_phys_map_entry *entry;
> >> +     struct irq_phys_map *map;
> >> +
> >> +     rcu_read_lock();
> >> +
> >> +     list_for_each_entry_rcu(entry, root, entry) {
> >> +             map = &entry->map;
> >> +             if (map->virt_irq == virt_irq && !map->deleted) {
> >> +                     rcu_read_unlock();
> >> +                     return map;
> >> +             }
> >> +     }
> >> +
> >> +     rcu_read_unlock();
> >> +
> >> +     return NULL;
> >> +}
> >> +
> >> +static void vgic_free_phys_irq_map_rcu(struct rcu_head *rcu)
> >> +{
> >> +     struct irq_phys_map_entry *entry;
> >> +
> >> +     entry = container_of(rcu, struct irq_phys_map_entry, rcu);
> >> +     kfree(entry);
> >> +}
> >> +
> >> +/**
> >> + * kvm_vgic_unmap_phys_irq - Remove a virtual to physical IRQ mapping
> >> + * @vcpu: The VCPU pointer
> >> + * @map: The pointer to a mapping obtained through kvm_vgic_map_phys_irq
> >> + *
> >> + * Remove an existing mapping between virtual and physical interrupts.
> >> + */
> >> +int kvm_vgic_unmap_phys_irq(struct kvm_vcpu *vcpu, struct irq_phys_map *map)
> >> +{
> >> +     struct vgic_dist *dist = &vcpu->kvm->arch.vgic;
> >> +     struct irq_phys_map_entry *entry;
> >> +     struct list_head *root;
> >> +
> >> +     if (!map)
> >> +             return -EINVAL;
> >> +
> >> +     root = vgic_get_irq_phys_map_list(vcpu, map->virt_irq);
> >> +
> >> +     spin_lock(&dist->irq_phys_map_lock);
> >> +
> >> +     list_for_each_entry(entry, root, entry) {
> >> +             if (&entry->map == map && !map->deleted) {
> >> +                     map->deleted = true;
> > 
> > why do you need the 'deleted' flag?  You now search the list only after
> > holding the lock, so the race I pointed out before doesn't exist
> > anymore.  Is there an additional issue that the deleted flag is taking
> > care of?
> 
> This could still race with a destroy occuring before we take the lock,
> and before we get get RCU to do the cleanup (the list would still be
> populated).
> 

That's not how I understand list_del_rcu; I think it deletes the entry
immediately with the right memory barriers.  It is only the free
operation that happens on rcu sync and can be/is deferred.

So I'm pretty sure that when you search the list after holding the lock,
there is no race.

> > 
> >> +                     list_del_rcu(&entry->entry);
> >> +                     call_rcu(&entry->rcu, vgic_free_phys_irq_map_rcu);
> >> +                     break;
> >> +             }
> >> +     }
> >> +
> >> +     spin_unlock(&dist->irq_phys_map_lock);
> >> +
> >> +     return 0;
> >> +}
> >> +
> >> +static void vgic_destroy_irq_phys_map(struct kvm *kvm, struct list_head *root)
> >> +{
> >> +     struct vgic_dist *dist = &kvm->arch.vgic;
> >> +     struct irq_phys_map_entry *entry;
> >> +
> >> +     spin_lock(&dist->irq_phys_map_lock);
> >> +
> >> +     list_for_each_entry(entry, root, entry) {
> >> +             if (!entry->map.deleted) {
> >> +                     entry->map.deleted = true;
> >> +                     list_del_rcu(&entry->entry);
> >> +                     call_rcu(&entry->rcu, vgic_free_phys_irq_map_rcu);
> >> +             }
> >> +     }
> >> +
> >> +     spin_unlock(&dist->irq_phys_map_lock);
> >> +}
> >> +
> >>  void kvm_vgic_vcpu_destroy(struct kvm_vcpu *vcpu)
> >>  {
> >>       struct vgic_cpu *vgic_cpu = &vcpu->arch.vgic_cpu;
> >> @@ -1591,6 +1754,7 @@ void kvm_vgic_vcpu_destroy(struct kvm_vcpu *vcpu)
> >>       kfree(vgic_cpu->active_shared);
> >>       kfree(vgic_cpu->pend_act_shared);
> >>       kfree(vgic_cpu->vgic_irq_lr_map);
> >> +     vgic_destroy_irq_phys_map(vcpu->kvm, &vgic_cpu->irq_phys_map_list);
> >>       vgic_cpu->pending_shared = NULL;
> >>       vgic_cpu->active_shared = NULL;
> >>       vgic_cpu->pend_act_shared = NULL;
> >> @@ -1628,6 +1792,17 @@ static int vgic_vcpu_init_maps(struct kvm_vcpu *vcpu, int nr_irqs)
> >>  }
> >>
> >>  /**
> >> + * kvm_vgic_vcpu_early_init - Earliest possible per-vcpu vgic init stage
> > 
> > earliest possible... oh how I love the vgic code.
> > 
> 
> You must be the only one! ;-)
> 
> > If you have the energy and context to write a comment in the beginning
> > of this file describing the init flow, I think that would be useful...
> > 
> 
> How about starting with this?
> 
> <quote>
> There are multiple stages to the vgic initialization, both for the
> distributor and the CPU interfaces.
> 
> Distributor:
> 
> - kvm_vgic_early_init(): initialization of static data that doesn't
> depend on any sizing information or emulation type. No allocation is
> allowed there.
> 
> - vgic_init(): allocation and initialization of the generic data
> structures that depend on sizing information (number of CPUs, number of
> interrupts). Also initializes the vcpu specific data structures. Can be
> executed lazily for GICv2. [to be renamed to kvm_vgic_init??]
> 
> CPU Interface:
> 
> - kvm_vgic_cpu_early_init(): initialization of static data that doesn't
> depend on any sizing information or emulation type. No allocation is
> allowed there.
> </quote>
> 
> I'm sure we can expand it as we go...
> 

yes, this is great, it made me feel comfortable with the whole thing.
Thanks!

> >> + *
> >> + * No memory allocation should be performed here, only static init.
> >> + */
> >> +void kvm_vgic_vcpu_early_init(struct kvm_vcpu *vcpu)
> >> +{
> >> +     struct vgic_cpu *vgic_cpu = &vcpu->arch.vgic_cpu;
> >> +     INIT_LIST_HEAD(&vgic_cpu->irq_phys_map_list);
> >> +}
> >> +
> >> +/**
> >>   * kvm_vgic_get_max_vcpus - Get the maximum number of VCPUs allowed by HW
> >>   *
> >>   * The host's GIC naturally limits the maximum amount of VCPUs a guest
> >> @@ -1664,6 +1839,7 @@ void kvm_vgic_destroy(struct kvm *kvm)
> >>       kfree(dist->irq_spi_target);
> >>       kfree(dist->irq_pending_on_cpu);
> >>       kfree(dist->irq_active_on_cpu);
> >> +     vgic_destroy_irq_phys_map(kvm, &dist->irq_phys_map_list);
> >>       dist->irq_sgi_sources = NULL;
> >>       dist->irq_spi_cpu = NULL;
> >>       dist->irq_spi_target = NULL;
> >> @@ -1787,6 +1963,18 @@ static int init_vgic_model(struct kvm *kvm, int type)
> >>       return 0;
> >>  }
> >>
> >> +/**
> >> + * kvm_vgic_early_init - Earliest possible vgic initialization stage
> >> + *
> >> + * No memory allocation should be performed here, only static init.
> >> + */
> >> +void kvm_vgic_early_init(struct kvm *kvm)
> >> +{
> >> +     spin_lock_init(&kvm->arch.vgic.lock);
> > 
> > so here we're initializing the data structures even before the vgic is
> > created, right?  So therefore it's safe to move the vgic.lock init up
> > here?
> 
> Yes. The distributor structure is part of the struct kvm, hence safe to
> poke early on. This doesn't depend on any form of sizing or emulation
> type either.
> 

ok, slight warm fuzzy feeling.

Thanks,
-Christoffer

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH v3 09/11] KVM: arm/arm64: vgic: Prevent userspace injection of a mapped interrupt
  2015-08-04 16:02       ` Marc Zyngier
@ 2015-08-04 17:38         ` Christoffer Dall
  -1 siblings, 0 replies; 74+ messages in thread
From: Christoffer Dall @ 2015-08-04 17:38 UTC (permalink / raw)
  To: Marc Zyngier; +Cc: kvm, Andre Przywara, kvmarm, linux-arm-kernel

On Tue, Aug 04, 2015 at 05:02:41PM +0100, Marc Zyngier wrote:
> On 04/08/15 14:45, Christoffer Dall wrote:
> > On Fri, Jul 24, 2015 at 04:55:07PM +0100, Marc Zyngier wrote:
> >> Virtual interrupts mapped to a HW interrupt should only be triggered
> >> from inside the kernel. Otherwise, you could end up confusing the
> >> kernel (and the GIC's) state machine.
> >>
> >> Rearrange the injection path so that kvm_vgic_inject_irq is
> >> used for non-mapped interrupts, and kvm_vgic_inject_mapped_irq is
> >> used for mapped interrupts. The latter should only be called from
> >> inside the kernel (timer, VFIO).
> >>
> >> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
> >> ---
> >>  include/kvm/arm_vgic.h |  2 +
> >>  virt/kvm/arm/vgic.c    | 99 ++++++++++++++++++++++++++++++++++----------------
> >>  2 files changed, 70 insertions(+), 31 deletions(-)
> >>
> >> diff --git a/include/kvm/arm_vgic.h b/include/kvm/arm_vgic.h
> >> index 7306b4b..f6bfd79 100644
> >> --- a/include/kvm/arm_vgic.h
> >> +++ b/include/kvm/arm_vgic.h
> >> @@ -351,6 +351,8 @@ void kvm_vgic_flush_hwstate(struct kvm_vcpu *vcpu);
> >>  void kvm_vgic_sync_hwstate(struct kvm_vcpu *vcpu);
> >>  int kvm_vgic_inject_irq(struct kvm *kvm, int cpuid, unsigned int irq_num,
> >>  			bool level);
> >> +int kvm_vgic_inject_mapped_irq(struct kvm *kvm, int cpuid,
> >> +			       struct irq_phys_map *map, bool level);
> >>  void vgic_v3_dispatch_sgi(struct kvm_vcpu *vcpu, u64 reg);
> >>  int kvm_vgic_vcpu_pending_irq(struct kvm_vcpu *vcpu);
> >>  int kvm_vgic_vcpu_active_irq(struct kvm_vcpu *vcpu);
> >> diff --git a/virt/kvm/arm/vgic.c b/virt/kvm/arm/vgic.c
> >> index 3f7b690..e40ef70 100644
> >> --- a/virt/kvm/arm/vgic.c
> >> +++ b/virt/kvm/arm/vgic.c
> >> @@ -1533,7 +1533,8 @@ static int vgic_validate_injection(struct kvm_vcpu *vcpu, int irq, int level)
> >>  }
> >>  
> >>  static int vgic_update_irq_pending(struct kvm *kvm, int cpuid,
> >> -				  unsigned int irq_num, bool level)
> >> +				   struct irq_phys_map *map,
> >> +				   unsigned int irq_num, bool level)
> >>  {
> >>  	struct vgic_dist *dist = &kvm->arch.vgic;
> >>  	struct kvm_vcpu *vcpu;
> >> @@ -1541,6 +1542,9 @@ static int vgic_update_irq_pending(struct kvm *kvm, int cpuid,
> >>  	int enabled;
> >>  	bool ret = true, can_inject = true;
> >>  
> >> +	if (irq_num >= min(kvm->arch.vgic.nr_irqs, 1020))
> >> +		return -EINVAL;
> >> +
> >>  	spin_lock(&dist->lock);
> >>  
> >>  	vcpu = kvm_get_vcpu(kvm, cpuid);
> >> @@ -1603,14 +1607,42 @@ static int vgic_update_irq_pending(struct kvm *kvm, int cpuid,
> >>  out:
> >>  	spin_unlock(&dist->lock);
> >>  
> >> -	return ret ? cpuid : -EINVAL;
> >> +	if (!ret) {
> > 
> > don't you mean if (ret) here?  hint: ret is a bool
> 
> Ouch. Nice catch!
> 
> > 
> >> +		/* kick the specified vcpu */
> >> +		kvm_vcpu_kick(kvm_get_vcpu(kvm, cpuid));
> >> +	}
> >> +
> >> +	return 0;
> > 
> > isn't this a change in the internal API?
> > Before, we would return -EINVAL when ret is false.  Not sure if this
> > has any consequences though?
> 
> I don't think this is a change in API. Before this patch, we would
> either return a vcpuid or -EINVAL. But the error would not be propagated
> beyond kvm_vgic_inject_irq, effectively discarding the error code.
> 
> Also, it is a bit odd to return an error because the toggling of the
> line wasn't significant (like bringing the line down on an
> edge-triggered interrupt).
> 

true, indeed, my brain was too fried to think it through.

(why does coming back from vacation always involve paging in weird vgic
stuff for me?)

> > 
> >> +}
> >> +
> >> +static int vgic_lazy_init(struct kvm *kvm)
> >> +{
> >> +	int ret = 0;
> >> +
> >> +	if (unlikely(!vgic_initialized(kvm))) {
> >> +		/*
> >> +		 * We only provide the automatic initialization of the VGIC
> >> +		 * for the legacy case of a GICv2. Any other type must
> >> +		 * be explicitly initialized once setup with the respective
> >> +		 * KVM device call.
> >> +		 */
> >> +		if (kvm->arch.vgic.vgic_model != KVM_DEV_TYPE_ARM_VGIC_V2)
> >> +			return -EBUSY;
> >> +
> >> +		mutex_lock(&kvm->lock);
> >> +		ret = vgic_init(kvm);
> >> +		mutex_unlock(&kvm->lock);
> >> +	}
> >> +
> >> +	return ret;
> >>  }
> >>  
> >>  /**
> >>   * kvm_vgic_inject_irq - Inject an IRQ from a device to the vgic
> >>   * @kvm:     The VM structure pointer
> >>   * @cpuid:   The CPU for PPIs
> >> - * @irq_num: The IRQ number that is assigned to the device
> >> + * @irq_num: The IRQ number that is assigned to the device. This IRQ
> >> + *           must not be mapped to a HW interrupt.
> >>   * @level:   Edge-triggered:  true:  to trigger the interrupt
> >>   *			      false: to ignore the call
> >>   *	     Level-sensitive  true:  activates an interrupt
> >> @@ -1623,39 +1655,44 @@ out:
> >>  int kvm_vgic_inject_irq(struct kvm *kvm, int cpuid, unsigned int irq_num,
> >>  			bool level)
> >>  {
> >> -	int ret = 0;
> >> -	int vcpu_id;
> >> -
> >> -	if (unlikely(!vgic_initialized(kvm))) {
> >> -		/*
> >> -		 * We only provide the automatic initialization of the VGIC
> >> -		 * for the legacy case of a GICv2. Any other type must
> >> -		 * be explicitly initialized once setup with the respective
> >> -		 * KVM device call.
> >> -		 */
> >> -		if (kvm->arch.vgic.vgic_model != KVM_DEV_TYPE_ARM_VGIC_V2) {
> >> -			ret = -EBUSY;
> >> -			goto out;
> >> -		}
> >> -		mutex_lock(&kvm->lock);
> >> -		ret = vgic_init(kvm);
> >> -		mutex_unlock(&kvm->lock);
> >> +	struct irq_phys_map *map;
> >> +	int ret;
> >>  
> >> -		if (ret)
> >> -			goto out;
> >> -	}
> >> +	ret = vgic_lazy_init(kvm);
> >> +	if (ret)
> >> +		return ret;
> >>  
> >> -	if (irq_num >= min(kvm->arch.vgic.nr_irqs, 1020))
> >> +	map = vgic_irq_map_search(kvm_get_vcpu(kvm, cpuid), irq_num);
> >> +	if (map)
> >>  		return -EINVAL;
> >>  
> >> -	vcpu_id = vgic_update_irq_pending(kvm, cpuid, irq_num, level);
> >> -	if (vcpu_id >= 0) {
> >> -		/* kick the specified vcpu */
> >> -		kvm_vcpu_kick(kvm_get_vcpu(kvm, vcpu_id));
> >> -	}
> >> +	return vgic_update_irq_pending(kvm, cpuid, NULL, irq_num, level);
> >> +}
> >>  
> >> -out:
> >> -	return ret;
> >> +/**
> >> + * kvm_vgic_inject_mapped_irq - Inject a physically mapped IRQ to the vgic
> >> + * @kvm:     The VM structure pointer
> >> + * @cpuid:   The CPU for PPIs
> >> + * @map:     Pointer to a irq_phys_map structure describing the mapping
> >> + * @level:   Edge-triggered:  true:  to trigger the interrupt
> >> + *			      false: to ignore the call
> >> + *	     Level-sensitive  true:  activates an interrupt
> >> + *			      false: deactivates an interrupt
> > 
> > just noticed this unfortunate use of the words 'activate/deactivate'
> > here, which is not true, it just raises/lowers the input signal...
> > 
> 
> I'll clean that up.
> 

Thanks,
-Christoffer

^ permalink raw reply	[flat|nested] 74+ messages in thread

* [PATCH v3 09/11] KVM: arm/arm64: vgic: Prevent userspace injection of a mapped interrupt
@ 2015-08-04 17:38         ` Christoffer Dall
  0 siblings, 0 replies; 74+ messages in thread
From: Christoffer Dall @ 2015-08-04 17:38 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, Aug 04, 2015 at 05:02:41PM +0100, Marc Zyngier wrote:
> On 04/08/15 14:45, Christoffer Dall wrote:
> > On Fri, Jul 24, 2015 at 04:55:07PM +0100, Marc Zyngier wrote:
> >> Virtual interrupts mapped to a HW interrupt should only be triggered
> >> from inside the kernel. Otherwise, you could end up confusing the
> >> kernel (and the GIC's) state machine.
> >>
> >> Rearrange the injection path so that kvm_vgic_inject_irq is
> >> used for non-mapped interrupts, and kvm_vgic_inject_mapped_irq is
> >> used for mapped interrupts. The latter should only be called from
> >> inside the kernel (timer, VFIO).
> >>
> >> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
> >> ---
> >>  include/kvm/arm_vgic.h |  2 +
> >>  virt/kvm/arm/vgic.c    | 99 ++++++++++++++++++++++++++++++++++----------------
> >>  2 files changed, 70 insertions(+), 31 deletions(-)
> >>
> >> diff --git a/include/kvm/arm_vgic.h b/include/kvm/arm_vgic.h
> >> index 7306b4b..f6bfd79 100644
> >> --- a/include/kvm/arm_vgic.h
> >> +++ b/include/kvm/arm_vgic.h
> >> @@ -351,6 +351,8 @@ void kvm_vgic_flush_hwstate(struct kvm_vcpu *vcpu);
> >>  void kvm_vgic_sync_hwstate(struct kvm_vcpu *vcpu);
> >>  int kvm_vgic_inject_irq(struct kvm *kvm, int cpuid, unsigned int irq_num,
> >>  			bool level);
> >> +int kvm_vgic_inject_mapped_irq(struct kvm *kvm, int cpuid,
> >> +			       struct irq_phys_map *map, bool level);
> >>  void vgic_v3_dispatch_sgi(struct kvm_vcpu *vcpu, u64 reg);
> >>  int kvm_vgic_vcpu_pending_irq(struct kvm_vcpu *vcpu);
> >>  int kvm_vgic_vcpu_active_irq(struct kvm_vcpu *vcpu);
> >> diff --git a/virt/kvm/arm/vgic.c b/virt/kvm/arm/vgic.c
> >> index 3f7b690..e40ef70 100644
> >> --- a/virt/kvm/arm/vgic.c
> >> +++ b/virt/kvm/arm/vgic.c
> >> @@ -1533,7 +1533,8 @@ static int vgic_validate_injection(struct kvm_vcpu *vcpu, int irq, int level)
> >>  }
> >>  
> >>  static int vgic_update_irq_pending(struct kvm *kvm, int cpuid,
> >> -				  unsigned int irq_num, bool level)
> >> +				   struct irq_phys_map *map,
> >> +				   unsigned int irq_num, bool level)
> >>  {
> >>  	struct vgic_dist *dist = &kvm->arch.vgic;
> >>  	struct kvm_vcpu *vcpu;
> >> @@ -1541,6 +1542,9 @@ static int vgic_update_irq_pending(struct kvm *kvm, int cpuid,
> >>  	int enabled;
> >>  	bool ret = true, can_inject = true;
> >>  
> >> +	if (irq_num >= min(kvm->arch.vgic.nr_irqs, 1020))
> >> +		return -EINVAL;
> >> +
> >>  	spin_lock(&dist->lock);
> >>  
> >>  	vcpu = kvm_get_vcpu(kvm, cpuid);
> >> @@ -1603,14 +1607,42 @@ static int vgic_update_irq_pending(struct kvm *kvm, int cpuid,
> >>  out:
> >>  	spin_unlock(&dist->lock);
> >>  
> >> -	return ret ? cpuid : -EINVAL;
> >> +	if (!ret) {
> > 
> > don't you mean if (ret) here?  hint: ret is a bool
> 
> Ouch. Nice catch!
> 
> > 
> >> +		/* kick the specified vcpu */
> >> +		kvm_vcpu_kick(kvm_get_vcpu(kvm, cpuid));
> >> +	}
> >> +
> >> +	return 0;
> > 
> > isn't this a change in the internal API?
> > Before, we would return -EINVAL when ret is false.  Not sure if this
> > has any consequences though?
> 
> I don't think this is a change in API. Before this patch, we would
> either return a vcpuid or -EINVAL. But the error would not be propagated
> beyond kvm_vgic_inject_irq, effectively discarding the error code.
> 
> Also, it is a bit odd to return an error because the toggling of the
> line wasn't significant (like bringing the line down on an
> edge-triggered interrupt).
> 

true, indeed, my brain was too fried to think it through.

(why does coming back from vacation always involve paging in weird vgic
stuff for me?)

> > 
> >> +}
> >> +
> >> +static int vgic_lazy_init(struct kvm *kvm)
> >> +{
> >> +	int ret = 0;
> >> +
> >> +	if (unlikely(!vgic_initialized(kvm))) {
> >> +		/*
> >> +		 * We only provide the automatic initialization of the VGIC
> >> +		 * for the legacy case of a GICv2. Any other type must
> >> +		 * be explicitly initialized once setup with the respective
> >> +		 * KVM device call.
> >> +		 */
> >> +		if (kvm->arch.vgic.vgic_model != KVM_DEV_TYPE_ARM_VGIC_V2)
> >> +			return -EBUSY;
> >> +
> >> +		mutex_lock(&kvm->lock);
> >> +		ret = vgic_init(kvm);
> >> +		mutex_unlock(&kvm->lock);
> >> +	}
> >> +
> >> +	return ret;
> >>  }
> >>  
> >>  /**
> >>   * kvm_vgic_inject_irq - Inject an IRQ from a device to the vgic
> >>   * @kvm:     The VM structure pointer
> >>   * @cpuid:   The CPU for PPIs
> >> - * @irq_num: The IRQ number that is assigned to the device
> >> + * @irq_num: The IRQ number that is assigned to the device. This IRQ
> >> + *           must not be mapped to a HW interrupt.
> >>   * @level:   Edge-triggered:  true:  to trigger the interrupt
> >>   *			      false: to ignore the call
> >>   *	     Level-sensitive  true:  activates an interrupt
> >> @@ -1623,39 +1655,44 @@ out:
> >>  int kvm_vgic_inject_irq(struct kvm *kvm, int cpuid, unsigned int irq_num,
> >>  			bool level)
> >>  {
> >> -	int ret = 0;
> >> -	int vcpu_id;
> >> -
> >> -	if (unlikely(!vgic_initialized(kvm))) {
> >> -		/*
> >> -		 * We only provide the automatic initialization of the VGIC
> >> -		 * for the legacy case of a GICv2. Any other type must
> >> -		 * be explicitly initialized once setup with the respective
> >> -		 * KVM device call.
> >> -		 */
> >> -		if (kvm->arch.vgic.vgic_model != KVM_DEV_TYPE_ARM_VGIC_V2) {
> >> -			ret = -EBUSY;
> >> -			goto out;
> >> -		}
> >> -		mutex_lock(&kvm->lock);
> >> -		ret = vgic_init(kvm);
> >> -		mutex_unlock(&kvm->lock);
> >> +	struct irq_phys_map *map;
> >> +	int ret;
> >>  
> >> -		if (ret)
> >> -			goto out;
> >> -	}
> >> +	ret = vgic_lazy_init(kvm);
> >> +	if (ret)
> >> +		return ret;
> >>  
> >> -	if (irq_num >= min(kvm->arch.vgic.nr_irqs, 1020))
> >> +	map = vgic_irq_map_search(kvm_get_vcpu(kvm, cpuid), irq_num);
> >> +	if (map)
> >>  		return -EINVAL;
> >>  
> >> -	vcpu_id = vgic_update_irq_pending(kvm, cpuid, irq_num, level);
> >> -	if (vcpu_id >= 0) {
> >> -		/* kick the specified vcpu */
> >> -		kvm_vcpu_kick(kvm_get_vcpu(kvm, vcpu_id));
> >> -	}
> >> +	return vgic_update_irq_pending(kvm, cpuid, NULL, irq_num, level);
> >> +}
> >>  
> >> -out:
> >> -	return ret;
> >> +/**
> >> + * kvm_vgic_inject_mapped_irq - Inject a physically mapped IRQ to the vgic
> >> + * @kvm:     The VM structure pointer
> >> + * @cpuid:   The CPU for PPIs
> >> + * @map:     Pointer to a irq_phys_map structure describing the mapping
> >> + * @level:   Edge-triggered:  true:  to trigger the interrupt
> >> + *			      false: to ignore the call
> >> + *	     Level-sensitive  true:  activates an interrupt
> >> + *			      false: deactivates an interrupt
> > 
> > just noticed this unfortunate use of the words 'activate/deactivate'
> > here, which is not true, it just raises/lowers the input signal...
> > 
> 
> I'll clean that up.
> 

Thanks,
-Christoffer

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH v3 10/11] KVM: arm/arm64: timer: Allow the timer to control the active state
  2015-08-04 16:14       ` Marc Zyngier
@ 2015-08-04 17:40         ` Christoffer Dall
  -1 siblings, 0 replies; 74+ messages in thread
From: Christoffer Dall @ 2015-08-04 17:40 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: kvm, kvmarm, linux-arm-kernel, Eric Auger, Alex Bennée,
	Andre Przywara

On Tue, Aug 04, 2015 at 05:14:53PM +0100, Marc Zyngier wrote:
> On 04/08/15 14:56, Christoffer Dall wrote:
> > On Fri, Jul 24, 2015 at 04:55:08PM +0100, Marc Zyngier wrote:
> >> In order to remove the crude hack where we sneak the masked bit
> >> into the timer's control register, make use of the phys_irq_map
> >> API control the active state of the interrupt.
> >>
> >> This causes some limited changes to allow for potential error
> >> propagation.
> >>
> >> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
> >> ---
> >>  arch/arm/kvm/reset.c         |  4 +---
> >>  arch/arm64/kvm/reset.c       |  4 +---
> >>  include/kvm/arm_arch_timer.h |  7 +++++--
> >>  virt/kvm/arm/arch_timer.c    | 29 ++++++++++++++++++++++-------
> >>  4 files changed, 29 insertions(+), 15 deletions(-)
> >>
> >> diff --git a/arch/arm/kvm/reset.c b/arch/arm/kvm/reset.c
> >> index f558c07..eeb85858 100644
> >> --- a/arch/arm/kvm/reset.c
> >> +++ b/arch/arm/kvm/reset.c
> >> @@ -77,7 +77,5 @@ int kvm_reset_vcpu(struct kvm_vcpu *vcpu)
> >>  	kvm_reset_coprocs(vcpu);
> >>  
> >>  	/* Reset arch_timer context */
> >> -	kvm_timer_vcpu_reset(vcpu, cpu_vtimer_irq);
> >> -
> >> -	return 0;
> >> +	return kvm_timer_vcpu_reset(vcpu, cpu_vtimer_irq);
> >>  }
> >> diff --git a/arch/arm64/kvm/reset.c b/arch/arm64/kvm/reset.c
> >> index b4af618..91cf535 100644
> >> --- a/arch/arm64/kvm/reset.c
> >> +++ b/arch/arm64/kvm/reset.c
> >> @@ -121,7 +121,5 @@ int kvm_reset_vcpu(struct kvm_vcpu *vcpu)
> >>  	kvm_reset_sys_regs(vcpu);
> >>  
> >>  	/* Reset timer */
> >> -	kvm_timer_vcpu_reset(vcpu, cpu_vtimer_irq);
> >> -
> >> -	return 0;
> >> +	return kvm_timer_vcpu_reset(vcpu, cpu_vtimer_irq);
> >>  }
> >> diff --git a/include/kvm/arm_arch_timer.h b/include/kvm/arm_arch_timer.h
> >> index e596675..e1e4d7c 100644
> >> --- a/include/kvm/arm_arch_timer.h
> >> +++ b/include/kvm/arm_arch_timer.h
> >> @@ -52,13 +52,16 @@ struct arch_timer_cpu {
> >>  
> >>  	/* Timer IRQ */
> >>  	const struct kvm_irq_level	*irq;
> >> +
> >> +	/* VGIC mapping */
> >> +	struct irq_phys_map		*map;
> >>  };
> >>  
> >>  int kvm_timer_hyp_init(void);
> >>  void kvm_timer_enable(struct kvm *kvm);
> >>  void kvm_timer_init(struct kvm *kvm);
> >> -void kvm_timer_vcpu_reset(struct kvm_vcpu *vcpu,
> >> -			  const struct kvm_irq_level *irq);
> >> +int kvm_timer_vcpu_reset(struct kvm_vcpu *vcpu,
> >> +			 const struct kvm_irq_level *irq);
> >>  void kvm_timer_vcpu_init(struct kvm_vcpu *vcpu);
> >>  void kvm_timer_flush_hwstate(struct kvm_vcpu *vcpu);
> >>  void kvm_timer_sync_hwstate(struct kvm_vcpu *vcpu);
> >> diff --git a/virt/kvm/arm/arch_timer.c b/virt/kvm/arm/arch_timer.c
> >> index 98c95f2..76e38d2 100644
> >> --- a/virt/kvm/arm/arch_timer.c
> >> +++ b/virt/kvm/arm/arch_timer.c
> >> @@ -64,10 +64,10 @@ static void kvm_timer_inject_irq(struct kvm_vcpu *vcpu)
> >>  	int ret;
> >>  	struct arch_timer_cpu *timer = &vcpu->arch.timer_cpu;
> >>  
> >> -	timer->cntv_ctl |= ARCH_TIMER_CTRL_IT_MASK;
> >> -	ret = kvm_vgic_inject_irq(vcpu->kvm, vcpu->vcpu_id,
> >> -				  timer->irq->irq,
> >> -				  timer->irq->level);
> >> +	kvm_vgic_set_phys_irq_active(timer->map, true);
> > 
> > actually, I'm not sure I understand the point of setting this boolean?
> > Is it just a check?  It seems everywhere we find the map in the VGIC
> > code, we have a BUG_ON(!map->active), so why doesn't (!!map) imply that
> > it should be marked as active?
> > 
> > Is this again for non-shared devices?
> > 
> 
> Setting the boolean here means that the interrupt should be marked
> active at the distributor level when the interrupt is injected. This
> also allows the timer code to find out when we can inject another
> interrupt (by checking that the interrupt is now inactive).
> 
> As for the BUG_ON(), this is really some debug stuff to make sure we
> have a consistent state.
> 
> Non-shared devices are handled separately (and setting the active state
> doesn't make much sense, as we expect this to be entirely HW driven).
> 
> Do you see anything I should add to make it more clear?
> 

no, I'm good with it now, I think adding more commentary etc. will just
confuse people even more... One simply has to understand this part of
the logic.

> >> +	ret = kvm_vgic_inject_mapped_irq(vcpu->kvm, vcpu->vcpu_id,
> >> +					 timer->map,
> >> +					 timer->irq->level);
> >>  	WARN_ON(ret);
> >>  }
> >>  
> >> @@ -117,7 +117,8 @@ bool kvm_timer_should_fire(struct kvm_vcpu *vcpu)
> >>  	cycle_t cval, now;
> >>  
> >>  	if ((timer->cntv_ctl & ARCH_TIMER_CTRL_IT_MASK) ||
> >> -		!(timer->cntv_ctl & ARCH_TIMER_CTRL_ENABLE))
> >> +	    !(timer->cntv_ctl & ARCH_TIMER_CTRL_ENABLE) ||
> >> +	    kvm_vgic_get_phys_irq_active(timer->map))
> > 
> > hmm, maybe this answers my question above.  It is the virtual active
> > state preventing us to keep injecting the virtual side?
> > 
> 
> Exactly. This is effectively what the MASK state was doing, now
> replicated in SW...
> 

ok, in that case:

Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>

^ permalink raw reply	[flat|nested] 74+ messages in thread

* [PATCH v3 10/11] KVM: arm/arm64: timer: Allow the timer to control the active state
@ 2015-08-04 17:40         ` Christoffer Dall
  0 siblings, 0 replies; 74+ messages in thread
From: Christoffer Dall @ 2015-08-04 17:40 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, Aug 04, 2015 at 05:14:53PM +0100, Marc Zyngier wrote:
> On 04/08/15 14:56, Christoffer Dall wrote:
> > On Fri, Jul 24, 2015 at 04:55:08PM +0100, Marc Zyngier wrote:
> >> In order to remove the crude hack where we sneak the masked bit
> >> into the timer's control register, make use of the phys_irq_map
> >> API control the active state of the interrupt.
> >>
> >> This causes some limited changes to allow for potential error
> >> propagation.
> >>
> >> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
> >> ---
> >>  arch/arm/kvm/reset.c         |  4 +---
> >>  arch/arm64/kvm/reset.c       |  4 +---
> >>  include/kvm/arm_arch_timer.h |  7 +++++--
> >>  virt/kvm/arm/arch_timer.c    | 29 ++++++++++++++++++++++-------
> >>  4 files changed, 29 insertions(+), 15 deletions(-)
> >>
> >> diff --git a/arch/arm/kvm/reset.c b/arch/arm/kvm/reset.c
> >> index f558c07..eeb85858 100644
> >> --- a/arch/arm/kvm/reset.c
> >> +++ b/arch/arm/kvm/reset.c
> >> @@ -77,7 +77,5 @@ int kvm_reset_vcpu(struct kvm_vcpu *vcpu)
> >>  	kvm_reset_coprocs(vcpu);
> >>  
> >>  	/* Reset arch_timer context */
> >> -	kvm_timer_vcpu_reset(vcpu, cpu_vtimer_irq);
> >> -
> >> -	return 0;
> >> +	return kvm_timer_vcpu_reset(vcpu, cpu_vtimer_irq);
> >>  }
> >> diff --git a/arch/arm64/kvm/reset.c b/arch/arm64/kvm/reset.c
> >> index b4af618..91cf535 100644
> >> --- a/arch/arm64/kvm/reset.c
> >> +++ b/arch/arm64/kvm/reset.c
> >> @@ -121,7 +121,5 @@ int kvm_reset_vcpu(struct kvm_vcpu *vcpu)
> >>  	kvm_reset_sys_regs(vcpu);
> >>  
> >>  	/* Reset timer */
> >> -	kvm_timer_vcpu_reset(vcpu, cpu_vtimer_irq);
> >> -
> >> -	return 0;
> >> +	return kvm_timer_vcpu_reset(vcpu, cpu_vtimer_irq);
> >>  }
> >> diff --git a/include/kvm/arm_arch_timer.h b/include/kvm/arm_arch_timer.h
> >> index e596675..e1e4d7c 100644
> >> --- a/include/kvm/arm_arch_timer.h
> >> +++ b/include/kvm/arm_arch_timer.h
> >> @@ -52,13 +52,16 @@ struct arch_timer_cpu {
> >>  
> >>  	/* Timer IRQ */
> >>  	const struct kvm_irq_level	*irq;
> >> +
> >> +	/* VGIC mapping */
> >> +	struct irq_phys_map		*map;
> >>  };
> >>  
> >>  int kvm_timer_hyp_init(void);
> >>  void kvm_timer_enable(struct kvm *kvm);
> >>  void kvm_timer_init(struct kvm *kvm);
> >> -void kvm_timer_vcpu_reset(struct kvm_vcpu *vcpu,
> >> -			  const struct kvm_irq_level *irq);
> >> +int kvm_timer_vcpu_reset(struct kvm_vcpu *vcpu,
> >> +			 const struct kvm_irq_level *irq);
> >>  void kvm_timer_vcpu_init(struct kvm_vcpu *vcpu);
> >>  void kvm_timer_flush_hwstate(struct kvm_vcpu *vcpu);
> >>  void kvm_timer_sync_hwstate(struct kvm_vcpu *vcpu);
> >> diff --git a/virt/kvm/arm/arch_timer.c b/virt/kvm/arm/arch_timer.c
> >> index 98c95f2..76e38d2 100644
> >> --- a/virt/kvm/arm/arch_timer.c
> >> +++ b/virt/kvm/arm/arch_timer.c
> >> @@ -64,10 +64,10 @@ static void kvm_timer_inject_irq(struct kvm_vcpu *vcpu)
> >>  	int ret;
> >>  	struct arch_timer_cpu *timer = &vcpu->arch.timer_cpu;
> >>  
> >> -	timer->cntv_ctl |= ARCH_TIMER_CTRL_IT_MASK;
> >> -	ret = kvm_vgic_inject_irq(vcpu->kvm, vcpu->vcpu_id,
> >> -				  timer->irq->irq,
> >> -				  timer->irq->level);
> >> +	kvm_vgic_set_phys_irq_active(timer->map, true);
> > 
> > actually, I'm not sure I understand the point of setting this boolean?
> > Is it just a check?  It seems everywhere we find the map in the VGIC
> > code, we have a BUG_ON(!map->active), so why doesn't (!!map) imply that
> > it should be marked as active?
> > 
> > Is this again for non-shared devices?
> > 
> 
> Setting the boolean here means that the interrupt should be marked
> active at the distributor level when the interrupt is injected. This
> also allows the timer code to find out when we can inject another
> interrupt (by checking that the interrupt is now inactive).
> 
> As for the BUG_ON(), this is really some debug stuff to make sure we
> have a consistent state.
> 
> Non-shared devices are handled separately (and setting the active state
> doesn't make much sense, as we expect this to be entirely HW driven).
> 
> Do you see anything I should add to make it more clear?
> 

no, I'm good with it now, I think adding more commentary etc. will just
confuse people even more... One simply has to understand this part of
the logic.

> >> +	ret = kvm_vgic_inject_mapped_irq(vcpu->kvm, vcpu->vcpu_id,
> >> +					 timer->map,
> >> +					 timer->irq->level);
> >>  	WARN_ON(ret);
> >>  }
> >>  
> >> @@ -117,7 +117,8 @@ bool kvm_timer_should_fire(struct kvm_vcpu *vcpu)
> >>  	cycle_t cval, now;
> >>  
> >>  	if ((timer->cntv_ctl & ARCH_TIMER_CTRL_IT_MASK) ||
> >> -		!(timer->cntv_ctl & ARCH_TIMER_CTRL_ENABLE))
> >> +	    !(timer->cntv_ctl & ARCH_TIMER_CTRL_ENABLE) ||
> >> +	    kvm_vgic_get_phys_irq_active(timer->map))
> > 
> > hmm, maybe this answers my question above.  It is the virtual active
> > state preventing us to keep injecting the virtual side?
> > 
> 
> Exactly. This is effectively what the MASK state was doing, now
> replicated in SW...
> 

ok, in that case:

Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH v3 11/11] KVM: arm/arm64: vgic: Allow HW interrupts for non-shared devices
  2015-08-04 17:08       ` Marc Zyngier
@ 2015-08-04 18:07         ` Christoffer Dall
  -1 siblings, 0 replies; 74+ messages in thread
From: Christoffer Dall @ 2015-08-04 18:07 UTC (permalink / raw)
  To: Marc Zyngier; +Cc: kvm, Andre Przywara, kvmarm, linux-arm-kernel

On Tue, Aug 04, 2015 at 06:08:53PM +0100, Marc Zyngier wrote:
> On 04/08/15 15:32, Christoffer Dall wrote:
> > On Fri, Jul 24, 2015 at 04:55:09PM +0100, Marc Zyngier wrote:
> >> So far, the only use of the HW interrupt facility is the timer,
> >> implying that the active state is context-switched for each vcpu,
> >> as the device is is shared across all vcpus.
> >>
> >> This does not work for a device that has been assigned to a VM,
> >> as the guest is entierely in control of that device (the HW is
> >> not shared). In that case, it makes sense to bypass the whole
> >> active state switching, and only track the deactivation of the
> >> interrupt.
> >>
> >> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
> >> ---
> >>  include/kvm/arm_vgic.h    |  6 ++--
> >>  virt/kvm/arm/arch_timer.c |  3 +-
> >>  virt/kvm/arm/vgic.c       | 73 +++++++++++++++++++++++++++++++++++++----------
> >>  3 files changed, 64 insertions(+), 18 deletions(-)
> >>
> >> diff --git a/include/kvm/arm_vgic.h b/include/kvm/arm_vgic.h
> >> index f6bfd79..6f0a4e1 100644
> >> --- a/include/kvm/arm_vgic.h
> >> +++ b/include/kvm/arm_vgic.h
> >> @@ -164,7 +164,8 @@ struct irq_phys_map {
> >>  	u32			phys_irq;
> >>  	u32			irq;
> >>  	bool			deleted;
> >> -	bool			active;
> >> +	bool			shared;
> >> +	bool			active; /* Only valid if shared */
> >>  };
> >>  
> >>  struct irq_phys_map_entry {
> >> @@ -357,7 +358,8 @@ void vgic_v3_dispatch_sgi(struct kvm_vcpu *vcpu, u64 reg);
> >>  int kvm_vgic_vcpu_pending_irq(struct kvm_vcpu *vcpu);
> >>  int kvm_vgic_vcpu_active_irq(struct kvm_vcpu *vcpu);
> >>  struct irq_phys_map *kvm_vgic_map_phys_irq(struct kvm_vcpu *vcpu,
> >> -					   int virt_irq, int irq);
> >> +					   int virt_irq, int irq,
> >> +					   bool shared);
> >>  int kvm_vgic_unmap_phys_irq(struct kvm_vcpu *vcpu, struct irq_phys_map *map);
> >>  bool kvm_vgic_get_phys_irq_active(struct irq_phys_map *map);
> >>  void kvm_vgic_set_phys_irq_active(struct irq_phys_map *map, bool active);
> >> diff --git a/virt/kvm/arm/arch_timer.c b/virt/kvm/arm/arch_timer.c
> >> index 76e38d2..db21d8f 100644
> >> --- a/virt/kvm/arm/arch_timer.c
> >> +++ b/virt/kvm/arm/arch_timer.c
> >> @@ -203,7 +203,8 @@ int kvm_timer_vcpu_reset(struct kvm_vcpu *vcpu,
> >>  	 * Tell the VGIC that the virtual interrupt is tied to a
> >>  	 * physical interrupt. We do that once per VCPU.
> >>  	 */
> >> -	map = kvm_vgic_map_phys_irq(vcpu, irq->irq, host_vtimer_irq);
> >> +	map = kvm_vgic_map_phys_irq(vcpu, irq->irq,
> >> +				    host_vtimer_irq, true);
> >>  	if (WARN_ON(IS_ERR(map)))
> >>  		return PTR_ERR(map);
> >>  
> >> diff --git a/virt/kvm/arm/vgic.c b/virt/kvm/arm/vgic.c
> >> index e40ef70..5e6b816 100644
> >> --- a/virt/kvm/arm/vgic.c
> >> +++ b/virt/kvm/arm/vgic.c
> >> @@ -1128,19 +1128,25 @@ static void vgic_queue_irq_to_lr(struct kvm_vcpu *vcpu, int irq,
> >>  		 * active in the physical world. Otherwise the
> >>  		 * physical interrupt will fire and the guest will
> >>  		 * exit before processing the virtual interrupt.
> >> +		 *
> >> +		 * This is of course only valid for a shared
> >> +		 * interrupt. A non shared interrupt should already be
> >> +		 * active.
> >>  		 */
> >>  		if (map) {
> >> -			int ret;
> >> -
> >> -			BUG_ON(!map->active);
> >>  			vlr.hwirq = map->phys_irq;
> >>  			vlr.state |= LR_HW;
> >>  			vlr.state &= ~LR_EOI_INT;
> >>  
> >> -			ret = irq_set_irqchip_state(map->irq,
> >> -						    IRQCHIP_STATE_ACTIVE,
> >> -						    true);
> >> -			WARN_ON(ret);
> >> +			if (map->shared) {
> >> +				int ret;
> >> +
> >> +				BUG_ON(!map->active);
> >> +				ret = irq_set_irqchip_state(map->irq,
> >> +							    IRQCHIP_STATE_ACTIVE,
> >> +							    true);
> >> +				WARN_ON(ret);
> >> +			}
> >>  
> >>  			/*
> >>  			 * Make sure we're not going to sample this
> >> @@ -1383,21 +1389,41 @@ static bool vgic_process_maintenance(struct kvm_vcpu *vcpu)
> >>  static int vgic_sync_hwirq(struct kvm_vcpu *vcpu, struct vgic_lr vlr)
> >>  {
> >>  	struct irq_phys_map *map;
> >> +	bool active;
> >>  	int ret;
> >>  
> >>  	if (!(vlr.state & LR_HW))
> >>  		return 0;
> >>  
> >>  	map = vgic_irq_map_search(vcpu, vlr.irq);
> >> -	BUG_ON(!map || !map->active);
> >> +	BUG_ON(!map);
> >> +	BUG_ON(map->shared && !map->active);
> >>  
> >>  	ret = irq_get_irqchip_state(map->irq,
> >>  				    IRQCHIP_STATE_ACTIVE,
> >> -				    &map->active);
> >> +				    &active);
> >>  
> >>  	WARN_ON(ret);
> >>  
> >> -	if (map->active) {
> >> +	/*
> >> +	 * For a non-shared interrupt, we have to cater for two
> >> +	 * possible deactivation conditions:
> >> +	 *
> >> +	 * - the physical interrupt is now inactive (EOIed from the
> >> +         *   guest)
> > 
> > nit: whitespace funkyness
> > 
> >> +	 * - the physical interrupt is still active, but its virtual
> >> +	 *   counterpart is flagged as "not queued", indicating another
> >> +	 *   interrupt has fired between the EOI and the guest exit.
> >> +	 *
> >> +	 * Also, we are not reactivating a non-shared interrupt
> > 
> > what does reactivating mean? did you mean deactivate?
> 
> Deactivate indeed.
> 
> >> +	 * ourselves. This is always left to the guest.
> > 
> > In which case, add ", because the device is solely owned by the guest."
> 
> Sure.
> 
> >> +	 */
> >> +	if (!map->shared)
> >> +		return !active || !vgic_irq_is_queued(vcpu, vlr.irq);
> > 
> > do you really need the second part of the disjunction?
> > 
> > The effect seems to be that we clear the queued flag once again, and
> > clear the LR.  If we don't do this, won't we simply pick up the pending
> > flag next time we're about to run this VCPU and piggy-back on the
> > existing LR.  Perhaps this is a weird flow though?
> 
> Crucially, we free the LR in this case (the set_bit on elrsr_ptr). If we
> don't do this, we're indeed going to schedule the vcpu (it has something
> to process), but we never allow piggy-backing on level interrupts. We'd
> need some special hack to handle this.
> 
> I definitely feel more comfortable reporting that the interrupt has been
> deactivated (which is the case), and let the normal flow pick up the
> next injected interrupt.
> 
Yes, you're right, it's much better this way.

I assume you'll still wait with this stuff until the priority drop / EOI
stuff is in, so I'll do a formal review again once all the dependencies
are there.

But this looks good to me.

-Christoffer

^ permalink raw reply	[flat|nested] 74+ messages in thread

* [PATCH v3 11/11] KVM: arm/arm64: vgic: Allow HW interrupts for non-shared devices
@ 2015-08-04 18:07         ` Christoffer Dall
  0 siblings, 0 replies; 74+ messages in thread
From: Christoffer Dall @ 2015-08-04 18:07 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, Aug 04, 2015 at 06:08:53PM +0100, Marc Zyngier wrote:
> On 04/08/15 15:32, Christoffer Dall wrote:
> > On Fri, Jul 24, 2015 at 04:55:09PM +0100, Marc Zyngier wrote:
> >> So far, the only use of the HW interrupt facility is the timer,
> >> implying that the active state is context-switched for each vcpu,
> >> as the device is is shared across all vcpus.
> >>
> >> This does not work for a device that has been assigned to a VM,
> >> as the guest is entierely in control of that device (the HW is
> >> not shared). In that case, it makes sense to bypass the whole
> >> active state switching, and only track the deactivation of the
> >> interrupt.
> >>
> >> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
> >> ---
> >>  include/kvm/arm_vgic.h    |  6 ++--
> >>  virt/kvm/arm/arch_timer.c |  3 +-
> >>  virt/kvm/arm/vgic.c       | 73 +++++++++++++++++++++++++++++++++++++----------
> >>  3 files changed, 64 insertions(+), 18 deletions(-)
> >>
> >> diff --git a/include/kvm/arm_vgic.h b/include/kvm/arm_vgic.h
> >> index f6bfd79..6f0a4e1 100644
> >> --- a/include/kvm/arm_vgic.h
> >> +++ b/include/kvm/arm_vgic.h
> >> @@ -164,7 +164,8 @@ struct irq_phys_map {
> >>  	u32			phys_irq;
> >>  	u32			irq;
> >>  	bool			deleted;
> >> -	bool			active;
> >> +	bool			shared;
> >> +	bool			active; /* Only valid if shared */
> >>  };
> >>  
> >>  struct irq_phys_map_entry {
> >> @@ -357,7 +358,8 @@ void vgic_v3_dispatch_sgi(struct kvm_vcpu *vcpu, u64 reg);
> >>  int kvm_vgic_vcpu_pending_irq(struct kvm_vcpu *vcpu);
> >>  int kvm_vgic_vcpu_active_irq(struct kvm_vcpu *vcpu);
> >>  struct irq_phys_map *kvm_vgic_map_phys_irq(struct kvm_vcpu *vcpu,
> >> -					   int virt_irq, int irq);
> >> +					   int virt_irq, int irq,
> >> +					   bool shared);
> >>  int kvm_vgic_unmap_phys_irq(struct kvm_vcpu *vcpu, struct irq_phys_map *map);
> >>  bool kvm_vgic_get_phys_irq_active(struct irq_phys_map *map);
> >>  void kvm_vgic_set_phys_irq_active(struct irq_phys_map *map, bool active);
> >> diff --git a/virt/kvm/arm/arch_timer.c b/virt/kvm/arm/arch_timer.c
> >> index 76e38d2..db21d8f 100644
> >> --- a/virt/kvm/arm/arch_timer.c
> >> +++ b/virt/kvm/arm/arch_timer.c
> >> @@ -203,7 +203,8 @@ int kvm_timer_vcpu_reset(struct kvm_vcpu *vcpu,
> >>  	 * Tell the VGIC that the virtual interrupt is tied to a
> >>  	 * physical interrupt. We do that once per VCPU.
> >>  	 */
> >> -	map = kvm_vgic_map_phys_irq(vcpu, irq->irq, host_vtimer_irq);
> >> +	map = kvm_vgic_map_phys_irq(vcpu, irq->irq,
> >> +				    host_vtimer_irq, true);
> >>  	if (WARN_ON(IS_ERR(map)))
> >>  		return PTR_ERR(map);
> >>  
> >> diff --git a/virt/kvm/arm/vgic.c b/virt/kvm/arm/vgic.c
> >> index e40ef70..5e6b816 100644
> >> --- a/virt/kvm/arm/vgic.c
> >> +++ b/virt/kvm/arm/vgic.c
> >> @@ -1128,19 +1128,25 @@ static void vgic_queue_irq_to_lr(struct kvm_vcpu *vcpu, int irq,
> >>  		 * active in the physical world. Otherwise the
> >>  		 * physical interrupt will fire and the guest will
> >>  		 * exit before processing the virtual interrupt.
> >> +		 *
> >> +		 * This is of course only valid for a shared
> >> +		 * interrupt. A non shared interrupt should already be
> >> +		 * active.
> >>  		 */
> >>  		if (map) {
> >> -			int ret;
> >> -
> >> -			BUG_ON(!map->active);
> >>  			vlr.hwirq = map->phys_irq;
> >>  			vlr.state |= LR_HW;
> >>  			vlr.state &= ~LR_EOI_INT;
> >>  
> >> -			ret = irq_set_irqchip_state(map->irq,
> >> -						    IRQCHIP_STATE_ACTIVE,
> >> -						    true);
> >> -			WARN_ON(ret);
> >> +			if (map->shared) {
> >> +				int ret;
> >> +
> >> +				BUG_ON(!map->active);
> >> +				ret = irq_set_irqchip_state(map->irq,
> >> +							    IRQCHIP_STATE_ACTIVE,
> >> +							    true);
> >> +				WARN_ON(ret);
> >> +			}
> >>  
> >>  			/*
> >>  			 * Make sure we're not going to sample this
> >> @@ -1383,21 +1389,41 @@ static bool vgic_process_maintenance(struct kvm_vcpu *vcpu)
> >>  static int vgic_sync_hwirq(struct kvm_vcpu *vcpu, struct vgic_lr vlr)
> >>  {
> >>  	struct irq_phys_map *map;
> >> +	bool active;
> >>  	int ret;
> >>  
> >>  	if (!(vlr.state & LR_HW))
> >>  		return 0;
> >>  
> >>  	map = vgic_irq_map_search(vcpu, vlr.irq);
> >> -	BUG_ON(!map || !map->active);
> >> +	BUG_ON(!map);
> >> +	BUG_ON(map->shared && !map->active);
> >>  
> >>  	ret = irq_get_irqchip_state(map->irq,
> >>  				    IRQCHIP_STATE_ACTIVE,
> >> -				    &map->active);
> >> +				    &active);
> >>  
> >>  	WARN_ON(ret);
> >>  
> >> -	if (map->active) {
> >> +	/*
> >> +	 * For a non-shared interrupt, we have to cater for two
> >> +	 * possible deactivation conditions:
> >> +	 *
> >> +	 * - the physical interrupt is now inactive (EOIed from the
> >> +         *   guest)
> > 
> > nit: whitespace funkyness
> > 
> >> +	 * - the physical interrupt is still active, but its virtual
> >> +	 *   counterpart is flagged as "not queued", indicating another
> >> +	 *   interrupt has fired between the EOI and the guest exit.
> >> +	 *
> >> +	 * Also, we are not reactivating a non-shared interrupt
> > 
> > what does reactivating mean? did you mean deactivate?
> 
> Deactivate indeed.
> 
> >> +	 * ourselves. This is always left to the guest.
> > 
> > In which case, add ", because the device is solely owned by the guest."
> 
> Sure.
> 
> >> +	 */
> >> +	if (!map->shared)
> >> +		return !active || !vgic_irq_is_queued(vcpu, vlr.irq);
> > 
> > do you really need the second part of the disjunction?
> > 
> > The effect seems to be that we clear the queued flag once again, and
> > clear the LR.  If we don't do this, won't we simply pick up the pending
> > flag next time we're about to run this VCPU and piggy-back on the
> > existing LR.  Perhaps this is a weird flow though?
> 
> Crucially, we free the LR in this case (the set_bit on elrsr_ptr). If we
> don't do this, we're indeed going to schedule the vcpu (it has something
> to process), but we never allow piggy-backing on level interrupts. We'd
> need some special hack to handle this.
> 
> I definitely feel more comfortable reporting that the interrupt has been
> deactivated (which is the case), and let the normal flow pick up the
> next injected interrupt.
> 
Yes, you're right, it's much better this way.

I assume you'll still wait with this stuff until the priority drop / EOI
stuff is in, so I'll do a formal review again once all the dependencies
are there.

But this looks good to me.

-Christoffer

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH v3 06/11] KVM: arm/arm64: vgic: Allow dynamic mapping of physical/virtual interrupts
  2015-08-04 17:36         ` Christoffer Dall
@ 2015-08-04 18:10           ` Marc Zyngier
  -1 siblings, 0 replies; 74+ messages in thread
From: Marc Zyngier @ 2015-08-04 18:10 UTC (permalink / raw)
  To: Christoffer Dall
  Cc: kvm, kvmarm, linux-arm-kernel, Eric Auger, Alex Bennée,
	Andre Przywara

On 04/08/15 18:36, Christoffer Dall wrote:
> On Tue, Aug 04, 2015 at 04:27:03PM +0100, Marc Zyngier wrote:
>> On 04/08/15 14:04, Christoffer Dall wrote:
>>> On Fri, Jul 24, 2015 at 04:55:04PM +0100, Marc Zyngier wrote:
>>>> In order to be able to feed physical interrupts to a guest, we need
>>>> to be able to establish the virtual-physical mapping between the two
>>>> worlds.
>>>>
>>>> The mappings are kept in a set of RCU lists, indexed by virtual interrupts.
>>>>
>>>> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
>>>> ---
>>>>  arch/arm/kvm/arm.c     |   2 +
>>>>  include/kvm/arm_vgic.h |  26 +++++++
>>>>  virt/kvm/arm/vgic.c    | 189 ++++++++++++++++++++++++++++++++++++++++++++++++-
>>>>  3 files changed, 216 insertions(+), 1 deletion(-)
>>>>

[...]

>>>> +/**
>>>> + * kvm_vgic_unmap_phys_irq - Remove a virtual to physical IRQ mapping
>>>> + * @vcpu: The VCPU pointer
>>>> + * @map: The pointer to a mapping obtained through kvm_vgic_map_phys_irq
>>>> + *
>>>> + * Remove an existing mapping between virtual and physical interrupts.
>>>> + */
>>>> +int kvm_vgic_unmap_phys_irq(struct kvm_vcpu *vcpu, struct irq_phys_map *map)
>>>> +{
>>>> +     struct vgic_dist *dist = &vcpu->kvm->arch.vgic;
>>>> +     struct irq_phys_map_entry *entry;
>>>> +     struct list_head *root;
>>>> +
>>>> +     if (!map)
>>>> +             return -EINVAL;
>>>> +
>>>> +     root = vgic_get_irq_phys_map_list(vcpu, map->virt_irq);
>>>> +
>>>> +     spin_lock(&dist->irq_phys_map_lock);
>>>> +
>>>> +     list_for_each_entry(entry, root, entry) {
>>>> +             if (&entry->map == map && !map->deleted) {
>>>> +                     map->deleted = true;
>>>
>>> why do you need the 'deleted' flag?  You now search the list only after
>>> holding the lock, so the race I pointed out before doesn't exist
>>> anymore.  Is there an additional issue that the deleted flag is taking
>>> care of?
>>
>> This could still race with a destroy occuring before we take the lock,
>> and before we get get RCU to do the cleanup (the list would still be
>> populated).
>>
> 
> That's not how I understand list_del_rcu; I think it deletes the entry
> immediately with the right memory barriers.  It is only the free
> operation that happens on rcu sync and can be/is deferred.

Indeed. I stupidly though the two were deferred, but reading the code
definitely confirmed your point. I'll get rid of the delete stuff.

Thanks,

	M.
-- 
Jazz is not dead. It just smells funny...

^ permalink raw reply	[flat|nested] 74+ messages in thread

* [PATCH v3 06/11] KVM: arm/arm64: vgic: Allow dynamic mapping of physical/virtual interrupts
@ 2015-08-04 18:10           ` Marc Zyngier
  0 siblings, 0 replies; 74+ messages in thread
From: Marc Zyngier @ 2015-08-04 18:10 UTC (permalink / raw)
  To: linux-arm-kernel

On 04/08/15 18:36, Christoffer Dall wrote:
> On Tue, Aug 04, 2015 at 04:27:03PM +0100, Marc Zyngier wrote:
>> On 04/08/15 14:04, Christoffer Dall wrote:
>>> On Fri, Jul 24, 2015 at 04:55:04PM +0100, Marc Zyngier wrote:
>>>> In order to be able to feed physical interrupts to a guest, we need
>>>> to be able to establish the virtual-physical mapping between the two
>>>> worlds.
>>>>
>>>> The mappings are kept in a set of RCU lists, indexed by virtual interrupts.
>>>>
>>>> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
>>>> ---
>>>>  arch/arm/kvm/arm.c     |   2 +
>>>>  include/kvm/arm_vgic.h |  26 +++++++
>>>>  virt/kvm/arm/vgic.c    | 189 ++++++++++++++++++++++++++++++++++++++++++++++++-
>>>>  3 files changed, 216 insertions(+), 1 deletion(-)
>>>>

[...]

>>>> +/**
>>>> + * kvm_vgic_unmap_phys_irq - Remove a virtual to physical IRQ mapping
>>>> + * @vcpu: The VCPU pointer
>>>> + * @map: The pointer to a mapping obtained through kvm_vgic_map_phys_irq
>>>> + *
>>>> + * Remove an existing mapping between virtual and physical interrupts.
>>>> + */
>>>> +int kvm_vgic_unmap_phys_irq(struct kvm_vcpu *vcpu, struct irq_phys_map *map)
>>>> +{
>>>> +     struct vgic_dist *dist = &vcpu->kvm->arch.vgic;
>>>> +     struct irq_phys_map_entry *entry;
>>>> +     struct list_head *root;
>>>> +
>>>> +     if (!map)
>>>> +             return -EINVAL;
>>>> +
>>>> +     root = vgic_get_irq_phys_map_list(vcpu, map->virt_irq);
>>>> +
>>>> +     spin_lock(&dist->irq_phys_map_lock);
>>>> +
>>>> +     list_for_each_entry(entry, root, entry) {
>>>> +             if (&entry->map == map && !map->deleted) {
>>>> +                     map->deleted = true;
>>>
>>> why do you need the 'deleted' flag?  You now search the list only after
>>> holding the lock, so the race I pointed out before doesn't exist
>>> anymore.  Is there an additional issue that the deleted flag is taking
>>> care of?
>>
>> This could still race with a destroy occuring before we take the lock,
>> and before we get get RCU to do the cleanup (the list would still be
>> populated).
>>
> 
> That's not how I understand list_del_rcu; I think it deletes the entry
> immediately with the right memory barriers.  It is only the free
> operation that happens on rcu sync and can be/is deferred.

Indeed. I stupidly though the two were deferred, but reading the code
definitely confirmed your point. I'll get rid of the delete stuff.

Thanks,

	M.
-- 
Jazz is not dead. It just smells funny...

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH v3 09/11] KVM: arm/arm64: vgic: Prevent userspace injection of a mapped interrupt
  2015-08-04 16:44       ` Marc Zyngier
@ 2015-08-05  7:32         ` Eric Auger
  -1 siblings, 0 replies; 74+ messages in thread
From: Eric Auger @ 2015-08-05  7:32 UTC (permalink / raw)
  To: Marc Zyngier, kvm, kvmarm, linux-arm-kernel
  Cc: Christoffer Dall, Alex Bennée, Andre Przywara

Hi Marc,
On 08/04/2015 06:44 PM, Marc Zyngier wrote:
> On 04/08/15 17:21, Eric Auger wrote:
>> Hi Marc,
>> On 07/24/2015 05:55 PM, Marc Zyngier wrote:
>>> Virtual interrupts mapped to a HW interrupt should only be triggered
>>> from inside the kernel. Otherwise, you could end up confusing the
>>> kernel (and the GIC's) state machine.
>>>
>>> Rearrange the injection path so that kvm_vgic_inject_irq is
>>> used for non-mapped interrupts, and kvm_vgic_inject_mapped_irq is
>>> used for mapped interrupts. The latter should only be called from
>>> inside the kernel (timer, VFIO).
>> nit: I would replace VFIO by irqfd.
>> VFIO just triggers the eventfd/irqfd. This is KVM/irqfd that injects the
>> virtual irq upon the irqfd signaling and he irqfd adaptation/ARM
>> currently is implemented in vgic.c
> 
> Ah, thanks for reminding me of the right terminology, I tend to think of
> it as one big bag of nasty tricks... ;-)
> 
> I'll update the commit message.
> 
>>>
>>> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
>>> ---
>>>  include/kvm/arm_vgic.h |  2 +
>>>  virt/kvm/arm/vgic.c    | 99 ++++++++++++++++++++++++++++++++++----------------
>>>  2 files changed, 70 insertions(+), 31 deletions(-)
>>>
>>> diff --git a/include/kvm/arm_vgic.h b/include/kvm/arm_vgic.h
>>> index 7306b4b..f6bfd79 100644
>>> --- a/include/kvm/arm_vgic.h
>>> +++ b/include/kvm/arm_vgic.h
>>> @@ -351,6 +351,8 @@ void kvm_vgic_flush_hwstate(struct kvm_vcpu *vcpu);
>>>  void kvm_vgic_sync_hwstate(struct kvm_vcpu *vcpu);
>>>  int kvm_vgic_inject_irq(struct kvm *kvm, int cpuid, unsigned int irq_num,
>>>  			bool level);
>>> +int kvm_vgic_inject_mapped_irq(struct kvm *kvm, int cpuid,
>>> +			       struct irq_phys_map *map, bool level);
>>>  void vgic_v3_dispatch_sgi(struct kvm_vcpu *vcpu, u64 reg);
>>>  int kvm_vgic_vcpu_pending_irq(struct kvm_vcpu *vcpu);
>>>  int kvm_vgic_vcpu_active_irq(struct kvm_vcpu *vcpu);
>>> diff --git a/virt/kvm/arm/vgic.c b/virt/kvm/arm/vgic.c
>>> index 3f7b690..e40ef70 100644
>>> --- a/virt/kvm/arm/vgic.c
>>> +++ b/virt/kvm/arm/vgic.c
>>> @@ -1533,7 +1533,8 @@ static int vgic_validate_injection(struct kvm_vcpu *vcpu, int irq, int level)
>>>  }
>>>  
>>>  static int vgic_update_irq_pending(struct kvm *kvm, int cpuid,
>>> -				  unsigned int irq_num, bool level)
>>> +				   struct irq_phys_map *map,
>>> +				   unsigned int irq_num, bool level)
>>>  {
>> In vgic_update_irq_pending, I needed to modify the following line and
>> add the "&& !map".
>>
>>         if (!vgic_validate_injection(vcpu, irq_num, level) && !map) {
>>
>> Without that, the level being not properly modeled for level sensitive
>> forwarded IRQs, the 2d injection fails.
> 
> Ah! Is that because we never see the line being reset to zero, and the
> VGIC still sees the line as pending at the distributor level?
yes indeed

Eric
> 
> Thanks,
> 
> 	M.
> 


^ permalink raw reply	[flat|nested] 74+ messages in thread

* [PATCH v3 09/11] KVM: arm/arm64: vgic: Prevent userspace injection of a mapped interrupt
@ 2015-08-05  7:32         ` Eric Auger
  0 siblings, 0 replies; 74+ messages in thread
From: Eric Auger @ 2015-08-05  7:32 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Marc,
On 08/04/2015 06:44 PM, Marc Zyngier wrote:
> On 04/08/15 17:21, Eric Auger wrote:
>> Hi Marc,
>> On 07/24/2015 05:55 PM, Marc Zyngier wrote:
>>> Virtual interrupts mapped to a HW interrupt should only be triggered
>>> from inside the kernel. Otherwise, you could end up confusing the
>>> kernel (and the GIC's) state machine.
>>>
>>> Rearrange the injection path so that kvm_vgic_inject_irq is
>>> used for non-mapped interrupts, and kvm_vgic_inject_mapped_irq is
>>> used for mapped interrupts. The latter should only be called from
>>> inside the kernel (timer, VFIO).
>> nit: I would replace VFIO by irqfd.
>> VFIO just triggers the eventfd/irqfd. This is KVM/irqfd that injects the
>> virtual irq upon the irqfd signaling and he irqfd adaptation/ARM
>> currently is implemented in vgic.c
> 
> Ah, thanks for reminding me of the right terminology, I tend to think of
> it as one big bag of nasty tricks... ;-)
> 
> I'll update the commit message.
> 
>>>
>>> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
>>> ---
>>>  include/kvm/arm_vgic.h |  2 +
>>>  virt/kvm/arm/vgic.c    | 99 ++++++++++++++++++++++++++++++++++----------------
>>>  2 files changed, 70 insertions(+), 31 deletions(-)
>>>
>>> diff --git a/include/kvm/arm_vgic.h b/include/kvm/arm_vgic.h
>>> index 7306b4b..f6bfd79 100644
>>> --- a/include/kvm/arm_vgic.h
>>> +++ b/include/kvm/arm_vgic.h
>>> @@ -351,6 +351,8 @@ void kvm_vgic_flush_hwstate(struct kvm_vcpu *vcpu);
>>>  void kvm_vgic_sync_hwstate(struct kvm_vcpu *vcpu);
>>>  int kvm_vgic_inject_irq(struct kvm *kvm, int cpuid, unsigned int irq_num,
>>>  			bool level);
>>> +int kvm_vgic_inject_mapped_irq(struct kvm *kvm, int cpuid,
>>> +			       struct irq_phys_map *map, bool level);
>>>  void vgic_v3_dispatch_sgi(struct kvm_vcpu *vcpu, u64 reg);
>>>  int kvm_vgic_vcpu_pending_irq(struct kvm_vcpu *vcpu);
>>>  int kvm_vgic_vcpu_active_irq(struct kvm_vcpu *vcpu);
>>> diff --git a/virt/kvm/arm/vgic.c b/virt/kvm/arm/vgic.c
>>> index 3f7b690..e40ef70 100644
>>> --- a/virt/kvm/arm/vgic.c
>>> +++ b/virt/kvm/arm/vgic.c
>>> @@ -1533,7 +1533,8 @@ static int vgic_validate_injection(struct kvm_vcpu *vcpu, int irq, int level)
>>>  }
>>>  
>>>  static int vgic_update_irq_pending(struct kvm *kvm, int cpuid,
>>> -				  unsigned int irq_num, bool level)
>>> +				   struct irq_phys_map *map,
>>> +				   unsigned int irq_num, bool level)
>>>  {
>> In vgic_update_irq_pending, I needed to modify the following line and
>> add the "&& !map".
>>
>>         if (!vgic_validate_injection(vcpu, irq_num, level) && !map) {
>>
>> Without that, the level being not properly modeled for level sensitive
>> forwarded IRQs, the 2d injection fails.
> 
> Ah! Is that because we never see the line being reset to zero, and the
> VGIC still sees the line as pending at the distributor level?
yes indeed

Eric
> 
> Thanks,
> 
> 	M.
> 

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH v3 09/11] KVM: arm/arm64: vgic: Prevent userspace injection of a mapped interrupt
  2015-08-05  7:32         ` Eric Auger
@ 2015-08-05  9:44           ` Marc Zyngier
  -1 siblings, 0 replies; 74+ messages in thread
From: Marc Zyngier @ 2015-08-05  9:44 UTC (permalink / raw)
  To: Eric Auger, kvm, kvmarm, linux-arm-kernel
  Cc: Christoffer Dall, Alex Bennée, Andre Przywara

On 05/08/15 08:32, Eric Auger wrote:
> Hi Marc,
> On 08/04/2015 06:44 PM, Marc Zyngier wrote:
>> On 04/08/15 17:21, Eric Auger wrote:
>>> Hi Marc,
>>> On 07/24/2015 05:55 PM, Marc Zyngier wrote:
>>>> Virtual interrupts mapped to a HW interrupt should only be triggered
>>>> from inside the kernel. Otherwise, you could end up confusing the
>>>> kernel (and the GIC's) state machine.
>>>>
>>>> Rearrange the injection path so that kvm_vgic_inject_irq is
>>>> used for non-mapped interrupts, and kvm_vgic_inject_mapped_irq is
>>>> used for mapped interrupts. The latter should only be called from
>>>> inside the kernel (timer, VFIO).
>>> nit: I would replace VFIO by irqfd.
>>> VFIO just triggers the eventfd/irqfd. This is KVM/irqfd that injects the
>>> virtual irq upon the irqfd signaling and he irqfd adaptation/ARM
>>> currently is implemented in vgic.c
>>
>> Ah, thanks for reminding me of the right terminology, I tend to think of
>> it as one big bag of nasty tricks... ;-)
>>
>> I'll update the commit message.
>>
>>>>
>>>> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
>>>> ---
>>>>  include/kvm/arm_vgic.h |  2 +
>>>>  virt/kvm/arm/vgic.c    | 99 ++++++++++++++++++++++++++++++++++----------------
>>>>  2 files changed, 70 insertions(+), 31 deletions(-)
>>>>
>>>> diff --git a/include/kvm/arm_vgic.h b/include/kvm/arm_vgic.h
>>>> index 7306b4b..f6bfd79 100644
>>>> --- a/include/kvm/arm_vgic.h
>>>> +++ b/include/kvm/arm_vgic.h
>>>> @@ -351,6 +351,8 @@ void kvm_vgic_flush_hwstate(struct kvm_vcpu *vcpu);
>>>>  void kvm_vgic_sync_hwstate(struct kvm_vcpu *vcpu);
>>>>  int kvm_vgic_inject_irq(struct kvm *kvm, int cpuid, unsigned int irq_num,
>>>>  			bool level);
>>>> +int kvm_vgic_inject_mapped_irq(struct kvm *kvm, int cpuid,
>>>> +			       struct irq_phys_map *map, bool level);
>>>>  void vgic_v3_dispatch_sgi(struct kvm_vcpu *vcpu, u64 reg);
>>>>  int kvm_vgic_vcpu_pending_irq(struct kvm_vcpu *vcpu);
>>>>  int kvm_vgic_vcpu_active_irq(struct kvm_vcpu *vcpu);
>>>> diff --git a/virt/kvm/arm/vgic.c b/virt/kvm/arm/vgic.c
>>>> index 3f7b690..e40ef70 100644
>>>> --- a/virt/kvm/arm/vgic.c
>>>> +++ b/virt/kvm/arm/vgic.c
>>>> @@ -1533,7 +1533,8 @@ static int vgic_validate_injection(struct kvm_vcpu *vcpu, int irq, int level)
>>>>  }
>>>>  
>>>>  static int vgic_update_irq_pending(struct kvm *kvm, int cpuid,
>>>> -				  unsigned int irq_num, bool level)
>>>> +				   struct irq_phys_map *map,
>>>> +				   unsigned int irq_num, bool level)
>>>>  {
>>> In vgic_update_irq_pending, I needed to modify the following line and
>>> add the "&& !map".
>>>
>>>         if (!vgic_validate_injection(vcpu, irq_num, level) && !map) {
>>>
>>> Without that, the level being not properly modeled for level sensitive
>>> forwarded IRQs, the 2d injection fails.
>>
>> Ah! Is that because we never see the line being reset to zero, and the
>> VGIC still sees the line as pending at the distributor level?
> yes indeed

Then it is a bigger problem we need to solve, and your solution just
papers over the issue.

The main problem is that irqfd is essentially an edge-triggered
signalling. Fire and forget. Given that we're dealing with a level
triggered interrupt, we end up with the interrupt still marked as
pending (nobody took the signal down).

The usual way to get out of that mess in is to evaluate the state of the
level on EOI. But we can't trap on EOI for a HW interrupt.

So it raises the question: should we instead consider the HW pending
state instead of the software one for mapped interrupts? It is
expensive, but it feels more correct.

Thoughts?

	M.
-- 
Jazz is not dead. It just smells funny...

^ permalink raw reply	[flat|nested] 74+ messages in thread

* [PATCH v3 09/11] KVM: arm/arm64: vgic: Prevent userspace injection of a mapped interrupt
@ 2015-08-05  9:44           ` Marc Zyngier
  0 siblings, 0 replies; 74+ messages in thread
From: Marc Zyngier @ 2015-08-05  9:44 UTC (permalink / raw)
  To: linux-arm-kernel

On 05/08/15 08:32, Eric Auger wrote:
> Hi Marc,
> On 08/04/2015 06:44 PM, Marc Zyngier wrote:
>> On 04/08/15 17:21, Eric Auger wrote:
>>> Hi Marc,
>>> On 07/24/2015 05:55 PM, Marc Zyngier wrote:
>>>> Virtual interrupts mapped to a HW interrupt should only be triggered
>>>> from inside the kernel. Otherwise, you could end up confusing the
>>>> kernel (and the GIC's) state machine.
>>>>
>>>> Rearrange the injection path so that kvm_vgic_inject_irq is
>>>> used for non-mapped interrupts, and kvm_vgic_inject_mapped_irq is
>>>> used for mapped interrupts. The latter should only be called from
>>>> inside the kernel (timer, VFIO).
>>> nit: I would replace VFIO by irqfd.
>>> VFIO just triggers the eventfd/irqfd. This is KVM/irqfd that injects the
>>> virtual irq upon the irqfd signaling and he irqfd adaptation/ARM
>>> currently is implemented in vgic.c
>>
>> Ah, thanks for reminding me of the right terminology, I tend to think of
>> it as one big bag of nasty tricks... ;-)
>>
>> I'll update the commit message.
>>
>>>>
>>>> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
>>>> ---
>>>>  include/kvm/arm_vgic.h |  2 +
>>>>  virt/kvm/arm/vgic.c    | 99 ++++++++++++++++++++++++++++++++++----------------
>>>>  2 files changed, 70 insertions(+), 31 deletions(-)
>>>>
>>>> diff --git a/include/kvm/arm_vgic.h b/include/kvm/arm_vgic.h
>>>> index 7306b4b..f6bfd79 100644
>>>> --- a/include/kvm/arm_vgic.h
>>>> +++ b/include/kvm/arm_vgic.h
>>>> @@ -351,6 +351,8 @@ void kvm_vgic_flush_hwstate(struct kvm_vcpu *vcpu);
>>>>  void kvm_vgic_sync_hwstate(struct kvm_vcpu *vcpu);
>>>>  int kvm_vgic_inject_irq(struct kvm *kvm, int cpuid, unsigned int irq_num,
>>>>  			bool level);
>>>> +int kvm_vgic_inject_mapped_irq(struct kvm *kvm, int cpuid,
>>>> +			       struct irq_phys_map *map, bool level);
>>>>  void vgic_v3_dispatch_sgi(struct kvm_vcpu *vcpu, u64 reg);
>>>>  int kvm_vgic_vcpu_pending_irq(struct kvm_vcpu *vcpu);
>>>>  int kvm_vgic_vcpu_active_irq(struct kvm_vcpu *vcpu);
>>>> diff --git a/virt/kvm/arm/vgic.c b/virt/kvm/arm/vgic.c
>>>> index 3f7b690..e40ef70 100644
>>>> --- a/virt/kvm/arm/vgic.c
>>>> +++ b/virt/kvm/arm/vgic.c
>>>> @@ -1533,7 +1533,8 @@ static int vgic_validate_injection(struct kvm_vcpu *vcpu, int irq, int level)
>>>>  }
>>>>  
>>>>  static int vgic_update_irq_pending(struct kvm *kvm, int cpuid,
>>>> -				  unsigned int irq_num, bool level)
>>>> +				   struct irq_phys_map *map,
>>>> +				   unsigned int irq_num, bool level)
>>>>  {
>>> In vgic_update_irq_pending, I needed to modify the following line and
>>> add the "&& !map".
>>>
>>>         if (!vgic_validate_injection(vcpu, irq_num, level) && !map) {
>>>
>>> Without that, the level being not properly modeled for level sensitive
>>> forwarded IRQs, the 2d injection fails.
>>
>> Ah! Is that because we never see the line being reset to zero, and the
>> VGIC still sees the line as pending at the distributor level?
> yes indeed

Then it is a bigger problem we need to solve, and your solution just
papers over the issue.

The main problem is that irqfd is essentially an edge-triggered
signalling. Fire and forget. Given that we're dealing with a level
triggered interrupt, we end up with the interrupt still marked as
pending (nobody took the signal down).

The usual way to get out of that mess in is to evaluate the state of the
level on EOI. But we can't trap on EOI for a HW interrupt.

So it raises the question: should we instead consider the HW pending
state instead of the software one for mapped interrupts? It is
expensive, but it feels more correct.

Thoughts?

	M.
-- 
Jazz is not dead. It just smells funny...

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH v3 09/11] KVM: arm/arm64: vgic: Prevent userspace injection of a mapped interrupt
  2015-08-05  9:44           ` Marc Zyngier
@ 2015-08-05 10:53             ` Christoffer Dall
  -1 siblings, 0 replies; 74+ messages in thread
From: Christoffer Dall @ 2015-08-05 10:53 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: Eric Auger, kvm, kvmarm, linux-arm-kernel, Alex Bennée,
	Andre Przywara

On Wed, Aug 05, 2015 at 10:44:09AM +0100, Marc Zyngier wrote:
> On 05/08/15 08:32, Eric Auger wrote:
> > Hi Marc,
> > On 08/04/2015 06:44 PM, Marc Zyngier wrote:
> >> On 04/08/15 17:21, Eric Auger wrote:
> >>> Hi Marc,
> >>> On 07/24/2015 05:55 PM, Marc Zyngier wrote:
> >>>> Virtual interrupts mapped to a HW interrupt should only be triggered
> >>>> from inside the kernel. Otherwise, you could end up confusing the
> >>>> kernel (and the GIC's) state machine.
> >>>>
> >>>> Rearrange the injection path so that kvm_vgic_inject_irq is
> >>>> used for non-mapped interrupts, and kvm_vgic_inject_mapped_irq is
> >>>> used for mapped interrupts. The latter should only be called from
> >>>> inside the kernel (timer, VFIO).
> >>> nit: I would replace VFIO by irqfd.
> >>> VFIO just triggers the eventfd/irqfd. This is KVM/irqfd that injects the
> >>> virtual irq upon the irqfd signaling and he irqfd adaptation/ARM
> >>> currently is implemented in vgic.c
> >>
> >> Ah, thanks for reminding me of the right terminology, I tend to think of
> >> it as one big bag of nasty tricks... ;-)
> >>
> >> I'll update the commit message.
> >>
> >>>>
> >>>> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
> >>>> ---
> >>>>  include/kvm/arm_vgic.h |  2 +
> >>>>  virt/kvm/arm/vgic.c    | 99 ++++++++++++++++++++++++++++++++++----------------
> >>>>  2 files changed, 70 insertions(+), 31 deletions(-)
> >>>>
> >>>> diff --git a/include/kvm/arm_vgic.h b/include/kvm/arm_vgic.h
> >>>> index 7306b4b..f6bfd79 100644
> >>>> --- a/include/kvm/arm_vgic.h
> >>>> +++ b/include/kvm/arm_vgic.h
> >>>> @@ -351,6 +351,8 @@ void kvm_vgic_flush_hwstate(struct kvm_vcpu *vcpu);
> >>>>  void kvm_vgic_sync_hwstate(struct kvm_vcpu *vcpu);
> >>>>  int kvm_vgic_inject_irq(struct kvm *kvm, int cpuid, unsigned int irq_num,
> >>>>  			bool level);
> >>>> +int kvm_vgic_inject_mapped_irq(struct kvm *kvm, int cpuid,
> >>>> +			       struct irq_phys_map *map, bool level);
> >>>>  void vgic_v3_dispatch_sgi(struct kvm_vcpu *vcpu, u64 reg);
> >>>>  int kvm_vgic_vcpu_pending_irq(struct kvm_vcpu *vcpu);
> >>>>  int kvm_vgic_vcpu_active_irq(struct kvm_vcpu *vcpu);
> >>>> diff --git a/virt/kvm/arm/vgic.c b/virt/kvm/arm/vgic.c
> >>>> index 3f7b690..e40ef70 100644
> >>>> --- a/virt/kvm/arm/vgic.c
> >>>> +++ b/virt/kvm/arm/vgic.c
> >>>> @@ -1533,7 +1533,8 @@ static int vgic_validate_injection(struct kvm_vcpu *vcpu, int irq, int level)
> >>>>  }
> >>>>  
> >>>>  static int vgic_update_irq_pending(struct kvm *kvm, int cpuid,
> >>>> -				  unsigned int irq_num, bool level)
> >>>> +				   struct irq_phys_map *map,
> >>>> +				   unsigned int irq_num, bool level)
> >>>>  {
> >>> In vgic_update_irq_pending, I needed to modify the following line and
> >>> add the "&& !map".
> >>>
> >>>         if (!vgic_validate_injection(vcpu, irq_num, level) && !map) {
> >>>
> >>> Without that, the level being not properly modeled for level sensitive
> >>> forwarded IRQs, the 2d injection fails.
> >>
> >> Ah! Is that because we never see the line being reset to zero, and the
> >> VGIC still sees the line as pending at the distributor level?
> > yes indeed
> 
> Then it is a bigger problem we need to solve, and your solution just
> papers over the issue.
> 
> The main problem is that irqfd is essentially an edge-triggered
> signalling. Fire and forget. Given that we're dealing with a level
> triggered interrupt, we end up with the interrupt still marked as
> pending (nobody took the signal down).
> 
> The usual way to get out of that mess in is to evaluate the state of the
> level on EOI. But we can't trap on EOI for a HW interrupt.
> 
> So it raises the question: should we instead consider the HW pending
> state instead of the software one for mapped interrupts? It is
> expensive, but it feels more correct.
> 
I thought we already covered this at LCA.  For mapped interrupts
(forwarded) we should never consider the software pending state, because
that state is managed by the hardware.  Or am I confusing concepts here?

Thanks,
-Christoffer

^ permalink raw reply	[flat|nested] 74+ messages in thread

* [PATCH v3 09/11] KVM: arm/arm64: vgic: Prevent userspace injection of a mapped interrupt
@ 2015-08-05 10:53             ` Christoffer Dall
  0 siblings, 0 replies; 74+ messages in thread
From: Christoffer Dall @ 2015-08-05 10:53 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, Aug 05, 2015 at 10:44:09AM +0100, Marc Zyngier wrote:
> On 05/08/15 08:32, Eric Auger wrote:
> > Hi Marc,
> > On 08/04/2015 06:44 PM, Marc Zyngier wrote:
> >> On 04/08/15 17:21, Eric Auger wrote:
> >>> Hi Marc,
> >>> On 07/24/2015 05:55 PM, Marc Zyngier wrote:
> >>>> Virtual interrupts mapped to a HW interrupt should only be triggered
> >>>> from inside the kernel. Otherwise, you could end up confusing the
> >>>> kernel (and the GIC's) state machine.
> >>>>
> >>>> Rearrange the injection path so that kvm_vgic_inject_irq is
> >>>> used for non-mapped interrupts, and kvm_vgic_inject_mapped_irq is
> >>>> used for mapped interrupts. The latter should only be called from
> >>>> inside the kernel (timer, VFIO).
> >>> nit: I would replace VFIO by irqfd.
> >>> VFIO just triggers the eventfd/irqfd. This is KVM/irqfd that injects the
> >>> virtual irq upon the irqfd signaling and he irqfd adaptation/ARM
> >>> currently is implemented in vgic.c
> >>
> >> Ah, thanks for reminding me of the right terminology, I tend to think of
> >> it as one big bag of nasty tricks... ;-)
> >>
> >> I'll update the commit message.
> >>
> >>>>
> >>>> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
> >>>> ---
> >>>>  include/kvm/arm_vgic.h |  2 +
> >>>>  virt/kvm/arm/vgic.c    | 99 ++++++++++++++++++++++++++++++++++----------------
> >>>>  2 files changed, 70 insertions(+), 31 deletions(-)
> >>>>
> >>>> diff --git a/include/kvm/arm_vgic.h b/include/kvm/arm_vgic.h
> >>>> index 7306b4b..f6bfd79 100644
> >>>> --- a/include/kvm/arm_vgic.h
> >>>> +++ b/include/kvm/arm_vgic.h
> >>>> @@ -351,6 +351,8 @@ void kvm_vgic_flush_hwstate(struct kvm_vcpu *vcpu);
> >>>>  void kvm_vgic_sync_hwstate(struct kvm_vcpu *vcpu);
> >>>>  int kvm_vgic_inject_irq(struct kvm *kvm, int cpuid, unsigned int irq_num,
> >>>>  			bool level);
> >>>> +int kvm_vgic_inject_mapped_irq(struct kvm *kvm, int cpuid,
> >>>> +			       struct irq_phys_map *map, bool level);
> >>>>  void vgic_v3_dispatch_sgi(struct kvm_vcpu *vcpu, u64 reg);
> >>>>  int kvm_vgic_vcpu_pending_irq(struct kvm_vcpu *vcpu);
> >>>>  int kvm_vgic_vcpu_active_irq(struct kvm_vcpu *vcpu);
> >>>> diff --git a/virt/kvm/arm/vgic.c b/virt/kvm/arm/vgic.c
> >>>> index 3f7b690..e40ef70 100644
> >>>> --- a/virt/kvm/arm/vgic.c
> >>>> +++ b/virt/kvm/arm/vgic.c
> >>>> @@ -1533,7 +1533,8 @@ static int vgic_validate_injection(struct kvm_vcpu *vcpu, int irq, int level)
> >>>>  }
> >>>>  
> >>>>  static int vgic_update_irq_pending(struct kvm *kvm, int cpuid,
> >>>> -				  unsigned int irq_num, bool level)
> >>>> +				   struct irq_phys_map *map,
> >>>> +				   unsigned int irq_num, bool level)
> >>>>  {
> >>> In vgic_update_irq_pending, I needed to modify the following line and
> >>> add the "&& !map".
> >>>
> >>>         if (!vgic_validate_injection(vcpu, irq_num, level) && !map) {
> >>>
> >>> Without that, the level being not properly modeled for level sensitive
> >>> forwarded IRQs, the 2d injection fails.
> >>
> >> Ah! Is that because we never see the line being reset to zero, and the
> >> VGIC still sees the line as pending at the distributor level?
> > yes indeed
> 
> Then it is a bigger problem we need to solve, and your solution just
> papers over the issue.
> 
> The main problem is that irqfd is essentially an edge-triggered
> signalling. Fire and forget. Given that we're dealing with a level
> triggered interrupt, we end up with the interrupt still marked as
> pending (nobody took the signal down).
> 
> The usual way to get out of that mess in is to evaluate the state of the
> level on EOI. But we can't trap on EOI for a HW interrupt.
> 
> So it raises the question: should we instead consider the HW pending
> state instead of the software one for mapped interrupts? It is
> expensive, but it feels more correct.
> 
I thought we already covered this at LCA.  For mapped interrupts
(forwarded) we should never consider the software pending state, because
that state is managed by the hardware.  Or am I confusing concepts here?

Thanks,
-Christoffer

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH v3 09/11] KVM: arm/arm64: vgic: Prevent userspace injection of a mapped interrupt
  2015-08-05 10:53             ` Christoffer Dall
@ 2015-08-05 11:47               ` Eric Auger
  -1 siblings, 0 replies; 74+ messages in thread
From: Eric Auger @ 2015-08-05 11:47 UTC (permalink / raw)
  To: Christoffer Dall, Marc Zyngier
  Cc: kvm, kvmarm, linux-arm-kernel, Alex Bennée, Andre Przywara

On 08/05/2015 12:53 PM, Christoffer Dall wrote:
> On Wed, Aug 05, 2015 at 10:44:09AM +0100, Marc Zyngier wrote:
>> On 05/08/15 08:32, Eric Auger wrote:
>>> Hi Marc,
>>> On 08/04/2015 06:44 PM, Marc Zyngier wrote:
>>>> On 04/08/15 17:21, Eric Auger wrote:
>>>>> Hi Marc,
>>>>> On 07/24/2015 05:55 PM, Marc Zyngier wrote:
>>>>>> Virtual interrupts mapped to a HW interrupt should only be triggered
>>>>>> from inside the kernel. Otherwise, you could end up confusing the
>>>>>> kernel (and the GIC's) state machine.
>>>>>>
>>>>>> Rearrange the injection path so that kvm_vgic_inject_irq is
>>>>>> used for non-mapped interrupts, and kvm_vgic_inject_mapped_irq is
>>>>>> used for mapped interrupts. The latter should only be called from
>>>>>> inside the kernel (timer, VFIO).
>>>>> nit: I would replace VFIO by irqfd.
>>>>> VFIO just triggers the eventfd/irqfd. This is KVM/irqfd that injects the
>>>>> virtual irq upon the irqfd signaling and he irqfd adaptation/ARM
>>>>> currently is implemented in vgic.c
>>>>
>>>> Ah, thanks for reminding me of the right terminology, I tend to think of
>>>> it as one big bag of nasty tricks... ;-)
>>>>
>>>> I'll update the commit message.
>>>>
>>>>>>
>>>>>> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
>>>>>> ---
>>>>>>  include/kvm/arm_vgic.h |  2 +
>>>>>>  virt/kvm/arm/vgic.c    | 99 ++++++++++++++++++++++++++++++++++----------------
>>>>>>  2 files changed, 70 insertions(+), 31 deletions(-)
>>>>>>
>>>>>> diff --git a/include/kvm/arm_vgic.h b/include/kvm/arm_vgic.h
>>>>>> index 7306b4b..f6bfd79 100644
>>>>>> --- a/include/kvm/arm_vgic.h
>>>>>> +++ b/include/kvm/arm_vgic.h
>>>>>> @@ -351,6 +351,8 @@ void kvm_vgic_flush_hwstate(struct kvm_vcpu *vcpu);
>>>>>>  void kvm_vgic_sync_hwstate(struct kvm_vcpu *vcpu);
>>>>>>  int kvm_vgic_inject_irq(struct kvm *kvm, int cpuid, unsigned int irq_num,
>>>>>>  			bool level);
>>>>>> +int kvm_vgic_inject_mapped_irq(struct kvm *kvm, int cpuid,
>>>>>> +			       struct irq_phys_map *map, bool level);
>>>>>>  void vgic_v3_dispatch_sgi(struct kvm_vcpu *vcpu, u64 reg);
>>>>>>  int kvm_vgic_vcpu_pending_irq(struct kvm_vcpu *vcpu);
>>>>>>  int kvm_vgic_vcpu_active_irq(struct kvm_vcpu *vcpu);
>>>>>> diff --git a/virt/kvm/arm/vgic.c b/virt/kvm/arm/vgic.c
>>>>>> index 3f7b690..e40ef70 100644
>>>>>> --- a/virt/kvm/arm/vgic.c
>>>>>> +++ b/virt/kvm/arm/vgic.c
>>>>>> @@ -1533,7 +1533,8 @@ static int vgic_validate_injection(struct kvm_vcpu *vcpu, int irq, int level)
>>>>>>  }
>>>>>>  
>>>>>>  static int vgic_update_irq_pending(struct kvm *kvm, int cpuid,
>>>>>> -				  unsigned int irq_num, bool level)
>>>>>> +				   struct irq_phys_map *map,
>>>>>> +				   unsigned int irq_num, bool level)
>>>>>>  {
>>>>> In vgic_update_irq_pending, I needed to modify the following line and
>>>>> add the "&& !map".
>>>>>
>>>>>         if (!vgic_validate_injection(vcpu, irq_num, level) && !map) {
>>>>>
>>>>> Without that, the level being not properly modeled for level sensitive
>>>>> forwarded IRQs, the 2d injection fails.
>>>>
>>>> Ah! Is that because we never see the line being reset to zero, and the
>>>> VGIC still sees the line as pending at the distributor level?
>>> yes indeed
>>
>> Then it is a bigger problem we need to solve, and your solution just
>> papers over the issue.
>>
>> The main problem is that irqfd is essentially an edge-triggered
>> signalling. Fire and forget. Given that we're dealing with a level
>> triggered interrupt, we end up with the interrupt still marked as
>> pending (nobody took the signal down).
this does not really relate to irqfd: irqfd also comes with the concept
of resamplefd. in case the IRQ is not forwarded, this is the
irqfd_resampler_ack function that toggles the IRQ down (eventfd.c). Its
execution is triggered in vgic_process_maintenance. With forwarding the
EOI is not trappable anymore so this disappears.


>> The usual way to get out of that mess in is to evaluate the state of the
>> level on EOI. But we can't trap on EOI for a HW interrupt.
>>
>> So it raises the question: should we instead consider the HW pending
>> state instead of the software one for mapped interrupts? It is
>> expensive, but it feels more correct.
>>
> I thought we already covered this at LCA.  For mapped interrupts
> (forwarded) we should never consider the software pending state, because
> that state is managed by the hardware.  Or am I confusing concepts here?

Yes we discussed we should bypass most of the SW states.

in my previous integration I proposed a patch "KVM: arm: vgic: fix state
machine for forwarded IRQ". What's wrong with the below approach?


    Fix multiple injection of level sensitive forwarded IRQs.
    With current code, the second injection fails since the
    state bitmaps are not reset (process_maintenance is not
    called anymore).

    New implementation follows those principles:
    - A forwarded IRQ only can be sampled when it is pending
    - when queueing the IRQ (programming the LR), the pending state
      is removed as for edge sensitive IRQs
    - an injection of a forwarded IRQ is considered always valid since
      coming from the HW and level always is 1.

Eric

> 
> Thanks,
> -Christoffer
> 


^ permalink raw reply	[flat|nested] 74+ messages in thread

* [PATCH v3 09/11] KVM: arm/arm64: vgic: Prevent userspace injection of a mapped interrupt
@ 2015-08-05 11:47               ` Eric Auger
  0 siblings, 0 replies; 74+ messages in thread
From: Eric Auger @ 2015-08-05 11:47 UTC (permalink / raw)
  To: linux-arm-kernel

On 08/05/2015 12:53 PM, Christoffer Dall wrote:
> On Wed, Aug 05, 2015 at 10:44:09AM +0100, Marc Zyngier wrote:
>> On 05/08/15 08:32, Eric Auger wrote:
>>> Hi Marc,
>>> On 08/04/2015 06:44 PM, Marc Zyngier wrote:
>>>> On 04/08/15 17:21, Eric Auger wrote:
>>>>> Hi Marc,
>>>>> On 07/24/2015 05:55 PM, Marc Zyngier wrote:
>>>>>> Virtual interrupts mapped to a HW interrupt should only be triggered
>>>>>> from inside the kernel. Otherwise, you could end up confusing the
>>>>>> kernel (and the GIC's) state machine.
>>>>>>
>>>>>> Rearrange the injection path so that kvm_vgic_inject_irq is
>>>>>> used for non-mapped interrupts, and kvm_vgic_inject_mapped_irq is
>>>>>> used for mapped interrupts. The latter should only be called from
>>>>>> inside the kernel (timer, VFIO).
>>>>> nit: I would replace VFIO by irqfd.
>>>>> VFIO just triggers the eventfd/irqfd. This is KVM/irqfd that injects the
>>>>> virtual irq upon the irqfd signaling and he irqfd adaptation/ARM
>>>>> currently is implemented in vgic.c
>>>>
>>>> Ah, thanks for reminding me of the right terminology, I tend to think of
>>>> it as one big bag of nasty tricks... ;-)
>>>>
>>>> I'll update the commit message.
>>>>
>>>>>>
>>>>>> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
>>>>>> ---
>>>>>>  include/kvm/arm_vgic.h |  2 +
>>>>>>  virt/kvm/arm/vgic.c    | 99 ++++++++++++++++++++++++++++++++++----------------
>>>>>>  2 files changed, 70 insertions(+), 31 deletions(-)
>>>>>>
>>>>>> diff --git a/include/kvm/arm_vgic.h b/include/kvm/arm_vgic.h
>>>>>> index 7306b4b..f6bfd79 100644
>>>>>> --- a/include/kvm/arm_vgic.h
>>>>>> +++ b/include/kvm/arm_vgic.h
>>>>>> @@ -351,6 +351,8 @@ void kvm_vgic_flush_hwstate(struct kvm_vcpu *vcpu);
>>>>>>  void kvm_vgic_sync_hwstate(struct kvm_vcpu *vcpu);
>>>>>>  int kvm_vgic_inject_irq(struct kvm *kvm, int cpuid, unsigned int irq_num,
>>>>>>  			bool level);
>>>>>> +int kvm_vgic_inject_mapped_irq(struct kvm *kvm, int cpuid,
>>>>>> +			       struct irq_phys_map *map, bool level);
>>>>>>  void vgic_v3_dispatch_sgi(struct kvm_vcpu *vcpu, u64 reg);
>>>>>>  int kvm_vgic_vcpu_pending_irq(struct kvm_vcpu *vcpu);
>>>>>>  int kvm_vgic_vcpu_active_irq(struct kvm_vcpu *vcpu);
>>>>>> diff --git a/virt/kvm/arm/vgic.c b/virt/kvm/arm/vgic.c
>>>>>> index 3f7b690..e40ef70 100644
>>>>>> --- a/virt/kvm/arm/vgic.c
>>>>>> +++ b/virt/kvm/arm/vgic.c
>>>>>> @@ -1533,7 +1533,8 @@ static int vgic_validate_injection(struct kvm_vcpu *vcpu, int irq, int level)
>>>>>>  }
>>>>>>  
>>>>>>  static int vgic_update_irq_pending(struct kvm *kvm, int cpuid,
>>>>>> -				  unsigned int irq_num, bool level)
>>>>>> +				   struct irq_phys_map *map,
>>>>>> +				   unsigned int irq_num, bool level)
>>>>>>  {
>>>>> In vgic_update_irq_pending, I needed to modify the following line and
>>>>> add the "&& !map".
>>>>>
>>>>>         if (!vgic_validate_injection(vcpu, irq_num, level) && !map) {
>>>>>
>>>>> Without that, the level being not properly modeled for level sensitive
>>>>> forwarded IRQs, the 2d injection fails.
>>>>
>>>> Ah! Is that because we never see the line being reset to zero, and the
>>>> VGIC still sees the line as pending at the distributor level?
>>> yes indeed
>>
>> Then it is a bigger problem we need to solve, and your solution just
>> papers over the issue.
>>
>> The main problem is that irqfd is essentially an edge-triggered
>> signalling. Fire and forget. Given that we're dealing with a level
>> triggered interrupt, we end up with the interrupt still marked as
>> pending (nobody took the signal down).
this does not really relate to irqfd: irqfd also comes with the concept
of resamplefd. in case the IRQ is not forwarded, this is the
irqfd_resampler_ack function that toggles the IRQ down (eventfd.c). Its
execution is triggered in vgic_process_maintenance. With forwarding the
EOI is not trappable anymore so this disappears.


>> The usual way to get out of that mess in is to evaluate the state of the
>> level on EOI. But we can't trap on EOI for a HW interrupt.
>>
>> So it raises the question: should we instead consider the HW pending
>> state instead of the software one for mapped interrupts? It is
>> expensive, but it feels more correct.
>>
> I thought we already covered this at LCA.  For mapped interrupts
> (forwarded) we should never consider the software pending state, because
> that state is managed by the hardware.  Or am I confusing concepts here?

Yes we discussed we should bypass most of the SW states.

in my previous integration I proposed a patch "KVM: arm: vgic: fix state
machine for forwarded IRQ". What's wrong with the below approach?


    Fix multiple injection of level sensitive forwarded IRQs.
    With current code, the second injection fails since the
    state bitmaps are not reset (process_maintenance is not
    called anymore).

    New implementation follows those principles:
    - A forwarded IRQ only can be sampled when it is pending
    - when queueing the IRQ (programming the LR), the pending state
      is removed as for edge sensitive IRQs
    - an injection of a forwarded IRQ is considered always valid since
      coming from the HW and level always is 1.

Eric

> 
> Thanks,
> -Christoffer
> 

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH v3 09/11] KVM: arm/arm64: vgic: Prevent userspace injection of a mapped interrupt
  2015-08-05 11:47               ` Eric Auger
@ 2015-08-05 13:47                 ` Christoffer Dall
  -1 siblings, 0 replies; 74+ messages in thread
From: Christoffer Dall @ 2015-08-05 13:47 UTC (permalink / raw)
  To: Eric Auger; +Cc: kvm, Marc Zyngier, Andre Przywara, kvmarm, linux-arm-kernel

On Wed, Aug 05, 2015 at 01:47:27PM +0200, Eric Auger wrote:
> On 08/05/2015 12:53 PM, Christoffer Dall wrote:
> > On Wed, Aug 05, 2015 at 10:44:09AM +0100, Marc Zyngier wrote:
> >> On 05/08/15 08:32, Eric Auger wrote:
> >>> Hi Marc,
> >>> On 08/04/2015 06:44 PM, Marc Zyngier wrote:
> >>>> On 04/08/15 17:21, Eric Auger wrote:
> >>>>> Hi Marc,
> >>>>> On 07/24/2015 05:55 PM, Marc Zyngier wrote:
> >>>>>> Virtual interrupts mapped to a HW interrupt should only be triggered
> >>>>>> from inside the kernel. Otherwise, you could end up confusing the
> >>>>>> kernel (and the GIC's) state machine.
> >>>>>>
> >>>>>> Rearrange the injection path so that kvm_vgic_inject_irq is
> >>>>>> used for non-mapped interrupts, and kvm_vgic_inject_mapped_irq is
> >>>>>> used for mapped interrupts. The latter should only be called from
> >>>>>> inside the kernel (timer, VFIO).
> >>>>> nit: I would replace VFIO by irqfd.
> >>>>> VFIO just triggers the eventfd/irqfd. This is KVM/irqfd that injects the
> >>>>> virtual irq upon the irqfd signaling and he irqfd adaptation/ARM
> >>>>> currently is implemented in vgic.c
> >>>>
> >>>> Ah, thanks for reminding me of the right terminology, I tend to think of
> >>>> it as one big bag of nasty tricks... ;-)
> >>>>
> >>>> I'll update the commit message.
> >>>>
> >>>>>>
> >>>>>> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
> >>>>>> ---
> >>>>>>  include/kvm/arm_vgic.h |  2 +
> >>>>>>  virt/kvm/arm/vgic.c    | 99 ++++++++++++++++++++++++++++++++++----------------
> >>>>>>  2 files changed, 70 insertions(+), 31 deletions(-)
> >>>>>>
> >>>>>> diff --git a/include/kvm/arm_vgic.h b/include/kvm/arm_vgic.h
> >>>>>> index 7306b4b..f6bfd79 100644
> >>>>>> --- a/include/kvm/arm_vgic.h
> >>>>>> +++ b/include/kvm/arm_vgic.h
> >>>>>> @@ -351,6 +351,8 @@ void kvm_vgic_flush_hwstate(struct kvm_vcpu *vcpu);
> >>>>>>  void kvm_vgic_sync_hwstate(struct kvm_vcpu *vcpu);
> >>>>>>  int kvm_vgic_inject_irq(struct kvm *kvm, int cpuid, unsigned int irq_num,
> >>>>>>  			bool level);
> >>>>>> +int kvm_vgic_inject_mapped_irq(struct kvm *kvm, int cpuid,
> >>>>>> +			       struct irq_phys_map *map, bool level);
> >>>>>>  void vgic_v3_dispatch_sgi(struct kvm_vcpu *vcpu, u64 reg);
> >>>>>>  int kvm_vgic_vcpu_pending_irq(struct kvm_vcpu *vcpu);
> >>>>>>  int kvm_vgic_vcpu_active_irq(struct kvm_vcpu *vcpu);
> >>>>>> diff --git a/virt/kvm/arm/vgic.c b/virt/kvm/arm/vgic.c
> >>>>>> index 3f7b690..e40ef70 100644
> >>>>>> --- a/virt/kvm/arm/vgic.c
> >>>>>> +++ b/virt/kvm/arm/vgic.c
> >>>>>> @@ -1533,7 +1533,8 @@ static int vgic_validate_injection(struct kvm_vcpu *vcpu, int irq, int level)
> >>>>>>  }
> >>>>>>  
> >>>>>>  static int vgic_update_irq_pending(struct kvm *kvm, int cpuid,
> >>>>>> -				  unsigned int irq_num, bool level)
> >>>>>> +				   struct irq_phys_map *map,
> >>>>>> +				   unsigned int irq_num, bool level)
> >>>>>>  {
> >>>>> In vgic_update_irq_pending, I needed to modify the following line and
> >>>>> add the "&& !map".
> >>>>>
> >>>>>         if (!vgic_validate_injection(vcpu, irq_num, level) && !map) {
> >>>>>
> >>>>> Without that, the level being not properly modeled for level sensitive
> >>>>> forwarded IRQs, the 2d injection fails.
> >>>>
> >>>> Ah! Is that because we never see the line being reset to zero, and the
> >>>> VGIC still sees the line as pending at the distributor level?
> >>> yes indeed
> >>
> >> Then it is a bigger problem we need to solve, and your solution just
> >> papers over the issue.
> >>
> >> The main problem is that irqfd is essentially an edge-triggered
> >> signalling. Fire and forget. Given that we're dealing with a level
> >> triggered interrupt, we end up with the interrupt still marked as
> >> pending (nobody took the signal down).
> this does not really relate to irqfd: irqfd also comes with the concept
> of resamplefd. in case the IRQ is not forwarded, this is the
> irqfd_resampler_ack function that toggles the IRQ down (eventfd.c). Its
> execution is triggered in vgic_process_maintenance. With forwarding the
> EOI is not trappable anymore so this disappears.
> 
> 
> >> The usual way to get out of that mess in is to evaluate the state of the
> >> level on EOI. But we can't trap on EOI for a HW interrupt.
> >>
> >> So it raises the question: should we instead consider the HW pending
> >> state instead of the software one for mapped interrupts? It is
> >> expensive, but it feels more correct.
> >>
> > I thought we already covered this at LCA.  For mapped interrupts
> > (forwarded) we should never consider the software pending state, because
> > that state is managed by the hardware.  Or am I confusing concepts here?
> 
> Yes we discussed we should bypass most of the SW states.
> 
> in my previous integration I proposed a patch "KVM: arm: vgic: fix state
> machine for forwarded IRQ". What's wrong with the below approach?
> 
> 
>     Fix multiple injection of level sensitive forwarded IRQs.
>     With current code, the second injection fails since the
>     state bitmaps are not reset (process_maintenance is not
>     called anymore).
> 
>     New implementation follows those principles:
>     - A forwarded IRQ only can be sampled when it is pending
>     - when queueing the IRQ (programming the LR), the pending state
>       is removed as for edge sensitive IRQs
>     - an injection of a forwarded IRQ is considered always valid since
>       coming from the HW and level always is 1.
> 
I don't see anything wrong, and I thought this was what we discussed.

-Christoffer

^ permalink raw reply	[flat|nested] 74+ messages in thread

* [PATCH v3 09/11] KVM: arm/arm64: vgic: Prevent userspace injection of a mapped interrupt
@ 2015-08-05 13:47                 ` Christoffer Dall
  0 siblings, 0 replies; 74+ messages in thread
From: Christoffer Dall @ 2015-08-05 13:47 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, Aug 05, 2015 at 01:47:27PM +0200, Eric Auger wrote:
> On 08/05/2015 12:53 PM, Christoffer Dall wrote:
> > On Wed, Aug 05, 2015 at 10:44:09AM +0100, Marc Zyngier wrote:
> >> On 05/08/15 08:32, Eric Auger wrote:
> >>> Hi Marc,
> >>> On 08/04/2015 06:44 PM, Marc Zyngier wrote:
> >>>> On 04/08/15 17:21, Eric Auger wrote:
> >>>>> Hi Marc,
> >>>>> On 07/24/2015 05:55 PM, Marc Zyngier wrote:
> >>>>>> Virtual interrupts mapped to a HW interrupt should only be triggered
> >>>>>> from inside the kernel. Otherwise, you could end up confusing the
> >>>>>> kernel (and the GIC's) state machine.
> >>>>>>
> >>>>>> Rearrange the injection path so that kvm_vgic_inject_irq is
> >>>>>> used for non-mapped interrupts, and kvm_vgic_inject_mapped_irq is
> >>>>>> used for mapped interrupts. The latter should only be called from
> >>>>>> inside the kernel (timer, VFIO).
> >>>>> nit: I would replace VFIO by irqfd.
> >>>>> VFIO just triggers the eventfd/irqfd. This is KVM/irqfd that injects the
> >>>>> virtual irq upon the irqfd signaling and he irqfd adaptation/ARM
> >>>>> currently is implemented in vgic.c
> >>>>
> >>>> Ah, thanks for reminding me of the right terminology, I tend to think of
> >>>> it as one big bag of nasty tricks... ;-)
> >>>>
> >>>> I'll update the commit message.
> >>>>
> >>>>>>
> >>>>>> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
> >>>>>> ---
> >>>>>>  include/kvm/arm_vgic.h |  2 +
> >>>>>>  virt/kvm/arm/vgic.c    | 99 ++++++++++++++++++++++++++++++++++----------------
> >>>>>>  2 files changed, 70 insertions(+), 31 deletions(-)
> >>>>>>
> >>>>>> diff --git a/include/kvm/arm_vgic.h b/include/kvm/arm_vgic.h
> >>>>>> index 7306b4b..f6bfd79 100644
> >>>>>> --- a/include/kvm/arm_vgic.h
> >>>>>> +++ b/include/kvm/arm_vgic.h
> >>>>>> @@ -351,6 +351,8 @@ void kvm_vgic_flush_hwstate(struct kvm_vcpu *vcpu);
> >>>>>>  void kvm_vgic_sync_hwstate(struct kvm_vcpu *vcpu);
> >>>>>>  int kvm_vgic_inject_irq(struct kvm *kvm, int cpuid, unsigned int irq_num,
> >>>>>>  			bool level);
> >>>>>> +int kvm_vgic_inject_mapped_irq(struct kvm *kvm, int cpuid,
> >>>>>> +			       struct irq_phys_map *map, bool level);
> >>>>>>  void vgic_v3_dispatch_sgi(struct kvm_vcpu *vcpu, u64 reg);
> >>>>>>  int kvm_vgic_vcpu_pending_irq(struct kvm_vcpu *vcpu);
> >>>>>>  int kvm_vgic_vcpu_active_irq(struct kvm_vcpu *vcpu);
> >>>>>> diff --git a/virt/kvm/arm/vgic.c b/virt/kvm/arm/vgic.c
> >>>>>> index 3f7b690..e40ef70 100644
> >>>>>> --- a/virt/kvm/arm/vgic.c
> >>>>>> +++ b/virt/kvm/arm/vgic.c
> >>>>>> @@ -1533,7 +1533,8 @@ static int vgic_validate_injection(struct kvm_vcpu *vcpu, int irq, int level)
> >>>>>>  }
> >>>>>>  
> >>>>>>  static int vgic_update_irq_pending(struct kvm *kvm, int cpuid,
> >>>>>> -				  unsigned int irq_num, bool level)
> >>>>>> +				   struct irq_phys_map *map,
> >>>>>> +				   unsigned int irq_num, bool level)
> >>>>>>  {
> >>>>> In vgic_update_irq_pending, I needed to modify the following line and
> >>>>> add the "&& !map".
> >>>>>
> >>>>>         if (!vgic_validate_injection(vcpu, irq_num, level) && !map) {
> >>>>>
> >>>>> Without that, the level being not properly modeled for level sensitive
> >>>>> forwarded IRQs, the 2d injection fails.
> >>>>
> >>>> Ah! Is that because we never see the line being reset to zero, and the
> >>>> VGIC still sees the line as pending at the distributor level?
> >>> yes indeed
> >>
> >> Then it is a bigger problem we need to solve, and your solution just
> >> papers over the issue.
> >>
> >> The main problem is that irqfd is essentially an edge-triggered
> >> signalling. Fire and forget. Given that we're dealing with a level
> >> triggered interrupt, we end up with the interrupt still marked as
> >> pending (nobody took the signal down).
> this does not really relate to irqfd: irqfd also comes with the concept
> of resamplefd. in case the IRQ is not forwarded, this is the
> irqfd_resampler_ack function that toggles the IRQ down (eventfd.c). Its
> execution is triggered in vgic_process_maintenance. With forwarding the
> EOI is not trappable anymore so this disappears.
> 
> 
> >> The usual way to get out of that mess in is to evaluate the state of the
> >> level on EOI. But we can't trap on EOI for a HW interrupt.
> >>
> >> So it raises the question: should we instead consider the HW pending
> >> state instead of the software one for mapped interrupts? It is
> >> expensive, but it feels more correct.
> >>
> > I thought we already covered this at LCA.  For mapped interrupts
> > (forwarded) we should never consider the software pending state, because
> > that state is managed by the hardware.  Or am I confusing concepts here?
> 
> Yes we discussed we should bypass most of the SW states.
> 
> in my previous integration I proposed a patch "KVM: arm: vgic: fix state
> machine for forwarded IRQ". What's wrong with the below approach?
> 
> 
>     Fix multiple injection of level sensitive forwarded IRQs.
>     With current code, the second injection fails since the
>     state bitmaps are not reset (process_maintenance is not
>     called anymore).
> 
>     New implementation follows those principles:
>     - A forwarded IRQ only can be sampled when it is pending
>     - when queueing the IRQ (programming the LR), the pending state
>       is removed as for edge sensitive IRQs
>     - an injection of a forwarded IRQ is considered always valid since
>       coming from the HW and level always is 1.
> 
I don't see anything wrong, and I thought this was what we discussed.

-Christoffer

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH v3 09/11] KVM: arm/arm64: vgic: Prevent userspace injection of a mapped interrupt
  2015-08-05 13:47                 ` Christoffer Dall
@ 2015-08-06 16:44                   ` Marc Zyngier
  -1 siblings, 0 replies; 74+ messages in thread
From: Marc Zyngier @ 2015-08-06 16:44 UTC (permalink / raw)
  To: Christoffer Dall, Eric Auger
  Cc: kvm, kvmarm, linux-arm-kernel, Alex Bennée, Andre Przywara

On 05/08/15 14:47, Christoffer Dall wrote:
> On Wed, Aug 05, 2015 at 01:47:27PM +0200, Eric Auger wrote:
>> On 08/05/2015 12:53 PM, Christoffer Dall wrote:
>>> On Wed, Aug 05, 2015 at 10:44:09AM +0100, Marc Zyngier wrote:
>>>> On 05/08/15 08:32, Eric Auger wrote:
>>>>> Hi Marc,
>>>>> On 08/04/2015 06:44 PM, Marc Zyngier wrote:
>>>>>> On 04/08/15 17:21, Eric Auger wrote:
>>>>>>> Hi Marc,
>>>>>>> On 07/24/2015 05:55 PM, Marc Zyngier wrote:
>>>>>>>> Virtual interrupts mapped to a HW interrupt should only be triggered
>>>>>>>> from inside the kernel. Otherwise, you could end up confusing the
>>>>>>>> kernel (and the GIC's) state machine.
>>>>>>>>
>>>>>>>> Rearrange the injection path so that kvm_vgic_inject_irq is
>>>>>>>> used for non-mapped interrupts, and kvm_vgic_inject_mapped_irq is
>>>>>>>> used for mapped interrupts. The latter should only be called from
>>>>>>>> inside the kernel (timer, VFIO).
>>>>>>> nit: I would replace VFIO by irqfd.
>>>>>>> VFIO just triggers the eventfd/irqfd. This is KVM/irqfd that injects the
>>>>>>> virtual irq upon the irqfd signaling and he irqfd adaptation/ARM
>>>>>>> currently is implemented in vgic.c
>>>>>>
>>>>>> Ah, thanks for reminding me of the right terminology, I tend to think of
>>>>>> it as one big bag of nasty tricks... ;-)
>>>>>>
>>>>>> I'll update the commit message.
>>>>>>
>>>>>>>>
>>>>>>>> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
>>>>>>>> ---
>>>>>>>>  include/kvm/arm_vgic.h |  2 +
>>>>>>>>  virt/kvm/arm/vgic.c    | 99 ++++++++++++++++++++++++++++++++++----------------
>>>>>>>>  2 files changed, 70 insertions(+), 31 deletions(-)
>>>>>>>>
>>>>>>>> diff --git a/include/kvm/arm_vgic.h b/include/kvm/arm_vgic.h
>>>>>>>> index 7306b4b..f6bfd79 100644
>>>>>>>> --- a/include/kvm/arm_vgic.h
>>>>>>>> +++ b/include/kvm/arm_vgic.h
>>>>>>>> @@ -351,6 +351,8 @@ void kvm_vgic_flush_hwstate(struct kvm_vcpu *vcpu);
>>>>>>>>  void kvm_vgic_sync_hwstate(struct kvm_vcpu *vcpu);
>>>>>>>>  int kvm_vgic_inject_irq(struct kvm *kvm, int cpuid, unsigned int irq_num,
>>>>>>>>  			bool level);
>>>>>>>> +int kvm_vgic_inject_mapped_irq(struct kvm *kvm, int cpuid,
>>>>>>>> +			       struct irq_phys_map *map, bool level);
>>>>>>>>  void vgic_v3_dispatch_sgi(struct kvm_vcpu *vcpu, u64 reg);
>>>>>>>>  int kvm_vgic_vcpu_pending_irq(struct kvm_vcpu *vcpu);
>>>>>>>>  int kvm_vgic_vcpu_active_irq(struct kvm_vcpu *vcpu);
>>>>>>>> diff --git a/virt/kvm/arm/vgic.c b/virt/kvm/arm/vgic.c
>>>>>>>> index 3f7b690..e40ef70 100644
>>>>>>>> --- a/virt/kvm/arm/vgic.c
>>>>>>>> +++ b/virt/kvm/arm/vgic.c
>>>>>>>> @@ -1533,7 +1533,8 @@ static int vgic_validate_injection(struct kvm_vcpu *vcpu, int irq, int level)
>>>>>>>>  }
>>>>>>>>  
>>>>>>>>  static int vgic_update_irq_pending(struct kvm *kvm, int cpuid,
>>>>>>>> -				  unsigned int irq_num, bool level)
>>>>>>>> +				   struct irq_phys_map *map,
>>>>>>>> +				   unsigned int irq_num, bool level)
>>>>>>>>  {
>>>>>>> In vgic_update_irq_pending, I needed to modify the following line and
>>>>>>> add the "&& !map".
>>>>>>>
>>>>>>>         if (!vgic_validate_injection(vcpu, irq_num, level) && !map) {
>>>>>>>
>>>>>>> Without that, the level being not properly modeled for level sensitive
>>>>>>> forwarded IRQs, the 2d injection fails.
>>>>>>
>>>>>> Ah! Is that because we never see the line being reset to zero, and the
>>>>>> VGIC still sees the line as pending at the distributor level?
>>>>> yes indeed
>>>>
>>>> Then it is a bigger problem we need to solve, and your solution just
>>>> papers over the issue.
>>>>
>>>> The main problem is that irqfd is essentially an edge-triggered
>>>> signalling. Fire and forget. Given that we're dealing with a level
>>>> triggered interrupt, we end up with the interrupt still marked as
>>>> pending (nobody took the signal down).
>> this does not really relate to irqfd: irqfd also comes with the concept
>> of resamplefd. in case the IRQ is not forwarded, this is the
>> irqfd_resampler_ack function that toggles the IRQ down (eventfd.c). Its
>> execution is triggered in vgic_process_maintenance. With forwarding the
>> EOI is not trappable anymore so this disappears.
>>
>>
>>>> The usual way to get out of that mess in is to evaluate the state of the
>>>> level on EOI. But we can't trap on EOI for a HW interrupt.
>>>>
>>>> So it raises the question: should we instead consider the HW pending
>>>> state instead of the software one for mapped interrupts? It is
>>>> expensive, but it feels more correct.
>>>>
>>> I thought we already covered this at LCA.  For mapped interrupts
>>> (forwarded) we should never consider the software pending state, because
>>> that state is managed by the hardware.  Or am I confusing concepts here?
>>
>> Yes we discussed we should bypass most of the SW states.
>>
>> in my previous integration I proposed a patch "KVM: arm: vgic: fix state
>> machine for forwarded IRQ". What's wrong with the below approach?
>>
>>
>>     Fix multiple injection of level sensitive forwarded IRQs.
>>     With current code, the second injection fails since the
>>     state bitmaps are not reset (process_maintenance is not
>>     called anymore).
>>
>>     New implementation follows those principles:
>>     - A forwarded IRQ only can be sampled when it is pending
>>     - when queueing the IRQ (programming the LR), the pending state
>>       is removed as for edge sensitive IRQs
>>     - an injection of a forwarded IRQ is considered always valid since
>>       coming from the HW and level always is 1.
>>
> I don't see anything wrong, and I thought this was what we discussed.

I think I just lost track of what we've discussed at LCA. Ignore me.

I'll repost the updated first 10 patches tomorrow. Eric, are you willing
to take custody of patch 11 as part of one of your series?

Thanks,

	M.
-- 
Jazz is not dead. It just smells funny...

^ permalink raw reply	[flat|nested] 74+ messages in thread

* [PATCH v3 09/11] KVM: arm/arm64: vgic: Prevent userspace injection of a mapped interrupt
@ 2015-08-06 16:44                   ` Marc Zyngier
  0 siblings, 0 replies; 74+ messages in thread
From: Marc Zyngier @ 2015-08-06 16:44 UTC (permalink / raw)
  To: linux-arm-kernel

On 05/08/15 14:47, Christoffer Dall wrote:
> On Wed, Aug 05, 2015 at 01:47:27PM +0200, Eric Auger wrote:
>> On 08/05/2015 12:53 PM, Christoffer Dall wrote:
>>> On Wed, Aug 05, 2015 at 10:44:09AM +0100, Marc Zyngier wrote:
>>>> On 05/08/15 08:32, Eric Auger wrote:
>>>>> Hi Marc,
>>>>> On 08/04/2015 06:44 PM, Marc Zyngier wrote:
>>>>>> On 04/08/15 17:21, Eric Auger wrote:
>>>>>>> Hi Marc,
>>>>>>> On 07/24/2015 05:55 PM, Marc Zyngier wrote:
>>>>>>>> Virtual interrupts mapped to a HW interrupt should only be triggered
>>>>>>>> from inside the kernel. Otherwise, you could end up confusing the
>>>>>>>> kernel (and the GIC's) state machine.
>>>>>>>>
>>>>>>>> Rearrange the injection path so that kvm_vgic_inject_irq is
>>>>>>>> used for non-mapped interrupts, and kvm_vgic_inject_mapped_irq is
>>>>>>>> used for mapped interrupts. The latter should only be called from
>>>>>>>> inside the kernel (timer, VFIO).
>>>>>>> nit: I would replace VFIO by irqfd.
>>>>>>> VFIO just triggers the eventfd/irqfd. This is KVM/irqfd that injects the
>>>>>>> virtual irq upon the irqfd signaling and he irqfd adaptation/ARM
>>>>>>> currently is implemented in vgic.c
>>>>>>
>>>>>> Ah, thanks for reminding me of the right terminology, I tend to think of
>>>>>> it as one big bag of nasty tricks... ;-)
>>>>>>
>>>>>> I'll update the commit message.
>>>>>>
>>>>>>>>
>>>>>>>> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
>>>>>>>> ---
>>>>>>>>  include/kvm/arm_vgic.h |  2 +
>>>>>>>>  virt/kvm/arm/vgic.c    | 99 ++++++++++++++++++++++++++++++++++----------------
>>>>>>>>  2 files changed, 70 insertions(+), 31 deletions(-)
>>>>>>>>
>>>>>>>> diff --git a/include/kvm/arm_vgic.h b/include/kvm/arm_vgic.h
>>>>>>>> index 7306b4b..f6bfd79 100644
>>>>>>>> --- a/include/kvm/arm_vgic.h
>>>>>>>> +++ b/include/kvm/arm_vgic.h
>>>>>>>> @@ -351,6 +351,8 @@ void kvm_vgic_flush_hwstate(struct kvm_vcpu *vcpu);
>>>>>>>>  void kvm_vgic_sync_hwstate(struct kvm_vcpu *vcpu);
>>>>>>>>  int kvm_vgic_inject_irq(struct kvm *kvm, int cpuid, unsigned int irq_num,
>>>>>>>>  			bool level);
>>>>>>>> +int kvm_vgic_inject_mapped_irq(struct kvm *kvm, int cpuid,
>>>>>>>> +			       struct irq_phys_map *map, bool level);
>>>>>>>>  void vgic_v3_dispatch_sgi(struct kvm_vcpu *vcpu, u64 reg);
>>>>>>>>  int kvm_vgic_vcpu_pending_irq(struct kvm_vcpu *vcpu);
>>>>>>>>  int kvm_vgic_vcpu_active_irq(struct kvm_vcpu *vcpu);
>>>>>>>> diff --git a/virt/kvm/arm/vgic.c b/virt/kvm/arm/vgic.c
>>>>>>>> index 3f7b690..e40ef70 100644
>>>>>>>> --- a/virt/kvm/arm/vgic.c
>>>>>>>> +++ b/virt/kvm/arm/vgic.c
>>>>>>>> @@ -1533,7 +1533,8 @@ static int vgic_validate_injection(struct kvm_vcpu *vcpu, int irq, int level)
>>>>>>>>  }
>>>>>>>>  
>>>>>>>>  static int vgic_update_irq_pending(struct kvm *kvm, int cpuid,
>>>>>>>> -				  unsigned int irq_num, bool level)
>>>>>>>> +				   struct irq_phys_map *map,
>>>>>>>> +				   unsigned int irq_num, bool level)
>>>>>>>>  {
>>>>>>> In vgic_update_irq_pending, I needed to modify the following line and
>>>>>>> add the "&& !map".
>>>>>>>
>>>>>>>         if (!vgic_validate_injection(vcpu, irq_num, level) && !map) {
>>>>>>>
>>>>>>> Without that, the level being not properly modeled for level sensitive
>>>>>>> forwarded IRQs, the 2d injection fails.
>>>>>>
>>>>>> Ah! Is that because we never see the line being reset to zero, and the
>>>>>> VGIC still sees the line as pending at the distributor level?
>>>>> yes indeed
>>>>
>>>> Then it is a bigger problem we need to solve, and your solution just
>>>> papers over the issue.
>>>>
>>>> The main problem is that irqfd is essentially an edge-triggered
>>>> signalling. Fire and forget. Given that we're dealing with a level
>>>> triggered interrupt, we end up with the interrupt still marked as
>>>> pending (nobody took the signal down).
>> this does not really relate to irqfd: irqfd also comes with the concept
>> of resamplefd. in case the IRQ is not forwarded, this is the
>> irqfd_resampler_ack function that toggles the IRQ down (eventfd.c). Its
>> execution is triggered in vgic_process_maintenance. With forwarding the
>> EOI is not trappable anymore so this disappears.
>>
>>
>>>> The usual way to get out of that mess in is to evaluate the state of the
>>>> level on EOI. But we can't trap on EOI for a HW interrupt.
>>>>
>>>> So it raises the question: should we instead consider the HW pending
>>>> state instead of the software one for mapped interrupts? It is
>>>> expensive, but it feels more correct.
>>>>
>>> I thought we already covered this at LCA.  For mapped interrupts
>>> (forwarded) we should never consider the software pending state, because
>>> that state is managed by the hardware.  Or am I confusing concepts here?
>>
>> Yes we discussed we should bypass most of the SW states.
>>
>> in my previous integration I proposed a patch "KVM: arm: vgic: fix state
>> machine for forwarded IRQ". What's wrong with the below approach?
>>
>>
>>     Fix multiple injection of level sensitive forwarded IRQs.
>>     With current code, the second injection fails since the
>>     state bitmaps are not reset (process_maintenance is not
>>     called anymore).
>>
>>     New implementation follows those principles:
>>     - A forwarded IRQ only can be sampled when it is pending
>>     - when queueing the IRQ (programming the LR), the pending state
>>       is removed as for edge sensitive IRQs
>>     - an injection of a forwarded IRQ is considered always valid since
>>       coming from the HW and level always is 1.
>>
> I don't see anything wrong, and I thought this was what we discussed.

I think I just lost track of what we've discussed at LCA. Ignore me.

I'll repost the updated first 10 patches tomorrow. Eric, are you willing
to take custody of patch 11 as part of one of your series?

Thanks,

	M.
-- 
Jazz is not dead. It just smells funny...

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH v3 09/11] KVM: arm/arm64: vgic: Prevent userspace injection of a mapped interrupt
  2015-08-06 16:44                   ` Marc Zyngier
@ 2015-08-07  7:05                     ` Eric Auger
  -1 siblings, 0 replies; 74+ messages in thread
From: Eric Auger @ 2015-08-07  7:05 UTC (permalink / raw)
  To: Marc Zyngier, Christoffer Dall
  Cc: kvm, kvmarm, linux-arm-kernel, Alex Bennée, Andre Przywara

Hi Marc,
On 08/06/2015 06:44 PM, Marc Zyngier wrote:
> On 05/08/15 14:47, Christoffer Dall wrote:
>> On Wed, Aug 05, 2015 at 01:47:27PM +0200, Eric Auger wrote:
>>> On 08/05/2015 12:53 PM, Christoffer Dall wrote:
>>>> On Wed, Aug 05, 2015 at 10:44:09AM +0100, Marc Zyngier wrote:
>>>>> On 05/08/15 08:32, Eric Auger wrote:
>>>>>> Hi Marc,
>>>>>> On 08/04/2015 06:44 PM, Marc Zyngier wrote:
>>>>>>> On 04/08/15 17:21, Eric Auger wrote:
>>>>>>>> Hi Marc,
>>>>>>>> On 07/24/2015 05:55 PM, Marc Zyngier wrote:
>>>>>>>>> Virtual interrupts mapped to a HW interrupt should only be triggered
>>>>>>>>> from inside the kernel. Otherwise, you could end up confusing the
>>>>>>>>> kernel (and the GIC's) state machine.
>>>>>>>>>
>>>>>>>>> Rearrange the injection path so that kvm_vgic_inject_irq is
>>>>>>>>> used for non-mapped interrupts, and kvm_vgic_inject_mapped_irq is
>>>>>>>>> used for mapped interrupts. The latter should only be called from
>>>>>>>>> inside the kernel (timer, VFIO).
>>>>>>>> nit: I would replace VFIO by irqfd.
>>>>>>>> VFIO just triggers the eventfd/irqfd. This is KVM/irqfd that injects the
>>>>>>>> virtual irq upon the irqfd signaling and he irqfd adaptation/ARM
>>>>>>>> currently is implemented in vgic.c
>>>>>>>
>>>>>>> Ah, thanks for reminding me of the right terminology, I tend to think of
>>>>>>> it as one big bag of nasty tricks... ;-)
>>>>>>>
>>>>>>> I'll update the commit message.
>>>>>>>
>>>>>>>>>
>>>>>>>>> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
>>>>>>>>> ---
>>>>>>>>>  include/kvm/arm_vgic.h |  2 +
>>>>>>>>>  virt/kvm/arm/vgic.c    | 99 ++++++++++++++++++++++++++++++++++----------------
>>>>>>>>>  2 files changed, 70 insertions(+), 31 deletions(-)
>>>>>>>>>
>>>>>>>>> diff --git a/include/kvm/arm_vgic.h b/include/kvm/arm_vgic.h
>>>>>>>>> index 7306b4b..f6bfd79 100644
>>>>>>>>> --- a/include/kvm/arm_vgic.h
>>>>>>>>> +++ b/include/kvm/arm_vgic.h
>>>>>>>>> @@ -351,6 +351,8 @@ void kvm_vgic_flush_hwstate(struct kvm_vcpu *vcpu);
>>>>>>>>>  void kvm_vgic_sync_hwstate(struct kvm_vcpu *vcpu);
>>>>>>>>>  int kvm_vgic_inject_irq(struct kvm *kvm, int cpuid, unsigned int irq_num,
>>>>>>>>>  			bool level);
>>>>>>>>> +int kvm_vgic_inject_mapped_irq(struct kvm *kvm, int cpuid,
>>>>>>>>> +			       struct irq_phys_map *map, bool level);
>>>>>>>>>  void vgic_v3_dispatch_sgi(struct kvm_vcpu *vcpu, u64 reg);
>>>>>>>>>  int kvm_vgic_vcpu_pending_irq(struct kvm_vcpu *vcpu);
>>>>>>>>>  int kvm_vgic_vcpu_active_irq(struct kvm_vcpu *vcpu);
>>>>>>>>> diff --git a/virt/kvm/arm/vgic.c b/virt/kvm/arm/vgic.c
>>>>>>>>> index 3f7b690..e40ef70 100644
>>>>>>>>> --- a/virt/kvm/arm/vgic.c
>>>>>>>>> +++ b/virt/kvm/arm/vgic.c
>>>>>>>>> @@ -1533,7 +1533,8 @@ static int vgic_validate_injection(struct kvm_vcpu *vcpu, int irq, int level)
>>>>>>>>>  }
>>>>>>>>>  
>>>>>>>>>  static int vgic_update_irq_pending(struct kvm *kvm, int cpuid,
>>>>>>>>> -				  unsigned int irq_num, bool level)
>>>>>>>>> +				   struct irq_phys_map *map,
>>>>>>>>> +				   unsigned int irq_num, bool level)
>>>>>>>>>  {
>>>>>>>> In vgic_update_irq_pending, I needed to modify the following line and
>>>>>>>> add the "&& !map".
>>>>>>>>
>>>>>>>>         if (!vgic_validate_injection(vcpu, irq_num, level) && !map) {
>>>>>>>>
>>>>>>>> Without that, the level being not properly modeled for level sensitive
>>>>>>>> forwarded IRQs, the 2d injection fails.
>>>>>>>
>>>>>>> Ah! Is that because we never see the line being reset to zero, and the
>>>>>>> VGIC still sees the line as pending at the distributor level?
>>>>>> yes indeed
>>>>>
>>>>> Then it is a bigger problem we need to solve, and your solution just
>>>>> papers over the issue.
>>>>>
>>>>> The main problem is that irqfd is essentially an edge-triggered
>>>>> signalling. Fire and forget. Given that we're dealing with a level
>>>>> triggered interrupt, we end up with the interrupt still marked as
>>>>> pending (nobody took the signal down).
>>> this does not really relate to irqfd: irqfd also comes with the concept
>>> of resamplefd. in case the IRQ is not forwarded, this is the
>>> irqfd_resampler_ack function that toggles the IRQ down (eventfd.c). Its
>>> execution is triggered in vgic_process_maintenance. With forwarding the
>>> EOI is not trappable anymore so this disappears.
>>>
>>>
>>>>> The usual way to get out of that mess in is to evaluate the state of the
>>>>> level on EOI. But we can't trap on EOI for a HW interrupt.
>>>>>
>>>>> So it raises the question: should we instead consider the HW pending
>>>>> state instead of the software one for mapped interrupts? It is
>>>>> expensive, but it feels more correct.
>>>>>
>>>> I thought we already covered this at LCA.  For mapped interrupts
>>>> (forwarded) we should never consider the software pending state, because
>>>> that state is managed by the hardware.  Or am I confusing concepts here?
>>>
>>> Yes we discussed we should bypass most of the SW states.
>>>
>>> in my previous integration I proposed a patch "KVM: arm: vgic: fix state
>>> machine for forwarded IRQ". What's wrong with the below approach?
>>>
>>>
>>>     Fix multiple injection of level sensitive forwarded IRQs.
>>>     With current code, the second injection fails since the
>>>     state bitmaps are not reset (process_maintenance is not
>>>     called anymore).
>>>
>>>     New implementation follows those principles:
>>>     - A forwarded IRQ only can be sampled when it is pending
>>>     - when queueing the IRQ (programming the LR), the pending state
>>>       is removed as for edge sensitive IRQs
>>>     - an injection of a forwarded IRQ is considered always valid since
>>>       coming from the HW and level always is 1.
>>>
>> I don't see anything wrong, and I thought this was what we discussed.
> 
> I think I just lost track of what we've discussed at LCA. Ignore me.
> 
> I'll repost the updated first 10 patches tomorrow. Eric, are you willing
> to take custody of patch 11 as part of one of your series?
Yes no problem I can take this over if you prefer. I can easily test it.
May I put it in the irq forwarding series?

Best Regards

Eric
> 
> Thanks,
> 
> 	M.
> 


^ permalink raw reply	[flat|nested] 74+ messages in thread

* [PATCH v3 09/11] KVM: arm/arm64: vgic: Prevent userspace injection of a mapped interrupt
@ 2015-08-07  7:05                     ` Eric Auger
  0 siblings, 0 replies; 74+ messages in thread
From: Eric Auger @ 2015-08-07  7:05 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Marc,
On 08/06/2015 06:44 PM, Marc Zyngier wrote:
> On 05/08/15 14:47, Christoffer Dall wrote:
>> On Wed, Aug 05, 2015 at 01:47:27PM +0200, Eric Auger wrote:
>>> On 08/05/2015 12:53 PM, Christoffer Dall wrote:
>>>> On Wed, Aug 05, 2015 at 10:44:09AM +0100, Marc Zyngier wrote:
>>>>> On 05/08/15 08:32, Eric Auger wrote:
>>>>>> Hi Marc,
>>>>>> On 08/04/2015 06:44 PM, Marc Zyngier wrote:
>>>>>>> On 04/08/15 17:21, Eric Auger wrote:
>>>>>>>> Hi Marc,
>>>>>>>> On 07/24/2015 05:55 PM, Marc Zyngier wrote:
>>>>>>>>> Virtual interrupts mapped to a HW interrupt should only be triggered
>>>>>>>>> from inside the kernel. Otherwise, you could end up confusing the
>>>>>>>>> kernel (and the GIC's) state machine.
>>>>>>>>>
>>>>>>>>> Rearrange the injection path so that kvm_vgic_inject_irq is
>>>>>>>>> used for non-mapped interrupts, and kvm_vgic_inject_mapped_irq is
>>>>>>>>> used for mapped interrupts. The latter should only be called from
>>>>>>>>> inside the kernel (timer, VFIO).
>>>>>>>> nit: I would replace VFIO by irqfd.
>>>>>>>> VFIO just triggers the eventfd/irqfd. This is KVM/irqfd that injects the
>>>>>>>> virtual irq upon the irqfd signaling and he irqfd adaptation/ARM
>>>>>>>> currently is implemented in vgic.c
>>>>>>>
>>>>>>> Ah, thanks for reminding me of the right terminology, I tend to think of
>>>>>>> it as one big bag of nasty tricks... ;-)
>>>>>>>
>>>>>>> I'll update the commit message.
>>>>>>>
>>>>>>>>>
>>>>>>>>> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
>>>>>>>>> ---
>>>>>>>>>  include/kvm/arm_vgic.h |  2 +
>>>>>>>>>  virt/kvm/arm/vgic.c    | 99 ++++++++++++++++++++++++++++++++++----------------
>>>>>>>>>  2 files changed, 70 insertions(+), 31 deletions(-)
>>>>>>>>>
>>>>>>>>> diff --git a/include/kvm/arm_vgic.h b/include/kvm/arm_vgic.h
>>>>>>>>> index 7306b4b..f6bfd79 100644
>>>>>>>>> --- a/include/kvm/arm_vgic.h
>>>>>>>>> +++ b/include/kvm/arm_vgic.h
>>>>>>>>> @@ -351,6 +351,8 @@ void kvm_vgic_flush_hwstate(struct kvm_vcpu *vcpu);
>>>>>>>>>  void kvm_vgic_sync_hwstate(struct kvm_vcpu *vcpu);
>>>>>>>>>  int kvm_vgic_inject_irq(struct kvm *kvm, int cpuid, unsigned int irq_num,
>>>>>>>>>  			bool level);
>>>>>>>>> +int kvm_vgic_inject_mapped_irq(struct kvm *kvm, int cpuid,
>>>>>>>>> +			       struct irq_phys_map *map, bool level);
>>>>>>>>>  void vgic_v3_dispatch_sgi(struct kvm_vcpu *vcpu, u64 reg);
>>>>>>>>>  int kvm_vgic_vcpu_pending_irq(struct kvm_vcpu *vcpu);
>>>>>>>>>  int kvm_vgic_vcpu_active_irq(struct kvm_vcpu *vcpu);
>>>>>>>>> diff --git a/virt/kvm/arm/vgic.c b/virt/kvm/arm/vgic.c
>>>>>>>>> index 3f7b690..e40ef70 100644
>>>>>>>>> --- a/virt/kvm/arm/vgic.c
>>>>>>>>> +++ b/virt/kvm/arm/vgic.c
>>>>>>>>> @@ -1533,7 +1533,8 @@ static int vgic_validate_injection(struct kvm_vcpu *vcpu, int irq, int level)
>>>>>>>>>  }
>>>>>>>>>  
>>>>>>>>>  static int vgic_update_irq_pending(struct kvm *kvm, int cpuid,
>>>>>>>>> -				  unsigned int irq_num, bool level)
>>>>>>>>> +				   struct irq_phys_map *map,
>>>>>>>>> +				   unsigned int irq_num, bool level)
>>>>>>>>>  {
>>>>>>>> In vgic_update_irq_pending, I needed to modify the following line and
>>>>>>>> add the "&& !map".
>>>>>>>>
>>>>>>>>         if (!vgic_validate_injection(vcpu, irq_num, level) && !map) {
>>>>>>>>
>>>>>>>> Without that, the level being not properly modeled for level sensitive
>>>>>>>> forwarded IRQs, the 2d injection fails.
>>>>>>>
>>>>>>> Ah! Is that because we never see the line being reset to zero, and the
>>>>>>> VGIC still sees the line as pending at the distributor level?
>>>>>> yes indeed
>>>>>
>>>>> Then it is a bigger problem we need to solve, and your solution just
>>>>> papers over the issue.
>>>>>
>>>>> The main problem is that irqfd is essentially an edge-triggered
>>>>> signalling. Fire and forget. Given that we're dealing with a level
>>>>> triggered interrupt, we end up with the interrupt still marked as
>>>>> pending (nobody took the signal down).
>>> this does not really relate to irqfd: irqfd also comes with the concept
>>> of resamplefd. in case the IRQ is not forwarded, this is the
>>> irqfd_resampler_ack function that toggles the IRQ down (eventfd.c). Its
>>> execution is triggered in vgic_process_maintenance. With forwarding the
>>> EOI is not trappable anymore so this disappears.
>>>
>>>
>>>>> The usual way to get out of that mess in is to evaluate the state of the
>>>>> level on EOI. But we can't trap on EOI for a HW interrupt.
>>>>>
>>>>> So it raises the question: should we instead consider the HW pending
>>>>> state instead of the software one for mapped interrupts? It is
>>>>> expensive, but it feels more correct.
>>>>>
>>>> I thought we already covered this at LCA.  For mapped interrupts
>>>> (forwarded) we should never consider the software pending state, because
>>>> that state is managed by the hardware.  Or am I confusing concepts here?
>>>
>>> Yes we discussed we should bypass most of the SW states.
>>>
>>> in my previous integration I proposed a patch "KVM: arm: vgic: fix state
>>> machine for forwarded IRQ". What's wrong with the below approach?
>>>
>>>
>>>     Fix multiple injection of level sensitive forwarded IRQs.
>>>     With current code, the second injection fails since the
>>>     state bitmaps are not reset (process_maintenance is not
>>>     called anymore).
>>>
>>>     New implementation follows those principles:
>>>     - A forwarded IRQ only can be sampled when it is pending
>>>     - when queueing the IRQ (programming the LR), the pending state
>>>       is removed as for edge sensitive IRQs
>>>     - an injection of a forwarded IRQ is considered always valid since
>>>       coming from the HW and level always is 1.
>>>
>> I don't see anything wrong, and I thought this was what we discussed.
> 
> I think I just lost track of what we've discussed at LCA. Ignore me.
> 
> I'll repost the updated first 10 patches tomorrow. Eric, are you willing
> to take custody of patch 11 as part of one of your series?
Yes no problem I can take this over if you prefer. I can easily test it.
May I put it in the irq forwarding series?

Best Regards

Eric
> 
> Thanks,
> 
> 	M.
> 

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH v3 09/11] KVM: arm/arm64: vgic: Prevent userspace injection of a mapped interrupt
  2015-08-07  7:05                     ` Eric Auger
@ 2015-08-07  8:29                       ` Marc Zyngier
  -1 siblings, 0 replies; 74+ messages in thread
From: Marc Zyngier @ 2015-08-07  8:29 UTC (permalink / raw)
  To: Eric Auger, Christoffer Dall
  Cc: linux-arm-kernel, Andre Przywara, kvmarm, kvm

On 07/08/15 08:05, Eric Auger wrote:
> Hi Marc,
> On 08/06/2015 06:44 PM, Marc Zyngier wrote:
>> On 05/08/15 14:47, Christoffer Dall wrote:
>>> On Wed, Aug 05, 2015 at 01:47:27PM +0200, Eric Auger wrote:
>>>> On 08/05/2015 12:53 PM, Christoffer Dall wrote:
>>>>> On Wed, Aug 05, 2015 at 10:44:09AM +0100, Marc Zyngier wrote:
>>>>>> On 05/08/15 08:32, Eric Auger wrote:
>>>>>>> Hi Marc,
>>>>>>> On 08/04/2015 06:44 PM, Marc Zyngier wrote:
>>>>>>>> On 04/08/15 17:21, Eric Auger wrote:
>>>>>>>>> Hi Marc,
>>>>>>>>> On 07/24/2015 05:55 PM, Marc Zyngier wrote:
>>>>>>>>>> Virtual interrupts mapped to a HW interrupt should only be triggered
>>>>>>>>>> from inside the kernel. Otherwise, you could end up confusing the
>>>>>>>>>> kernel (and the GIC's) state machine.
>>>>>>>>>>
>>>>>>>>>> Rearrange the injection path so that kvm_vgic_inject_irq is
>>>>>>>>>> used for non-mapped interrupts, and kvm_vgic_inject_mapped_irq is
>>>>>>>>>> used for mapped interrupts. The latter should only be called from
>>>>>>>>>> inside the kernel (timer, VFIO).
>>>>>>>>> nit: I would replace VFIO by irqfd.
>>>>>>>>> VFIO just triggers the eventfd/irqfd. This is KVM/irqfd that injects the
>>>>>>>>> virtual irq upon the irqfd signaling and he irqfd adaptation/ARM
>>>>>>>>> currently is implemented in vgic.c
>>>>>>>>
>>>>>>>> Ah, thanks for reminding me of the right terminology, I tend to think of
>>>>>>>> it as one big bag of nasty tricks... ;-)
>>>>>>>>
>>>>>>>> I'll update the commit message.
>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
>>>>>>>>>> ---
>>>>>>>>>>  include/kvm/arm_vgic.h |  2 +
>>>>>>>>>>  virt/kvm/arm/vgic.c    | 99 ++++++++++++++++++++++++++++++++++----------------
>>>>>>>>>>  2 files changed, 70 insertions(+), 31 deletions(-)
>>>>>>>>>>
>>>>>>>>>> diff --git a/include/kvm/arm_vgic.h b/include/kvm/arm_vgic.h
>>>>>>>>>> index 7306b4b..f6bfd79 100644
>>>>>>>>>> --- a/include/kvm/arm_vgic.h
>>>>>>>>>> +++ b/include/kvm/arm_vgic.h
>>>>>>>>>> @@ -351,6 +351,8 @@ void kvm_vgic_flush_hwstate(struct kvm_vcpu *vcpu);
>>>>>>>>>>  void kvm_vgic_sync_hwstate(struct kvm_vcpu *vcpu);
>>>>>>>>>>  int kvm_vgic_inject_irq(struct kvm *kvm, int cpuid, unsigned int irq_num,
>>>>>>>>>>  			bool level);
>>>>>>>>>> +int kvm_vgic_inject_mapped_irq(struct kvm *kvm, int cpuid,
>>>>>>>>>> +			       struct irq_phys_map *map, bool level);
>>>>>>>>>>  void vgic_v3_dispatch_sgi(struct kvm_vcpu *vcpu, u64 reg);
>>>>>>>>>>  int kvm_vgic_vcpu_pending_irq(struct kvm_vcpu *vcpu);
>>>>>>>>>>  int kvm_vgic_vcpu_active_irq(struct kvm_vcpu *vcpu);
>>>>>>>>>> diff --git a/virt/kvm/arm/vgic.c b/virt/kvm/arm/vgic.c
>>>>>>>>>> index 3f7b690..e40ef70 100644
>>>>>>>>>> --- a/virt/kvm/arm/vgic.c
>>>>>>>>>> +++ b/virt/kvm/arm/vgic.c
>>>>>>>>>> @@ -1533,7 +1533,8 @@ static int vgic_validate_injection(struct kvm_vcpu *vcpu, int irq, int level)
>>>>>>>>>>  }
>>>>>>>>>>  
>>>>>>>>>>  static int vgic_update_irq_pending(struct kvm *kvm, int cpuid,
>>>>>>>>>> -				  unsigned int irq_num, bool level)
>>>>>>>>>> +				   struct irq_phys_map *map,
>>>>>>>>>> +				   unsigned int irq_num, bool level)
>>>>>>>>>>  {
>>>>>>>>> In vgic_update_irq_pending, I needed to modify the following line and
>>>>>>>>> add the "&& !map".
>>>>>>>>>
>>>>>>>>>         if (!vgic_validate_injection(vcpu, irq_num, level) && !map) {
>>>>>>>>>
>>>>>>>>> Without that, the level being not properly modeled for level sensitive
>>>>>>>>> forwarded IRQs, the 2d injection fails.
>>>>>>>>
>>>>>>>> Ah! Is that because we never see the line being reset to zero, and the
>>>>>>>> VGIC still sees the line as pending at the distributor level?
>>>>>>> yes indeed
>>>>>>
>>>>>> Then it is a bigger problem we need to solve, and your solution just
>>>>>> papers over the issue.
>>>>>>
>>>>>> The main problem is that irqfd is essentially an edge-triggered
>>>>>> signalling. Fire and forget. Given that we're dealing with a level
>>>>>> triggered interrupt, we end up with the interrupt still marked as
>>>>>> pending (nobody took the signal down).
>>>> this does not really relate to irqfd: irqfd also comes with the concept
>>>> of resamplefd. in case the IRQ is not forwarded, this is the
>>>> irqfd_resampler_ack function that toggles the IRQ down (eventfd.c). Its
>>>> execution is triggered in vgic_process_maintenance. With forwarding the
>>>> EOI is not trappable anymore so this disappears.
>>>>
>>>>
>>>>>> The usual way to get out of that mess in is to evaluate the state of the
>>>>>> level on EOI. But we can't trap on EOI for a HW interrupt.
>>>>>>
>>>>>> So it raises the question: should we instead consider the HW pending
>>>>>> state instead of the software one for mapped interrupts? It is
>>>>>> expensive, but it feels more correct.
>>>>>>
>>>>> I thought we already covered this at LCA.  For mapped interrupts
>>>>> (forwarded) we should never consider the software pending state, because
>>>>> that state is managed by the hardware.  Or am I confusing concepts here?
>>>>
>>>> Yes we discussed we should bypass most of the SW states.
>>>>
>>>> in my previous integration I proposed a patch "KVM: arm: vgic: fix state
>>>> machine for forwarded IRQ". What's wrong with the below approach?
>>>>
>>>>
>>>>     Fix multiple injection of level sensitive forwarded IRQs.
>>>>     With current code, the second injection fails since the
>>>>     state bitmaps are not reset (process_maintenance is not
>>>>     called anymore).
>>>>
>>>>     New implementation follows those principles:
>>>>     - A forwarded IRQ only can be sampled when it is pending
>>>>     - when queueing the IRQ (programming the LR), the pending state
>>>>       is removed as for edge sensitive IRQs
>>>>     - an injection of a forwarded IRQ is considered always valid since
>>>>       coming from the HW and level always is 1.
>>>>
>>> I don't see anything wrong, and I thought this was what we discussed.
>>
>> I think I just lost track of what we've discussed at LCA. Ignore me.
>>
>> I'll repost the updated first 10 patches tomorrow. Eric, are you willing
>> to take custody of patch 11 as part of one of your series?
> Yes no problem I can take this over if you prefer. I can easily test it.
> May I put it in the irq forwarding series?

That would be its natural location.

Thanks,

	M.
-- 
Jazz is not dead. It just smells funny...

^ permalink raw reply	[flat|nested] 74+ messages in thread

* [PATCH v3 09/11] KVM: arm/arm64: vgic: Prevent userspace injection of a mapped interrupt
@ 2015-08-07  8:29                       ` Marc Zyngier
  0 siblings, 0 replies; 74+ messages in thread
From: Marc Zyngier @ 2015-08-07  8:29 UTC (permalink / raw)
  To: linux-arm-kernel

On 07/08/15 08:05, Eric Auger wrote:
> Hi Marc,
> On 08/06/2015 06:44 PM, Marc Zyngier wrote:
>> On 05/08/15 14:47, Christoffer Dall wrote:
>>> On Wed, Aug 05, 2015 at 01:47:27PM +0200, Eric Auger wrote:
>>>> On 08/05/2015 12:53 PM, Christoffer Dall wrote:
>>>>> On Wed, Aug 05, 2015 at 10:44:09AM +0100, Marc Zyngier wrote:
>>>>>> On 05/08/15 08:32, Eric Auger wrote:
>>>>>>> Hi Marc,
>>>>>>> On 08/04/2015 06:44 PM, Marc Zyngier wrote:
>>>>>>>> On 04/08/15 17:21, Eric Auger wrote:
>>>>>>>>> Hi Marc,
>>>>>>>>> On 07/24/2015 05:55 PM, Marc Zyngier wrote:
>>>>>>>>>> Virtual interrupts mapped to a HW interrupt should only be triggered
>>>>>>>>>> from inside the kernel. Otherwise, you could end up confusing the
>>>>>>>>>> kernel (and the GIC's) state machine.
>>>>>>>>>>
>>>>>>>>>> Rearrange the injection path so that kvm_vgic_inject_irq is
>>>>>>>>>> used for non-mapped interrupts, and kvm_vgic_inject_mapped_irq is
>>>>>>>>>> used for mapped interrupts. The latter should only be called from
>>>>>>>>>> inside the kernel (timer, VFIO).
>>>>>>>>> nit: I would replace VFIO by irqfd.
>>>>>>>>> VFIO just triggers the eventfd/irqfd. This is KVM/irqfd that injects the
>>>>>>>>> virtual irq upon the irqfd signaling and he irqfd adaptation/ARM
>>>>>>>>> currently is implemented in vgic.c
>>>>>>>>
>>>>>>>> Ah, thanks for reminding me of the right terminology, I tend to think of
>>>>>>>> it as one big bag of nasty tricks... ;-)
>>>>>>>>
>>>>>>>> I'll update the commit message.
>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
>>>>>>>>>> ---
>>>>>>>>>>  include/kvm/arm_vgic.h |  2 +
>>>>>>>>>>  virt/kvm/arm/vgic.c    | 99 ++++++++++++++++++++++++++++++++++----------------
>>>>>>>>>>  2 files changed, 70 insertions(+), 31 deletions(-)
>>>>>>>>>>
>>>>>>>>>> diff --git a/include/kvm/arm_vgic.h b/include/kvm/arm_vgic.h
>>>>>>>>>> index 7306b4b..f6bfd79 100644
>>>>>>>>>> --- a/include/kvm/arm_vgic.h
>>>>>>>>>> +++ b/include/kvm/arm_vgic.h
>>>>>>>>>> @@ -351,6 +351,8 @@ void kvm_vgic_flush_hwstate(struct kvm_vcpu *vcpu);
>>>>>>>>>>  void kvm_vgic_sync_hwstate(struct kvm_vcpu *vcpu);
>>>>>>>>>>  int kvm_vgic_inject_irq(struct kvm *kvm, int cpuid, unsigned int irq_num,
>>>>>>>>>>  			bool level);
>>>>>>>>>> +int kvm_vgic_inject_mapped_irq(struct kvm *kvm, int cpuid,
>>>>>>>>>> +			       struct irq_phys_map *map, bool level);
>>>>>>>>>>  void vgic_v3_dispatch_sgi(struct kvm_vcpu *vcpu, u64 reg);
>>>>>>>>>>  int kvm_vgic_vcpu_pending_irq(struct kvm_vcpu *vcpu);
>>>>>>>>>>  int kvm_vgic_vcpu_active_irq(struct kvm_vcpu *vcpu);
>>>>>>>>>> diff --git a/virt/kvm/arm/vgic.c b/virt/kvm/arm/vgic.c
>>>>>>>>>> index 3f7b690..e40ef70 100644
>>>>>>>>>> --- a/virt/kvm/arm/vgic.c
>>>>>>>>>> +++ b/virt/kvm/arm/vgic.c
>>>>>>>>>> @@ -1533,7 +1533,8 @@ static int vgic_validate_injection(struct kvm_vcpu *vcpu, int irq, int level)
>>>>>>>>>>  }
>>>>>>>>>>  
>>>>>>>>>>  static int vgic_update_irq_pending(struct kvm *kvm, int cpuid,
>>>>>>>>>> -				  unsigned int irq_num, bool level)
>>>>>>>>>> +				   struct irq_phys_map *map,
>>>>>>>>>> +				   unsigned int irq_num, bool level)
>>>>>>>>>>  {
>>>>>>>>> In vgic_update_irq_pending, I needed to modify the following line and
>>>>>>>>> add the "&& !map".
>>>>>>>>>
>>>>>>>>>         if (!vgic_validate_injection(vcpu, irq_num, level) && !map) {
>>>>>>>>>
>>>>>>>>> Without that, the level being not properly modeled for level sensitive
>>>>>>>>> forwarded IRQs, the 2d injection fails.
>>>>>>>>
>>>>>>>> Ah! Is that because we never see the line being reset to zero, and the
>>>>>>>> VGIC still sees the line as pending at the distributor level?
>>>>>>> yes indeed
>>>>>>
>>>>>> Then it is a bigger problem we need to solve, and your solution just
>>>>>> papers over the issue.
>>>>>>
>>>>>> The main problem is that irqfd is essentially an edge-triggered
>>>>>> signalling. Fire and forget. Given that we're dealing with a level
>>>>>> triggered interrupt, we end up with the interrupt still marked as
>>>>>> pending (nobody took the signal down).
>>>> this does not really relate to irqfd: irqfd also comes with the concept
>>>> of resamplefd. in case the IRQ is not forwarded, this is the
>>>> irqfd_resampler_ack function that toggles the IRQ down (eventfd.c). Its
>>>> execution is triggered in vgic_process_maintenance. With forwarding the
>>>> EOI is not trappable anymore so this disappears.
>>>>
>>>>
>>>>>> The usual way to get out of that mess in is to evaluate the state of the
>>>>>> level on EOI. But we can't trap on EOI for a HW interrupt.
>>>>>>
>>>>>> So it raises the question: should we instead consider the HW pending
>>>>>> state instead of the software one for mapped interrupts? It is
>>>>>> expensive, but it feels more correct.
>>>>>>
>>>>> I thought we already covered this at LCA.  For mapped interrupts
>>>>> (forwarded) we should never consider the software pending state, because
>>>>> that state is managed by the hardware.  Or am I confusing concepts here?
>>>>
>>>> Yes we discussed we should bypass most of the SW states.
>>>>
>>>> in my previous integration I proposed a patch "KVM: arm: vgic: fix state
>>>> machine for forwarded IRQ". What's wrong with the below approach?
>>>>
>>>>
>>>>     Fix multiple injection of level sensitive forwarded IRQs.
>>>>     With current code, the second injection fails since the
>>>>     state bitmaps are not reset (process_maintenance is not
>>>>     called anymore).
>>>>
>>>>     New implementation follows those principles:
>>>>     - A forwarded IRQ only can be sampled when it is pending
>>>>     - when queueing the IRQ (programming the LR), the pending state
>>>>       is removed as for edge sensitive IRQs
>>>>     - an injection of a forwarded IRQ is considered always valid since
>>>>       coming from the HW and level always is 1.
>>>>
>>> I don't see anything wrong, and I thought this was what we discussed.
>>
>> I think I just lost track of what we've discussed at LCA. Ignore me.
>>
>> I'll repost the updated first 10 patches tomorrow. Eric, are you willing
>> to take custody of patch 11 as part of one of your series?
> Yes no problem I can take this over if you prefer. I can easily test it.
> May I put it in the irq forwarding series?

That would be its natural location.

Thanks,

	M.
-- 
Jazz is not dead. It just smells funny...

^ permalink raw reply	[flat|nested] 74+ messages in thread

end of thread, other threads:[~2015-08-07  8:29 UTC | newest]

Thread overview: 74+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-07-24 15:54 [PATCH v3 00/11] arm/arm64: KVM: Active interrupt state switching for shared devices Marc Zyngier
2015-07-24 15:54 ` Marc Zyngier
2015-07-24 15:54 ` [PATCH v3 01/11] arm/arm64: KVM: Fix ordering of timer/GIC on guest entry Marc Zyngier
2015-07-24 15:54   ` Marc Zyngier
2015-07-24 15:55 ` [PATCH v3 02/11] arm/arm64: KVM: Move vgic handling to a non-preemptible section Marc Zyngier
2015-07-24 15:55   ` Marc Zyngier
2015-07-24 15:55 ` [PATCH v3 03/11] KVM: arm/arm64: vgic: Convert struct vgic_lr to use bitfields Marc Zyngier
2015-07-24 15:55   ` Marc Zyngier
2015-07-24 15:55 ` [PATCH v3 04/11] KVM: arm/arm64: vgic: Allow HW irq to be encoded in LR Marc Zyngier
2015-07-24 15:55   ` Marc Zyngier
2015-08-04 14:33   ` Christoffer Dall
2015-08-04 14:33     ` Christoffer Dall
2015-07-24 15:55 ` [PATCH v3 05/11] KVM: arm/arm64: vgic: Relax vgic_can_sample_irq for edge IRQs Marc Zyngier
2015-07-24 15:55   ` Marc Zyngier
2015-07-24 15:55 ` [PATCH v3 06/11] KVM: arm/arm64: vgic: Allow dynamic mapping of physical/virtual interrupts Marc Zyngier
2015-07-24 15:55   ` Marc Zyngier
2015-08-04 13:04   ` Christoffer Dall
2015-08-04 13:04     ` Christoffer Dall
2015-08-04 15:27     ` Marc Zyngier
2015-08-04 15:27       ` Marc Zyngier
2015-08-04 17:36       ` Christoffer Dall
2015-08-04 17:36         ` Christoffer Dall
2015-08-04 18:10         ` Marc Zyngier
2015-08-04 18:10           ` Marc Zyngier
2015-07-24 15:55 ` [PATCH v3 07/11] KVM: arm/arm64: vgic: Allow HW interrupts to be queued to a guest Marc Zyngier
2015-07-24 15:55   ` Marc Zyngier
2015-08-04 14:33   ` Christoffer Dall
2015-08-04 14:33     ` Christoffer Dall
2015-07-24 15:55 ` [PATCH v3 08/11] KVM: arm/arm64: vgic: Add vgic_{get,set}_phys_irq_active Marc Zyngier
2015-07-24 15:55   ` [PATCH v3 08/11] KVM: arm/arm64: vgic: Add vgic_{get, set}_phys_irq_active Marc Zyngier
2015-07-24 15:55 ` [PATCH v3 09/11] KVM: arm/arm64: vgic: Prevent userspace injection of a mapped interrupt Marc Zyngier
2015-07-24 15:55   ` Marc Zyngier
2015-08-04 13:45   ` Christoffer Dall
2015-08-04 13:45     ` Christoffer Dall
2015-08-04 16:02     ` Marc Zyngier
2015-08-04 16:02       ` Marc Zyngier
2015-08-04 17:38       ` Christoffer Dall
2015-08-04 17:38         ` Christoffer Dall
2015-08-04 16:21   ` Eric Auger
2015-08-04 16:21     ` Eric Auger
2015-08-04 16:44     ` Marc Zyngier
2015-08-04 16:44       ` Marc Zyngier
2015-08-05  7:32       ` Eric Auger
2015-08-05  7:32         ` Eric Auger
2015-08-05  9:44         ` Marc Zyngier
2015-08-05  9:44           ` Marc Zyngier
2015-08-05 10:53           ` Christoffer Dall
2015-08-05 10:53             ` Christoffer Dall
2015-08-05 11:47             ` Eric Auger
2015-08-05 11:47               ` Eric Auger
2015-08-05 13:47               ` Christoffer Dall
2015-08-05 13:47                 ` Christoffer Dall
2015-08-06 16:44                 ` Marc Zyngier
2015-08-06 16:44                   ` Marc Zyngier
2015-08-07  7:05                   ` Eric Auger
2015-08-07  7:05                     ` Eric Auger
2015-08-07  8:29                     ` Marc Zyngier
2015-08-07  8:29                       ` Marc Zyngier
2015-07-24 15:55 ` [PATCH v3 10/11] KVM: arm/arm64: timer: Allow the timer to control the active state Marc Zyngier
2015-07-24 15:55   ` Marc Zyngier
2015-08-04 13:56   ` Christoffer Dall
2015-08-04 13:56     ` Christoffer Dall
2015-08-04 16:14     ` Marc Zyngier
2015-08-04 16:14       ` Marc Zyngier
2015-08-04 17:40       ` Christoffer Dall
2015-08-04 17:40         ` Christoffer Dall
2015-07-24 15:55 ` [PATCH v3 11/11] KVM: arm/arm64: vgic: Allow HW interrupts for non-shared devices Marc Zyngier
2015-07-24 15:55   ` Marc Zyngier
2015-08-04 14:32   ` Christoffer Dall
2015-08-04 14:32     ` Christoffer Dall
2015-08-04 17:08     ` Marc Zyngier
2015-08-04 17:08       ` Marc Zyngier
2015-08-04 18:07       ` Christoffer Dall
2015-08-04 18:07         ` Christoffer Dall

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.